Lucene can write its own operation log; I just discovered this in the source code. Here is a log file I just generated:
IFD [Wed Dec 22 22:08:20 CST 2010; main]: setInfoStream deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@15dfd77
IW 0 [Wed Dec 22 22:08:20 CST 2010; main]: setInfoStream: dir=org.apache.lucene.store.SimpleFSDirectory@G:\package\lucene_test_dir lockFactory=org.apache.lucene.store.NativeFSLockFactory@1027b4d mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@c55e36 mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@1ac3c08 ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=10000 index=
maxFieldLength 10000 reached for field contents, ignoring following tokens
maxFieldLength 10000 reached for field contents, ignoring following tokens
maxFieldLength 10000 reached for field contents, ignoring following tokens
[... the same maxFieldLength warning repeats many more times ...]
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: optimize: index now
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: flush: now pause all indexing threads
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: flush: segment=_0 docStoreSegment=_0 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=104 numBufDelTerms=0
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: index before flush
IW 0 [Wed Dec 22 22:08:23 CST 2010; main]: DW: flush postings as segment _0 numDocs=104
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: DW: oldRAMSize=2619392 newFlushedSize=1740286 docs/MB=62.663 new/old=66.439%
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushedFiles=[_0.nrm, _0.tis, _0.fnm, _0.tii, _0.frq, _0.prx]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: LMP: findMerges: 1 segments
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: LMP: level 6.2247195 to 6.2380013: 1 segments
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now flush at close
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush: now pause all indexing threads
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush: segment=null docStoreSegment=_0 docStoreOffset=104 flushDocs=false flushDeletes=true flushDocStores=true numDocs=0 numBufDelTerms=0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: index before flush _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flush shared docStore segment _0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushDocStores segment=_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: closeDocStores segment=_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=104
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: flushDocStores files=[_0.fdt, _0.fdx]
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: now merge
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: index: _0:C104->_0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: CMS: no more merges pending; now return
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now call final commit()
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: startCommit(): start sizeInBytes=0
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: startCommit index=_0:C104->_0 changeCount=3
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.nrm
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.tis
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fnm
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.tii
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.frq
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fdx
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.prx
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: now sync _0.fdt
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: done all syncs
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: pendingCommit != null
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: wrote segments file "segments_2"
IFD [Wed Dec 22 22:08:24 CST 2010; main]: now checkpoint "segments_2" [1 segments ; isCommit = true]
IFD [Wed Dec 22 22:08:24 CST 2010; main]: deleteCommits: now decRef commit "segments_1"
IFD [Wed Dec 22 22:08:24 CST 2010; main]: delete "segments_1"
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: commit: done
IW 0 [Wed Dec 22 22:08:24 CST 2010; main]: at close: _0:C104->_0
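As an aside, the long run of "maxFieldLength 10000 reached" warnings above appears because the writer is opened with IndexWriter.MaxFieldLength.LIMITED (see the Indexer class below), so only the first 10000 tokens of the contents field get indexed for each large file. If you want the full contents indexed, a sketch like the following should remove or raise the limit; the value 100000 is just an arbitrary example, and this assumes the same Lucene 2.9/3.0-era API used elsewhere in this post:

// Open the writer without any per-field token limit:
IndexWriter writer = new IndexWriter(FSDirectory.open(INDEX_DIR),
        new StandardAnalyzer(Version.LUCENE_CURRENT), true,
        IndexWriter.MaxFieldLength.UNLIMITED);

// Or keep a limit but raise it after the writer has been created:
writer.setMaxFieldLength(100000);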
Next comes my indexing code; most of it is borrowed from the demo that ships with Lucene.

The Indexer class builds the index:
package my.firstest.copy;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Indexer {

    private static File INDEX_DIR = new File("G:/package/lucene_test_dir");
    private static final File docDir = new File("G:/package/lucene_test_docs");

    public static void main(String[] args) throws Exception {
        if (!docDir.exists() || !docDir.canRead()) {
            System.out.println("The documents to be indexed do not exist!");
            System.exit(1);
        }
        // Clear out any index files left over from a previous run.
        int fileCount = INDEX_DIR.list().length;
        if (fileCount != 0) {
            System.out.println("Old index files exist, deleting them first");
            File[] files = INDEX_DIR.listFiles();
            for (int i = 0; i < fileCount; i++) {
                files[i].delete();
                System.out.println("File " + files[i].getAbsolutePath() + " is deleted!");
            }
        }
        Date start = new Date();
        IndexWriter writer = new IndexWriter(FSDirectory.open(INDEX_DIR),
                new StandardAnalyzer(Version.LUCENE_CURRENT), true,
                IndexWriter.MaxFieldLength.LIMITED);
        writer.setUseCompoundFile(false);
        //writer.setMergeFactor(2);
        // The key call: send IndexWriter's internal log messages to a file.
        writer.setInfoStream(new PrintStream(new File("G:/package/lucene_test_log/log.txt")));
        System.out.println("MergeFactor -> " + writer.getMergeFactor());
        System.out.println("maxMergeDocs -> " + writer.getMaxMergeDocs());
        indexDocs(writer, docDir);
        writer.optimize();
        writer.close();
        Date end = new Date();
        System.out.println("takes " + (end.getTime() - start.getTime()) + " milliseconds");
    }

    protected static void indexDocs(IndexWriter writer, File file) throws IOException {
        if (file.canRead()) {
            if (file.isDirectory()) {
                String[] files = file.list();
                if (files != null) {
                    for (int i = 0; i < files.length; i++) {
                        indexDocs(writer, new File(file, files[i]));
                    }
                }
            } else {
                System.out.println("adding " + file);
                try {
                    writer.addDocument(FileDocument.Document(file));
                } catch (FileNotFoundException fnfe) {
                    // Skip files that disappear between listing and reading.
                }
            }
        }
    }
}
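One small caveat about the setInfoStream call in Indexer: new PrintStream(new File(...)) throws FileNotFoundException when the target directory does not exist, so it can be safer to create the log directory first. A minimal sketch of that guard, using the same paths as above:

File logDir = new File("G:/package/lucene_test_log");
if (!logDir.exists()) {
    logDir.mkdirs(); // PrintStream(File) does not create missing parent directories
}
writer.setInfoStream(new PrintStream(new File(logDir, "log.txt")));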
FileDocument:
package my.firstest.copy;

import java.io.File;
import java.io.FileReader;

import org.apache.lucene.document.DateTools;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class FileDocument {

    public static Document Document(File f) throws java.io.FileNotFoundException {
        Document doc = new Document();
        doc.add(new Field("path", f.getPath(), Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("modified",
                DateTools.timeToString(f.lastModified(), DateTools.Resolution.MINUTE),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("contents", new FileReader(f)));
        return doc;
    }

    private FileDocument() {
    }
}
The key line is:

writer.setInfoStream(new PrintStream(new File("G:/package/lucene_test_log/log.txt")));

Lucene's source code is full of checks like this:
if (infoStream != null) { message("init: hit exception on init; releasing write lock"); }
and the message method is:
public void message(String message) {
    if (infoStream != null)
        infoStream.println("IW " + messageID + " [" + new Date() + "; "
                + Thread.currentThread().getName() + "]: " + message);
}
The infoStream here is a field of IndexWriter:
private PrintStream infoStream = null;
If you never set this field it stays null. You can set it with writer.setInfoStream(PrintStream infoStream); once it is set, the log messages are written automatically to the file you specify.
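To sum up, here is a minimal, self-contained sketch of turning the logging on. The class name InfoStreamDemo and the sample.txt document are made up for illustration, and it reuses the FileDocument class from above; System.out also works as a target, since it is itself a PrintStream:

package my.firstest.copy;

import java.io.File;
import java.io.PrintStream;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class InfoStreamDemo {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("G:/package/lucene_test_dir")),
                new StandardAnalyzer(Version.LUCENE_CURRENT), true,
                IndexWriter.MaxFieldLength.LIMITED);

        // Log to a file (the directory must already exist)...
        writer.setInfoStream(new PrintStream(new File("G:/package/lucene_test_log/log.txt")));
        // ...or simply to the console:
        // writer.setInfoStream(System.out);

        writer.addDocument(FileDocument.Document(new File("G:/package/lucene_test_docs/sample.txt")));
        writer.optimize();
        writer.close(); // flush/merge/commit activity is written to the infoStream
    }
}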