java – org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:

I am trying to index a large number of log files fetched from a Tomcat server. I wrote code that opens each file, creates an index entry for each line, and stores each line using Apache Lucene. All of this is done with multiple threads.

When I ran this code, I got this exception:

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.commit();
    indexWriter.close();

Then I thought that, because I was committing the index every time, the commit might be causing the write lock. So I removed indexWriter.commit():

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.close();

Now I no longer get the exception.

Q: So my question is: why does indexWriter.commit() cause the exception? Even after removing indexWriter.commit(), I have no problems when searching; I get exactly the results I expect. So why use indexWriter.commit() at all?

Best answer: In short, it is similar to a database commit: until you commit the transaction, documents added to Solr are only held in memory. Only on commit are the documents persisted in the index.

If Solr crashes while documents are still in memory, you may lose those documents.
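The same holds at the Lucene level: a reader opened from the Directory only sees the last commit point. A minimal sketch, assuming Lucene 8+ (the in-memory ByteBuffersDirectory and the class name CommitDemo are just for the demo, not from the question's code):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CommitDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory(); // in-memory index for the demo
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        writer.commit(); // create an empty commit point so a reader can be opened at all

        Document doc = new Document();
        doc.add(new TextField("line", "some log line", Field.Store.YES));
        writer.addDocument(doc);

        // A reader opened from the Directory sees only committed data:
        try (DirectoryReader before = DirectoryReader.open(dir)) {
            System.out.println("before commit: " + before.numDocs()); // 0
        }

        writer.commit(); // durably publish the buffered document

        try (DirectoryReader after = DirectoryReader.open(dir)) {
            System.out.println("after commit: " + after.numDocs()); // 1
        }
        writer.close();
    }
}
```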

Explanation:

One of the principles in Lucene since day one is the write-once policy. We never write a file twice. When you add a document via IndexWriter it gets indexed into memory, and once we have reached a certain threshold (max buffered documents or RAM buffer size) we write all the documents from main memory to disk; you can find out more about this here and here. Writing documents to disk produces an entirely new index called a segment. Now, when you index a bunch of documents or you run incremental indexing in production, you can see the number of segments changing frequently. However, once you call commit, Lucene flushes its entire RAM buffer into segments, syncs them, and writes pointers to all segments belonging to this commit into the SEGMENTS file.
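The flush thresholds mentioned above are configurable on IndexWriterConfig. A small sketch, assuming Lucene 8+ (the values and the class name FlushConfigDemo are illustrative):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

public class FlushConfigDemo {
    public static void main(String[] args) {
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        // Flush buffered documents to a new segment once they use ~64 MB of RAM
        // (the default is 16 MB). Note that a flush only writes segment files;
        // it is not a commit and does not make the documents durable or visible.
        config.setRAMBufferSizeMB(64.0);
        // Alternatively, flush after a fixed number of buffered documents:
        // config.setMaxBufferedDocs(10_000);
        System.out.println("RAM buffer MB: " + config.getRAMBufferSizeMB());
    }
}
```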

If the document already exists in Solr, it will be overwritten (as determined by its unique ID).
So your search can still work, but the latest documents are not available for searching until you commit.

Also, once you open an IndexWriter it acquires a lock on the index, and you should close the writer to release that lock.
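That lock (write.lock) is also the most likely source of the original LockObtainFailedException: if each worker thread opens its own IndexWriter on the same directory, the second open fails because the first writer still holds the lock. Since IndexWriter is thread-safe, the usual pattern is to open one writer, share it across all threads, and commit and close it once at the end. A minimal sketch, assuming Lucene 8+ (class and field names here are illustrative, not from the question's code):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedWriterDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory(); // in-memory; a real indexer would use FSDirectory
        // ONE writer per directory: opening it acquires write.lock, and a second
        // IndexWriter on the same directory would fail with LockObtainFailedException.
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 100; i++) {
            final String path = "log-" + i;
            pool.submit(() -> {
                try {
                    Document doc = new Document();
                    doc.add(new StringField("path", path, Field.Store.YES));
                    writer.addDocument(doc); // IndexWriter is safe to share across threads
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);

        writer.commit(); // one commit at the end, not one per file
        writer.close();  // releases write.lock

        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            System.out.println("indexed: " + reader.numDocs());
        }
    }
}
```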
