org.apache.lucene.store
Class NRTCachingDirectory

java.lang.Object
  extended by org.apache.lucene.store.Directory
      extended by org.apache.lucene.store.NRTCachingDirectory
All Implemented Interfaces:
Closeable

public class NRTCachingDirectory
extends Directory

Wraps a RAMDirectory around any provided delegate directory, for use during near-real-time (NRT) search. Obtain the merge scheduler with getMergeScheduler() and pass it to your IndexWriter; this class uses that scheduler to track which threads are running which merges, so that it can decide whether to cache each written file.

This class is likely only useful in a near-real-time context, where indexing rate is lowish but reopen rate is highish, resulting in many tiny files being written. This directory keeps such segments in RAM, along with the segments produced by merging them, as long as they are small enough.

This is safe to use: when your app calls IndexWriter.commit(), all cached files are flushed from the cache to the delegate directory and sync'd.
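
The flush-on-commit behavior described above can be pictured with a small stand-alone model. This is only a sketch and not Lucene code: the class FlushOnSyncSketch, its two maps, and its method names are hypothetical stand-ins for the RAM cache and the delegate directory, and it assumes that a commit reaches the directory as a call to sync with the names of the files to persist.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a cache in front of a delegate directory; not Lucene code.
final class FlushOnSyncSketch {
    // File name -> contents held only in RAM.
    private final Map<String, byte[]> cache = new HashMap<>();
    // File name -> contents written to the simulated delegate directory.
    private final Map<String, byte[]> delegate = new HashMap<>();

    // Writes a file into the RAM cache only.
    void writeCached(String name, byte[] data) {
        cache.put(name, data.clone());
    }

    // Moves each named file that is still cached over to the delegate.
    void sync(Collection<String> fileNames) {
        for (String name : fileNames) {
            byte[] data = cache.remove(name);
            if (data != null) {
                delegate.put(name, data);
            }
        }
        // A real directory would now ask the delegate to sync the same names.
    }

    boolean isCached(String name) {
        return cache.containsKey(name);
    }

    boolean isOnDelegate(String name) {
        return delegate.containsKey(name);
    }

    public static void main(String[] args) {
        FlushOnSyncSketch dir = new FlushOnSyncSketch();
        dir.writeCached("_0.fdt", new byte[] {1, 2, 3});
        dir.writeCached("_1.fdt", new byte[] {4});
        dir.sync(Arrays.asList("_0.fdt"));
        System.out.println(dir.isCached("_0.fdt"));     // false
        System.out.println(dir.isOnDelegate("_0.fdt")); // true
        System.out.println(dir.isCached("_1.fdt"));     // true
    }
}
```

In this model only the files named in the sync call leave the cache; files that were not part of the commit stay in RAM.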

NOTE: this class is somewhat sneaky in how it spies on merges to determine the size of a merge: it records which threads are running which merges by watching ConcurrentMergeScheduler's doMerge method. While this works correctly, future versions of this class will likely take a more general approach.
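
The thread-tracking idea in the note above can be illustrated with a stand-alone sketch. It is a simplified model and not the actual implementation: MergeTrackingSketch, runMerge, and currentMergeBytes are hypothetical names, and runMerge merely plays the role of a wrapped doMerge call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified model of "which thread is running which merge"; not Lucene code.
final class MergeTrackingSketch {
    // Thread currently running a merge -> estimated size of the merged segment, in bytes.
    private final ConcurrentMap<Thread, Long> activeMerges = new ConcurrentHashMap<>();

    // Stand-in for a wrapped doMerge: record the merge, run it, then forget it.
    void runMerge(long estimatedBytes, Runnable mergeBody) {
        Thread self = Thread.currentThread();
        activeMerges.put(self, estimatedBytes);
        try {
            mergeBody.run();
        } finally {
            activeMerges.remove(self);
        }
    }

    // Returns the estimated merge size for the calling thread, or -1 if it is not merging.
    long currentMergeBytes() {
        Long bytes = activeMerges.get(Thread.currentThread());
        return bytes == null ? -1L : bytes;
    }

    public static void main(String[] args) {
        MergeTrackingSketch tracker = new MergeTrackingSketch();
        System.out.println(tracker.currentMergeBytes()); // -1
        tracker.runMerge(2L * 1024 * 1024,
                () -> System.out.println(tracker.currentMergeBytes())); // 2097152
        System.out.println(tracker.currentMergeBytes()); // -1
    }
}
```

A directory built on such a tracker could call currentMergeBytes() whenever a file is created: a result of -1 would indicate that the write does not come from a merge thread, and any other value would give the merge size to compare against the configured limit.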

Here's a simple example usage:

   Directory fsDir = FSDirectory.open(new File("/path/to/index"));
   NRTCachingDirectory cachedFSDir = new NRTCachingDirectory(fsDir, 5.0, 60.0);
   IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_32, analyzer);
   conf.setMergeScheduler(cachedFSDir.getMergeScheduler());
   IndexWriter writer = new IndexWriter(cachedFSDir, conf);
 

This caches all newly flushed segments and all merges whose expected segment size is <= 5 MB. If the net cached bytes exceed 60 MB, no further writes are cached until the net bytes fall back below 60 MB.
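
To make the two thresholds concrete, the rule can be written as a small stand-alone function. This is a sketch of the rule as documented here, not the class's actual code: CachePolicySketch and shouldCache are hypothetical names, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```java
// Stand-alone model of the documented caching rule; not Lucene code.
final class CachePolicySketch {
    // Assumption: 1 MB is taken to be 1024 * 1024 bytes.
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    private final long maxMergeSizeBytes;
    private final long maxCachedBytes;

    CachePolicySketch(double maxMergeSizeMB, double maxCachedMB) {
        this.maxMergeSizeBytes = (long) (maxMergeSizeMB * BYTES_PER_MB);
        this.maxCachedBytes = (long) (maxCachedMB * BYTES_PER_MB);
    }

    // isMerge == false models a flush, for which estimatedMergeBytes is ignored.
    boolean shouldCache(boolean isMerge, long estimatedMergeBytes, long currentlyCachedBytes) {
        boolean smallEnough = !isMerge || estimatedMergeBytes <= maxMergeSizeBytes;
        boolean roomLeft = currentlyCachedBytes <= maxCachedBytes;
        return smallEnough && roomLeft;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        CachePolicySketch policy = new CachePolicySketch(5.0, 60.0);
        System.out.println(policy.shouldCache(false, 0, 10 * mb));      // true: flush, cache has room
        System.out.println(policy.shouldCache(true, 4 * mb, 10 * mb));  // true: small merge
        System.out.println(policy.shouldCache(true, 6 * mb, 10 * mb));  // false: merge too large
        System.out.println(policy.shouldCache(false, 0, 61 * mb));      // false: cache over 60 MB
    }
}
```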

WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary
 
Fields inherited from class org.apache.lucene.store.Directory
isOpen, lockFactory
 
Constructor Summary
NRTCachingDirectory(Directory delegate, double maxMergeSizeMB, double maxCachedMB)
          Caches a newly created output if 1) it comes from a flush, or from a merge whose estimated merged-segment size is <= maxMergeSizeMB, and 2) the total cached bytes are <= maxCachedMB.
 
Method Summary
 void clearLock(String name)
          Attempt to clear (forcefully unlock and remove) the specified lock.
 void close()
          Close this directory, which flushes any cached files to the delegate and then closes the delegate.
 IndexOutput createOutput(String name)
          Creates a new, empty file in the directory with the given name.
 void deleteFile(String name)
          Removes an existing file in the directory.
protected  boolean doCacheWrite(String name)
          Subclass can override this to customize logic; return true if this file should be written to the RAMDirectory.
 boolean fileExists(String name)
          Returns true iff a file with the given name exists.
 long fileLength(String name)
          Returns the length of a file in the directory.
 long fileModified(String name)
          Returns the time the named file was last modified.
 LockFactory getLockFactory()
          Get the LockFactory that this Directory instance is using for its locking implementation.
 String getLockID()
          Return a string identifier that uniquely differentiates this Directory instance from other Directory instances.
 MergeScheduler getMergeScheduler()
           
 String[] listAll()
          Returns an array of strings, one for each file in the directory.
 String[] listCachedFiles()
           
 Lock makeLock(String name)
          Construct a Lock.
 IndexInput openInput(String name)
          Returns a stream reading an existing file.
 IndexInput openInput(String name, int bufferSize)
          Returns a stream reading an existing file, with the specified read buffer size.
 void setLockFactory(LockFactory lf)
          Set the LockFactory that this Directory instance should use for its locking implementation.
 long sizeInBytes()
          Returns how many bytes are being used by the RAMDirectory cache.
 void sync(Collection<String> fileNames)
          Ensure that any writes to these files are moved to stable storage.
 String toString()
           
 void touchFile(String name)
          Deprecated. 
 
Methods inherited from class org.apache.lucene.store.Directory
copy, copy, ensureOpen, sync
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

NRTCachingDirectory

public NRTCachingDirectory(Directory delegate,
                           double maxMergeSizeMB,
                           double maxCachedMB)
Caches a newly created output if 1) it comes from a flush, or from a merge whose estimated merged-segment size is <= maxMergeSizeMB, and 2) the total cached bytes are <= maxCachedMB.

Method Detail

getLockFactory

public LockFactory getLockFactory()
Description copied from class: Directory
Get the LockFactory that this Directory instance is using for its locking implementation. Note that this may be null for Directory implementations that provide their own locking implementation.

Overrides:
getLockFactory in class Directory

setLockFactory

public void setLockFactory(LockFactory lf)
                    throws IOException
Description copied from class: Directory
Set the LockFactory that this Directory instance should use for its locking implementation. Each instance of LockFactory should only be used for one directory (i.e., do not share a single instance across multiple Directories).

Overrides:
setLockFactory in class Directory
Parameters:
lf - instance of LockFactory.
Throws:
IOException

getLockID

public String getLockID()
Description copied from class: Directory
Return a string identifier that uniquely differentiates this Directory instance from other Directory instances. This ID should be the same if two Directory instances (even in different JVMs and/or on different machines) are considered "the same index". This is how locking "scopes" to the right index.

Overrides:
getLockID in class Directory

makeLock

public Lock makeLock(String name)
Description copied from class: Directory
Construct a Lock.

Overrides:
makeLock in class Directory
Parameters:
name - the name of the lock file

clearLock

public void clearLock(String name)
               throws IOException
Description copied from class: Directory
Attempt to clear (forcefully unlock and remove) the specified lock. Only call this at a time when you are certain this lock is no longer in use.

Overrides:
clearLock in class Directory
Parameters:
name - name of the lock to be cleared.
Throws:
IOException

toString

public String toString()
Overrides:
toString in class Directory

listAll

public String[] listAll()
                 throws IOException
Description copied from class: Directory
Returns an array of strings, one for each file in the directory.

Specified by:
listAll in class Directory
Throws:
NoSuchDirectoryException - if the directory is not prepared for any write operations (such as Directory.createOutput(String)).
IOException - in case of other IO errors

sizeInBytes

public long sizeInBytes()
Returns how many bytes are being used by the RAMDirectory cache.


fileExists

public boolean fileExists(String name)
                   throws IOException
Description copied from class: Directory
Returns true iff a file with the given name exists.

Specified by:
fileExists in class Directory
Throws:
IOException

fileModified

public long fileModified(String name)
                  throws IOException
Description copied from class: Directory
Returns the time the named file was last modified.

Specified by:
fileModified in class Directory
Throws:
IOException

touchFile

@Deprecated
public void touchFile(String name)
               throws IOException
Deprecated. 

Description copied from class: Directory
Set the modified time of an existing file to now.

Specified by:
touchFile in class Directory
Throws:
IOException

deleteFile

public void deleteFile(String name)
                throws IOException
Description copied from class: Directory
Removes an existing file in the directory.

Specified by:
deleteFile in class Directory
Throws:
IOException

fileLength

public long fileLength(String name)
                throws IOException
Description copied from class: Directory
Returns the length of a file in the directory. Throws FileNotFoundException if the file does not exist.

Specified by:
fileLength in class Directory
Parameters:
name - the name of the file for which to return the length.
Throws:
FileNotFoundException - if the file does not exist.
IOException - if there was an IO error while retrieving the file's length.

listCachedFiles

public String[] listCachedFiles()

createOutput

public IndexOutput createOutput(String name)
                         throws IOException
Description copied from class: Directory
Creates a new, empty file in the directory with the given name. Returns a stream writing this file.

Specified by:
createOutput in class Directory
Throws:
IOException

sync

public void sync(Collection<String> fileNames)
          throws IOException
Description copied from class: Directory
Ensure that any writes to these files are moved to stable storage. Lucene uses this to properly commit changes to the index, to prevent a machine/OS crash from corrupting the index.

NOTE: Clients may call this method for the same files over and over again, so some implementations might optimize for that. For other implementations the operation can be a no-op, for various reasons.

Overrides:
sync in class Directory
Throws:
IOException

openInput

public IndexInput openInput(String name)
                     throws IOException
Description copied from class: Directory
Returns a stream reading an existing file.

Specified by:
openInput in class Directory
Throws:
IOException

openInput

public IndexInput openInput(String name,
                            int bufferSize)
                     throws IOException
Description copied from class: Directory
Returns a stream reading an existing file, with the specified read buffer size. The particular Directory implementation may ignore the buffer size. Currently the only Directory implementations that respect this parameter are FSDirectory and CompoundFileReader.

Overrides:
openInput in class Directory
Throws:
IOException

close

public void close()
           throws IOException
Close this directory, which flushes any cached files to the delegate and then closes the delegate.

Specified by:
close in interface Closeable
Specified by:
close in class Directory
Throws:
IOException

getMergeScheduler

public MergeScheduler getMergeScheduler()

doCacheWrite

protected boolean doCacheWrite(String name)
Subclass can override this to customize logic; return true if this file should be written to the RAMDirectory.
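
The override pattern can be sketched without Lucene on the classpath. Everything below is a hypothetical stand-in: CachingDirectoryStandIn only mimics the doCacheWrite hook, its default of always returning true is an assumption made for brevity, and destinationFor is an invented helper that reports where a write would go. The subclass keeps stored-field data files (the .fdt extension) out of the RAM cache.

```java
// Hypothetical stand-in that mimics only the doCacheWrite hook; not Lucene's class.
class CachingDirectoryStandIn {
    // Assumption: the default here caches everything, which keeps the sketch short.
    protected boolean doCacheWrite(String name) {
        return true;
    }

    // Invented helper: reports where a newly created file would be written.
    final String destinationFor(String name) {
        return doCacheWrite(name) ? "ram" : "delegate";
    }
}

// Subclass that never caches stored-field data files, deferring otherwise to the default logic.
final class NoStoredFieldsCaching extends CachingDirectoryStandIn {
    @Override
    protected boolean doCacheWrite(String name) {
        return !name.endsWith(".fdt") && super.doCacheWrite(name);
    }

    public static void main(String[] args) {
        CachingDirectoryStandIn dir = new NoStoredFieldsCaching();
        System.out.println(dir.destinationFor("_3.fdt")); // delegate
        System.out.println(dir.destinationFor("_3.tis")); // ram
    }
}
```

Calling super.doCacheWrite(name) last keeps the default decision in force for every file the subclass does not explicitly exclude.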