org.apache.lucene.index
Class TermVectorMapper

java.lang.Object
  extended by org.apache.lucene.index.TermVectorMapper
Direct Known Subclasses:
FieldSortedTermVectorMapper, PositionBasedTermVectorMapper, SortedTermVectorMapper

public abstract class TermVectorMapper
extends Object

The TermVectorMapper can be used to map Term Vectors into your own structure instead of the parallel array structure used by IndexReader.getTermFreqVector(int,String).

It is up to the implementation to make sure it is thread-safe.


Constructor Summary
protected TermVectorMapper()
           
protected TermVectorMapper(boolean ignoringPositions, boolean ignoringOffsets)
           
 
Method Summary
 boolean isIgnoringOffsets()
          Same principle as isIgnoringPositions(), but applied to offsets.
 boolean isIgnoringPositions()
          Indicate to Lucene that even if there are positions stored, this mapper is not interested in them and they can be skipped over.
abstract  void map(String term, int frequency, TermVectorOffsetInfo[] offsets, int[] positions)
          Map the Term Vector information into your own structure.
 void setDocumentNumber(int documentNumber)
          Passes down the index of the document whose term vector is currently being mapped, once for each top level call to a term vector reader.
abstract  void setExpectations(String field, int numTerms, boolean storeOffsets, boolean storePositions)
          Tell the mapper what to expect with regard to field, number of terms, offset and position storage.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TermVectorMapper

protected TermVectorMapper()

TermVectorMapper

protected TermVectorMapper(boolean ignoringPositions,
                           boolean ignoringOffsets)
Parameters:
ignoringPositions - true if this mapper should tell Lucene to ignore positions even if they are stored
ignoringOffsets - true if this mapper should tell Lucene to ignore offsets even if they are stored
Method Detail

setExpectations

public abstract void setExpectations(String field,
                                     int numTerms,
                                     boolean storeOffsets,
                                     boolean storePositions)
Tell the mapper what to expect with regard to field, number of terms, offset and position storage. This method is called once before the vector for a field is retrieved, and always before map(String,int,TermVectorOffsetInfo[],int[]).

Parameters:
field - The field the vector is for
numTerms - The number of terms that need to be mapped
storeOffsets - true if the mapper should expect offset information
storePositions - true if the mapper should expect positions info

map

public abstract void map(String term,
                         int frequency,
                         TermVectorOffsetInfo[] offsets,
                         int[] positions)
Map the Term Vector information into your own structure.

Parameters:
term - The term to add to the vector
frequency - The frequency of the term in the document
offsets - null if offsets are not specified, otherwise the offsets of the term within the field
positions - null if positions are not specified, otherwise the positions of the term within the field
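
The call order described above (setExpectations once per field, then map once per term) can be illustrated with a minimal sketch. It is not taken from Lucene itself: TermFrequencyMapper and TermFrequencyDemo are hypothetical names, and the two small stand-in classes exist only so the example compiles without Lucene on the classpath. With Lucene available, the stand-ins would be dropped in favour of the real org.apache.lucene.index classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for org.apache.lucene.index.TermVectorOffsetInfo (assumption: only
// needed here so that the sketch compiles on its own).
class TermVectorOffsetInfo {}

// Stand-in that mirrors the two abstract methods documented on this page.
abstract class TermVectorMapper {
    public abstract void setExpectations(String field, int numTerms,
                                         boolean storeOffsets, boolean storePositions);

    public abstract void map(String term, int frequency,
                             TermVectorOffsetInfo[] offsets, int[] positions);
}

// Hypothetical mapper that collects term -> frequency for the current field.
class TermFrequencyMapper extends TermVectorMapper {
    private final Map<String, Integer> frequencies = new LinkedHashMap<>();
    private String field;

    @Override
    public void setExpectations(String field, int numTerms,
                                boolean storeOffsets, boolean storePositions) {
        // Called once before the terms of a field are mapped: reset state.
        this.field = field;
        frequencies.clear();
    }

    @Override
    public void map(String term, int frequency,
                    TermVectorOffsetInfo[] offsets, int[] positions) {
        // offsets and positions may be null; this mapper does not use them.
        frequencies.put(term, frequency);
    }

    String getField() { return field; }

    Map<String, Integer> getFrequencies() { return frequencies; }
}

class TermFrequencyDemo {
    // Simulates the calls a term vector reader would make for one field.
    static Map<String, Integer> run() {
        TermFrequencyMapper mapper = new TermFrequencyMapper();
        mapper.setExpectations("body", 2, false, false);
        mapper.map("lucene", 3, null, null);
        mapper.map("index", 1, null, null);
        return mapper.getFrequencies();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the mapper is driven by hand; in real use a term vector reader would make these calls.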

isIgnoringPositions

public boolean isIgnoringPositions()
Indicate to Lucene that even if there are positions stored, this mapper is not interested in them and they can be skipped over. Derived classes should set this to true if they want to ignore positions. The default is false, meaning positions will be loaded if they are stored.

Returns:
false

isIgnoringOffsets

public boolean isIgnoringOffsets()
Same principle as isIgnoringPositions(), but applied to offsets. false by default.

Returns:
false
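
A mapper that only needs the terms themselves might opt out of both positions and offsets through the two-argument constructor. The sketch below assumes the base class simply stores the two flags and returns them from the getters; that stand-in is an illustration of the documented behaviour, not Lucene's source. TermsOnlyMapper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.lucene.index.TermVectorOffsetInfo (assumption).
class TermVectorOffsetInfo {}

// Stand-in mirroring the constructors and getters documented on this page.
// How the real class stores the flags is an assumption made for this sketch.
abstract class TermVectorMapper {
    private final boolean ignoringPositions;
    private final boolean ignoringOffsets;

    protected TermVectorMapper() {
        this(false, false); // documented default: nothing is ignored
    }

    protected TermVectorMapper(boolean ignoringPositions, boolean ignoringOffsets) {
        this.ignoringPositions = ignoringPositions;
        this.ignoringOffsets = ignoringOffsets;
    }

    public boolean isIgnoringPositions() { return ignoringPositions; }

    public boolean isIgnoringOffsets() { return ignoringOffsets; }

    public abstract void setExpectations(String field, int numTerms,
                                         boolean storeOffsets, boolean storePositions);

    public abstract void map(String term, int frequency,
                             TermVectorOffsetInfo[] offsets, int[] positions);
}

// Hypothetical mapper that records terms only and asks for positions and
// offsets to be skipped even if they are stored.
class TermsOnlyMapper extends TermVectorMapper {
    private final List<String> terms = new ArrayList<>();

    TermsOnlyMapper() {
        super(true, true);
    }

    @Override
    public void setExpectations(String field, int numTerms,
                                boolean storeOffsets, boolean storePositions) {
        terms.clear();
    }

    @Override
    public void map(String term, int frequency,
                    TermVectorOffsetInfo[] offsets, int[] positions) {
        terms.add(term);
    }

    List<String> getTerms() { return terms; }
}
```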

setDocumentNumber

public void setDocumentNumber(int documentNumber)
Passes down the index of the document whose term vector is currently being mapped, once for each top level call to a term vector reader.

Default implementation IGNORES the document number. Override if your implementation needs the document number.

NOTE: Document numbers are internal to Lucene and subject to change depending on indexing operations.

Parameters:
documentNumber - index of document currently being mapped
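
As a rough illustration of overriding setDocumentNumber, the sketch below groups mapped terms by the document number passed down before mapping starts. PerDocumentTermMapper is a hypothetical name, and the stand-in base class (with a no-op setDocumentNumber, as the default is described above) is included only so the example compiles without Lucene.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for org.apache.lucene.index.TermVectorOffsetInfo (assumption).
class TermVectorOffsetInfo {}

// Stand-in mirroring the methods documented on this page.
abstract class TermVectorMapper {
    public void setDocumentNumber(int documentNumber) {
        // Default implementation ignores the document number, as documented.
    }

    public abstract void setExpectations(String field, int numTerms,
                                         boolean storeOffsets, boolean storePositions);

    public abstract void map(String term, int frequency,
                             TermVectorOffsetInfo[] offsets, int[] positions);
}

// Hypothetical mapper that remembers which document each term came from.
class PerDocumentTermMapper extends TermVectorMapper {
    private final Map<Integer, List<String>> termsByDocument = new HashMap<>();
    private int currentDocument = -1; // -1 until a document number is passed down

    @Override
    public void setDocumentNumber(int documentNumber) {
        currentDocument = documentNumber;
    }

    @Override
    public void setExpectations(String field, int numTerms,
                                boolean storeOffsets, boolean storePositions) {
        // Nothing to prepare in this sketch.
    }

    @Override
    public void map(String term, int frequency,
                    TermVectorOffsetInfo[] offsets, int[] positions) {
        termsByDocument
            .computeIfAbsent(currentDocument, k -> new ArrayList<>())
            .add(term);
    }

    Map<Integer, List<String>> getTermsByDocument() { return termsByDocument; }
}
```

Because document numbers are internal to Lucene and can change, a structure keyed on them like this one is best treated as short-lived.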