org.apache.lucene.analysis
Class BaseCharFilter

java.lang.Object
  extended by java.io.Reader
      extended by org.apache.lucene.analysis.CharStream
          extended by org.apache.lucene.analysis.CharFilter
              extended by org.apache.lucene.analysis.BaseCharFilter
All Implemented Interfaces:
Closeable, Readable
Direct Known Subclasses:
HTMLStripCharFilter, MappingCharFilter

public abstract class BaseCharFilter
extends CharFilter

Base utility class for implementing a CharFilter. Subclass it, record mappings by calling addOffCorrectMap(int, int), and then call correct(int) to correct an offset.

CharFilters modify an input stream via a series of substring replacements (including deletions and insertions) to produce an output stream. There are three possible replacement cases: the replacement string has the same length as the original substring; the replacement is shorter; and the replacement is longer. In the latter two cases (when the replacement has a different length than the original), one or more offset correction mappings are required.

When the replacement is shorter than the original (e.g. when the replacement is the empty string), a single offset correction mapping should be added at the replacement's end offset in the output stream. The cumulativeDiff parameter to the addOffCorrectMap() method will be the sum of all previous replacement offset adjustments, with the addition of the difference between the lengths of the original substring and the replacement string (a positive value).

When the replacement is longer than the original (e.g. when the original is the empty string), you should add as many offset correction mappings as the difference between the lengths of the replacement string and the original substring, starting at the end offset the original substring would have had in the output stream. The cumulativeDiff parameter to the addOffCorrectMap() method will be the sum of all previous replacement offset adjustments, with the addition of the difference between the lengths of the original substring and the replacement string so far (a negative value).
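
The two cases above can be worked through with a small standalone sketch. It does not use Lucene: OffsetCorrectionSketch, shorterCase() and longerCase() are hypothetical names, and the lookup rule in correct (apply the cumulativeDiff of the last mapping at or before the queried offset) is an assumption made for illustration, not a statement about the library's internals.

```java
import java.util.Map;
import java.util.TreeMap;

/** Standalone stand-in for the offset bookkeeping; not Lucene's implementation. */
final class OffsetCorrectionSketch {
    // Output offset -> cumulative diff that applies from that offset onward.
    private final TreeMap<Integer, Integer> corrections = new TreeMap<>();

    /** Follows the documented (off, cumulativeDiff) parameter meaning. */
    void addOffCorrectMap(int off, int cumulativeDiff) {
        corrections.put(off, cumulativeDiff);
    }

    /** Assumed rule: use the last mapping whose offset is <= currentOff. */
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = corrections.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }

    /** Shorter case: input "a<b>c" becomes "ac" ("<b>" replaced by ""). */
    static OffsetCorrectionSketch shorterCase() {
        OffsetCorrectionSketch m = new OffsetCorrectionSketch();
        // The empty replacement ends at output offset 1.
        // cumulativeDiff = 0 (no earlier adjustments) + (3 - 0) = 3.
        m.addOffCorrectMap(1, 3);
        return m;
    }

    /** Longer case: input "a&c" becomes "aandc" ("&" replaced by "and"). */
    static OffsetCorrectionSketch longerCase() {
        OffsetCorrectionSketch m = new OffsetCorrectionSketch();
        // Length difference is 3 - 1 = 2, so two mappings are added,
        // starting at output offset 2, where the original "&" would have ended.
        m.addOffCorrectMap(2, -1); // diff so far: 1 - 2
        m.addOffCorrectMap(3, -2); // diff so far: 1 - 3
        return m;
    }

    public static void main(String[] args) {
        OffsetCorrectionSketch shorter = shorterCase();
        for (int off = 0; off <= 2; off++) {
            System.out.println("shorter: output " + off + " -> input " + shorter.correct(off));
        }
        OffsetCorrectionSketch longer = longerCase();
        for (int off = 0; off <= 5; off++) {
            System.out.println("longer: output " + off + " -> input " + longer.correct(off));
        }
    }
}
```

In the shorter case, output offset 1 (the "c") maps back to input offset 4. In the longer case, output offsets 1 through 3 (the letters of "and") all map back to input offset 1, the position of the original "&".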


Field Summary
 
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
 
Fields inherited from class java.io.Reader
lock
 
Constructor Summary
BaseCharFilter(CharStream in)
           
 
Method Summary
protected  void addOffCorrectMap(int off, int cumulativeDiff)
           Adds an offset correction mapping at the given output stream offset.
protected  int correct(int currentOff)
          Retrieve the corrected offset.
protected  int getLastCumulativeDiff()
           
 
Methods inherited from class org.apache.lucene.analysis.CharFilter
close, correctOffset, mark, markSupported, read, reset
 
Methods inherited from class java.io.Reader
read, read, read, ready, skip
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BaseCharFilter

public BaseCharFilter(CharStream in)

Method Detail

correct

protected int correct(int currentOff)
Retrieve the corrected offset.

Overrides:
correct in class CharFilter
Parameters:
currentOff - current offset
Returns:
corrected offset

getLastCumulativeDiff

protected int getLastCumulativeDiff()

addOffCorrectMap

protected void addOffCorrectMap(int off,
                                int cumulativeDiff)

Adds an offset correction mapping at the given output stream offset.

Assumption: the offset given with each successive call to this method will not be smaller than the offset given at the previous invocation.

Parameters:
off - The output stream offset at which to apply the correction
cumulativeDiff - The input offset is given by adding this to the output offset
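
As an illustration of how the two parameters accumulate over several deletions, here is a minimal standalone sketch that removes one character from a string and records a mapping for each removal. StripCharSketch and strip are hypothetical names, the class does not extend BaseCharFilter, and the behavior given to getLastCumulativeDiff (return the most recently recorded cumulativeDiff, or 0 if none) is an assumption, since this page does not describe that method.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch of recording mappings while deleting characters; not Lucene code. */
final class StripCharSketch {
    // Each entry is {output offset, cumulativeDiff}, in non-decreasing offset order.
    private final List<int[]> mappings = new ArrayList<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        // Enforce the documented assumption: offsets never decrease between calls.
        if (!mappings.isEmpty() && off < mappings.get(mappings.size() - 1)[0]) {
            throw new IllegalArgumentException("offset " + off + " is smaller than the previous one");
        }
        mappings.add(new int[] {off, cumulativeDiff});
    }

    /** Assumed behavior: the most recently recorded cumulativeDiff, or 0 if none. */
    int getLastCumulativeDiff() {
        return mappings.isEmpty() ? 0 : mappings.get(mappings.size() - 1)[1];
    }

    /** Input offset = output offset + cumulativeDiff of the last applicable mapping. */
    int correct(int currentOff) {
        int diff = 0;
        for (int[] m : mappings) {
            if (m[0] > currentOff) {
                break;
            }
            diff = m[1];
        }
        return currentOff + diff;
    }

    /** Deletes every occurrence of unwanted, recording one mapping per deletion. */
    String strip(String input, char unwanted) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == unwanted) {
                // The empty replacement ends at the current output length;
                // each deletion adds (1 - 0) to the running total.
                addOffCorrectMap(out.length(), getLastCumulativeDiff() + 1);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "a-b--c";
        StripCharSketch filter = new StripCharSketch();
        String output = filter.strip(input, '-');
        for (int off = 0; off < output.length(); off++) {
            int inOff = filter.correct(off);
            System.out.println(output.charAt(off) + ": output " + off + " -> input " + inOff);
        }
    }
}
```

Two consecutive deletions produce two mappings at the same output offset, which is consistent with the assumption that offsets are not smaller than the previous one; in this sketch the later mapping wins during lookup.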