org.apache.lucene.analysis
Class BaseCharFilter
java.lang.Object
java.io.Reader
org.apache.lucene.analysis.CharStream
org.apache.lucene.analysis.CharFilter
org.apache.lucene.analysis.BaseCharFilter
- All Implemented Interfaces:
- Closeable, Readable
- Direct Known Subclasses:
- HTMLStripCharFilter, MappingCharFilter
public abstract class BaseCharFilter
extends CharFilter
Base utility class for implementing a CharFilter. Subclass it, record
offset correction mappings by calling addOffCorrectMap(int, int), and
then invoke the correct(int) method to correct an offset.

CharFilters modify an input stream via a series of substring
replacements (including deletions and insertions) to produce an output
stream. There are three possible replacement cases: the replacement
string has the same length as the original substring; the replacement
is shorter; and the replacement is longer. In the latter two cases
(when the replacement has a different length than the original),
one or more offset correction mappings are required.

When the replacement is shorter than the original (e.g. when the
replacement is the empty string), a single offset correction mapping
should be added at the replacement's end offset in the output stream.
The cumulativeDiff parameter to the addOffCorrectMap(int, int) method
will be the sum of all previous replacement offset adjustments, plus
the difference between the lengths of the original substring and the
replacement string (a positive value).
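
The shorter-replacement case can be illustrated with a small standalone model. This is only a sketch: the class ShorterReplacementSketch, its TreeMap storage, and the "nearest mapping at or before the offset" lookup rule are assumptions made for illustration, not the Lucene implementation. Only the method names addOffCorrectMap, correct, and getLastCumulativeDiff are taken from this page. The example assumes the input "a&amp;b" is filtered to the output "a&b".

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical standalone model of offset correction; not the Lucene class. */
class ShorterReplacementSketch {
    // Output offset -> cumulative diff in effect from that offset onward.
    private final TreeMap<Integer, Integer> corrections = new TreeMap<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        corrections.put(off, cumulativeDiff);
    }

    int getLastCumulativeDiff() {
        return corrections.isEmpty() ? 0 : corrections.lastEntry().getValue();
    }

    /** Assumed rule: apply the diff of the nearest mapping at or before currentOff. */
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = corrections.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }

    /** Records the single mapping needed for "a&amp;b" -> "a&b". */
    static ShorterReplacementSketch example() {
        ShorterReplacementSketch f = new ShorterReplacementSketch();
        int originalLength = "&amp;".length();      // 5
        int replacementLength = "&".length();       // 1
        // The replacement "&" occupies output offsets [1, 2), so it ends at 2.
        int replacementEndInOutput = 1 + replacementLength;
        // Sum of previous adjustments (0 here) plus a positive difference (4).
        int cumulativeDiff = f.getLastCumulativeDiff()
                + (originalLength - replacementLength);
        f.addOffCorrectMap(replacementEndInOutput, cumulativeDiff);
        return f;
    }

    public static void main(String[] args) {
        ShorterReplacementSketch f = example();
        for (int off = 0; off <= 3; off++) {
            System.out.println(off + " -> " + f.correct(off));
        }
    }
}
```

In this model, output offset 2 (the "b" in "a&b") is corrected to input offset 6 (the "b" in "a&amp;b"), while offsets before the mapping are returned unchanged.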

When the replacement is longer than the original (e.g. when the
original is the empty string), you should add as many offset
correction mappings as the difference between the lengths of the
replacement string and the original substring, starting at the
end offset the original substring would have had in the output stream.
The cumulativeDiff parameter to the addOffCorrectMap(int, int) method
will be the sum of all previous replacement offset adjustments, plus
the difference between the lengths of the original substring and the
replacement string so far (a negative value).

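
The longer-replacement case can be sketched the same way. As before, this is a hypothetical standalone model: LongerReplacementSketch, its TreeMap storage, and the lookup rule in correct are assumptions for illustration, not the Lucene implementation. The example assumes the input "a&b" is filtered to the output "aandb", so two extra characters need two mappings.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical standalone model of offset correction; not the Lucene class. */
class LongerReplacementSketch {
    // Output offset -> cumulative diff in effect from that offset onward.
    private final TreeMap<Integer, Integer> corrections = new TreeMap<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        corrections.put(off, cumulativeDiff);
    }

    int getLastCumulativeDiff() {
        return corrections.isEmpty() ? 0 : corrections.lastEntry().getValue();
    }

    /** Assumed rule: apply the diff of the nearest mapping at or before currentOff. */
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = corrections.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }

    /** Records the mappings needed for "a&b" -> "aandb". */
    static LongerReplacementSketch example() {
        LongerReplacementSketch f = new LongerReplacementSketch();
        int originalLength = "&".length();          // 1
        int replacementLength = "and".length();     // 3
        int previousDiff = f.getLastCumulativeDiff(); // 0, no earlier replacements
        // End offset the original "&" would have had in the output: 1 + 1 = 2.
        int originalEndInOutput = 1 + originalLength;
        int extraChars = replacementLength - originalLength; // 2 mappings needed
        for (int i = 0; i < extraChars; i++) {
            // One mapping per extra character, each diff one step more negative.
            f.addOffCorrectMap(originalEndInOutput + i, previousDiff - (i + 1));
        }
        return f;
    }

    public static void main(String[] args) {
        LongerReplacementSketch f = example();
        for (int off = 0; off <= 5; off++) {
            System.out.println(off + " -> " + f.correct(off));
        }
    }
}
```

In this model, every character of "and" (output offsets 1 to 3) is corrected to input offset 1, the position of "&", and output offset 4 (the "b") is corrected to input offset 2.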
Method Summary
protected void addOffCorrectMap(int off, int cumulativeDiff)
- Adds an offset correction mapping at the given output stream offset.
protected int correct(int currentOff)
- Retrieve the corrected offset.
protected int getLastCumulativeDiff()
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
BaseCharFilter
public BaseCharFilter(CharStream in)
correct
protected int correct(int currentOff)
- Retrieve the corrected offset.
- Overrides: correct in class CharFilter
- Parameters:
currentOff - current offset
- Returns: corrected offset
getLastCumulativeDiff
protected int getLastCumulativeDiff()
addOffCorrectMap
protected void addOffCorrectMap(int off, int cumulativeDiff)
Adds an offset correction mapping at the given output stream offset.
Assumption: the offset given with each successive call to this method
will not be smaller than the offset given at the previous invocation.
- Parameters:
off - The output stream offset at which to apply the correction
cumulativeDiff - The input offset is given by adding this to the output offset
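
The relationship "input offset = output offset + cumulativeDiff" and the non-decreasing offset assumption can be sketched with two successive replacements. This is a hypothetical standalone model, not the Lucene implementation: CumulativeDiffSketch, its TreeMap storage, the lookup rule in correct, and the choice to throw IllegalArgumentException on a decreasing offset are all assumptions made for illustration. The example assumes the input "x&amp;y&amp;z" is filtered to the output "x&y&z".

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical standalone model of offset correction; not the Lucene class. */
class CumulativeDiffSketch {
    // Output offset -> cumulative diff in effect from that offset onward.
    private final TreeMap<Integer, Integer> corrections = new TreeMap<>();

    void addOffCorrectMap(int off, int cumulativeDiff) {
        // Sketch-only check of the documented assumption: offsets never decrease.
        if (!corrections.isEmpty() && off < corrections.lastKey()) {
            throw new IllegalArgumentException(
                    "offset " + off + " is smaller than previous offset "
                            + corrections.lastKey());
        }
        corrections.put(off, cumulativeDiff);
    }

    int getLastCumulativeDiff() {
        return corrections.isEmpty() ? 0 : corrections.lastEntry().getValue();
    }

    /** Assumed rule: input offset = output offset + diff of nearest earlier mapping. */
    int correct(int currentOff) {
        Map.Entry<Integer, Integer> e = corrections.floorEntry(currentOff);
        return e == null ? currentOff : currentOff + e.getValue();
    }

    /** Records the mappings needed for "x&amp;y&amp;z" -> "x&y&z". */
    static CumulativeDiffSketch example() {
        CumulativeDiffSketch f = new CumulativeDiffSketch();
        int shrink = "&amp;".length() - "&".length(); // 4 per replacement
        // First "&" occupies output [1, 2): mapping at 2 with diff 0 + 4 = 4.
        f.addOffCorrectMap(2, f.getLastCumulativeDiff() + shrink);
        // Second "&" occupies output [3, 4): mapping at 4 with diff 4 + 4 = 8.
        f.addOffCorrectMap(4, f.getLastCumulativeDiff() + shrink);
        return f;
    }

    public static void main(String[] args) {
        CumulativeDiffSketch f = example();
        for (int off = 0; off <= 5; off++) {
            System.out.println(off + " -> " + f.correct(off));
        }
    }
}
```

Each mapping carries the running total of all adjustments so far, which is why the second mapping uses 8 and not 4. In this model, output offset 4 (the "z") is corrected to input offset 12.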