| Class Summary | |
|---|---|
| ASCIIFoldingFilter | Converts alphabetic, numeric, and symbolic Unicode characters that are not among the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, where such equivalents exist. |
| ASCIIFoldingFilterFactory | Factory for ASCIIFoldingFilter. |
| CapitalizationFilter | A filter to apply normal capitalization rules to Tokens. |
| CapitalizationFilterFactory | Factory for CapitalizationFilter. |
| EmptyTokenStream | An always exhausted token stream. |
| HyphenatedWordsFilter | Rejoins words that were hyphenated and broken across two lines, as often happens when plain text is extracted from documents. |
| HyphenatedWordsFilterFactory | Factory for HyphenatedWordsFilter. |
| KeepWordFilter | A TokenFilter that only keeps tokens with text contained in the required words. |
| KeepWordFilterFactory | Factory for KeepWordFilter. |
| KeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
| KeywordMarkerFilterFactory | Factory for KeywordMarkerFilter. |
| LengthFilter | Removes words that are too long or too short from the stream. |
| LengthFilterFactory | Factory for LengthFilter. |
| LimitTokenCountAnalyzer | This Analyzer limits the number of tokens while indexing. |
| LimitTokenCountFilter | This TokenFilter limits the number of tokens while indexing. |
| LimitTokenCountFilterFactory | Factory for LimitTokenCountFilter. |
| PatternAnalyzer | Deprecated (4.0). Use the pattern-based analysis in the analysis/pattern package instead. |
| PerFieldAnalyzerWrapper | This analyzer is used to facilitate scenarios where different fields require different analysis techniques. |
| PrefixAndSuffixAwareTokenFilter | Links two PrefixAwareTokenFilter instances. |
| PrefixAwareTokenFilter | Joins two token streams and leaves the last token of the first stream available to be used when updating the token values in the second stream based on that token. |
| RemoveDuplicatesTokenFilter | A TokenFilter that filters out tokens with the same position and term text as the previous token in the stream. |
| RemoveDuplicatesTokenFilterFactory | Factory for RemoveDuplicatesTokenFilter. |
| SingleTokenTokenStream | A TokenStream containing a single token. |
| StemmerOverrideFilter | Provides the ability to override any KeywordAttribute aware stemmer with custom dictionary-based stemming. |
| StemmerOverrideFilterFactory | Factory for StemmerOverrideFilter. |
| TrimFilter | Trims leading and trailing whitespace from Tokens in the stream. |
| TrimFilterFactory | Factory for TrimFilter. |
| WordDelimiterFilter | Splits words into subwords and performs optional transformations on subword groups. |
| WordDelimiterFilterFactory | Factory for WordDelimiterFilter. |
| WordDelimiterIterator | A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterFilter rules. |
Miscellaneous TokenStreams
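
To make a few of the one-line summaries above more concrete, here is a minimal sketch in plain Java of what LengthFilter, KeepWordFilter, TrimFilter, and LimitTokenCountFilter are described as doing. It deliberately does not use the Lucene API: the class `TokenFilterSketch` and its method names are hypothetical, tokens are modeled as plain strings rather than a TokenStream, and the inclusive length bounds are an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; these are not Lucene classes or signatures.
final class TokenFilterSketch {

    // LengthFilter-style: drop tokens that are too short or too long.
    // Bounds are treated as inclusive here (an assumption of this sketch).
    static List<String> lengthFilter(List<String> tokens, int min, int max) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (t.length() >= min && t.length() <= max) {
                out.add(t);
            }
        }
        return out;
    }

    // KeepWordFilter-style: keep only tokens contained in the required words.
    static List<String> keepWords(List<String> tokens, Set<String> required) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (required.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // TrimFilter-style: strip leading and trailing whitespace from each token.
    static List<String> trim(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.trim());
        }
        return out;
    }

    // LimitTokenCountFilter-style: pass through at most maxTokenCount tokens.
    static List<String> limitTokenCount(List<String> tokens, int maxTokenCount) {
        int end = Math.max(0, Math.min(maxTokenCount, tokens.size()));
        return new ArrayList<>(tokens.subList(0, end));
    }
}
```

Chaining these calls mirrors, in spirit, how token filters are stacked on a stream: for example, trimming first and then applying the length filter, so that surrounding whitespace does not count toward a token's length.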