|
Packages that use TokenStream | |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.bg | Analyzer for Bulgarian. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams (overlapping groups of two adjacent Han characters). |
org.apache.lucene.analysis.cn | Analyzer for Chinese, which indexes unigrams (individual Chinese characters). |
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.compound | A filter that decomposes compound words you find in many Germanic languages into the word parts. |
org.apache.lucene.analysis.cz | Analyzer for Czech. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.en | Analyzer for English. |
org.apache.lucene.analysis.es | Analyzer for Spanish. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fi | Analyzer for Finnish. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ga | Analysis for Irish. |
org.apache.lucene.analysis.gl | Analyzer for Galician. |
org.apache.lucene.analysis.hi | Analyzer for Hindi. |
org.apache.lucene.analysis.hu | Analyzer for Hungarian. |
org.apache.lucene.analysis.hunspell | Stemming TokenFilter using a Java implementation of the Hunspell stemming algorithm. |
org.apache.lucene.analysis.icu | Analysis components based on ICU |
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
org.apache.lucene.analysis.id | Analyzer for Indonesian. |
org.apache.lucene.analysis.in | Analysis components for Indian languages. |
org.apache.lucene.analysis.it | Analyzer for Italian. |
org.apache.lucene.analysis.ja | Analyzer for Japanese. |
org.apache.lucene.analysis.lv | Analyzer for Latvian. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.no | Analyzer for Norwegian. |
org.apache.lucene.analysis.path | Analysis components for path-like strings such as filenames. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.phonetic | Analysis components for phonetic search. |
org.apache.lucene.analysis.position | Filter for assigning position increments. |
org.apache.lucene.analysis.pt | Analyzer for Portuguese. |
org.apache.lucene.analysis.query | Automatically filter high-frequency stopwords. |
org.apache.lucene.analysis.reverse | Filter to reverse token text. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | Word n-gram filters |
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
org.apache.lucene.analysis.standard | Standards-based analyzers implemented with JFlex. |
org.apache.lucene.analysis.stempel | Stempel: Algorithmic Stemmer |
org.apache.lucene.analysis.sv | Analyzer for Swedish. |
org.apache.lucene.analysis.synonym | Analysis components for Synonyms. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.tr | Analyzer for Turkish. |
org.apache.lucene.analysis.wikipedia | Tokenizer that is aware of Wikipedia syntax. |
org.apache.lucene.collation | CollationKeyFilter converts each token into its binary CollationKey using the provided Collator, and then encodes the CollationKey as a String using IndexableBinaryStringTools, to allow it to be stored as an index term. |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
org.apache.lucene.facet.enhancements | Enhanced category features |
org.apache.lucene.facet.enhancements.association | Association category enhancements |
org.apache.lucene.facet.index | Indexing of document categories |
org.apache.lucene.facet.index.streaming | Expert: attributes streaming definition for indexing facets |
org.apache.lucene.index.memory | High-performance single-document main memory Apache Lucene fulltext search index. |
org.apache.lucene.queryParser | A simple query parser implemented with JavaCC. |
org.apache.lucene.search.highlight | The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages. |
Uses of TokenStream in org.apache.lucene.analysis |
---|
Subclasses of TokenStream in org.apache.lucene.analysis | |
---|---|
class |
ASCIIFoldingFilter
This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists. |
class |
CachingTokenFilter
This class can be used if the token attributes of a TokenStream are intended to be consumed more than once. |
class |
CannedTokenStream
TokenStream from a canned list of Tokens. |
class |
CharTokenizer
An abstract base class for simple, character-oriented tokenizers. |
class |
EmptyTokenizer
Emits no tokens |
class |
FilteringTokenFilter
Abstract base class for TokenFilters that may remove tokens. |
class |
ISOLatin1AccentFilter
Deprecated. If you build a new index, use ASCIIFoldingFilter
which covers a superset of Latin 1.
This class is included for use with existing
indexes and will be removed in a future release (possibly Lucene 4.0). |
class |
KeywordMarkerFilter
Marks terms as keywords via the KeywordAttribute . |
class |
KeywordTokenizer
Emits the entire input as a single token. |
class |
LengthFilter
Removes words that are too long or too short from the stream. |
class |
LetterTokenizer
A LetterTokenizer is a tokenizer that divides text at non-letters. |
class |
LimitTokenCountFilter
This TokenFilter limits the number of tokens while indexing. |
class |
LowerCaseFilter
Normalizes token text to lower case. |
class |
LowerCaseTokenizer
LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. |
class |
MockFixedLengthPayloadFilter
TokenFilter that adds random fixed-length payloads. |
class |
MockTokenizer
Tokenizer for testing. |
class |
MockVariableLengthPayloadFilter
TokenFilter that adds random variable-length payloads. |
class |
NumericTokenStream
Expert: This class provides a TokenStream
for indexing numeric values that can be used by NumericRangeQuery or NumericRangeFilter . |
class |
PorterStemFilter
Transforms the token stream as per the Porter stemming algorithm. |
class |
StopFilter
Removes stop words from a token stream. |
class |
TeeSinkTokenFilter
This TokenFilter provides the ability to set aside attribute states that have already been analyzed. |
static class |
TeeSinkTokenFilter.SinkTokenStream
TokenStream output from a tee with optional filtering. |
class |
TokenFilter
A TokenFilter is a TokenStream whose input is another TokenStream. |
class |
Tokenizer
A Tokenizer is a TokenStream whose input is a Reader. |
class |
TypeTokenFilter
Removes tokens whose types appear in a set of blocked types from a token stream. |
class |
WhitespaceTokenizer
A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
Fields in org.apache.lucene.analysis declared as TokenStream | |
---|---|
protected TokenStream |
TokenFilter.input
The source of tokens for this filter. |
protected TokenStream |
ReusableAnalyzerBase.TokenStreamComponents.sink
|
Methods in org.apache.lucene.analysis that return TokenStream | |
---|---|
protected TokenStream |
ReusableAnalyzerBase.TokenStreamComponents.getTokenStream()
Returns the sink TokenStream |
TokenStream |
MockAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
ReusableAnalyzerBase.reusableTokenStream(String fieldName,
Reader reader)
This method uses ReusableAnalyzerBase.createComponents(String, Reader) to obtain an
instance of ReusableAnalyzerBase.TokenStreamComponents . |
TokenStream |
PerFieldAnalyzerWrapper.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
Analyzer.reusableTokenStream(String fieldName,
Reader reader)
Creates a TokenStream that is allowed to be re-used from the previous time that the same thread called this method. |
TokenStream |
LimitTokenCountAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
MockAnalyzer.tokenStream(String fieldName,
Reader reader)
|
TokenStream |
ReusableAnalyzerBase.tokenStream(String fieldName,
Reader reader)
This method uses ReusableAnalyzerBase.createComponents(String, Reader) to obtain an
instance of ReusableAnalyzerBase.TokenStreamComponents and returns the sink of the
components. |
TokenStream |
PerFieldAnalyzerWrapper.tokenStream(String fieldName,
Reader reader)
|
abstract TokenStream |
Analyzer.tokenStream(String fieldName,
Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader. |
TokenStream |
LimitTokenCountAnalyzer.tokenStream(String fieldName,
Reader reader)
|
Methods in org.apache.lucene.analysis with parameters of type TokenStream | |
---|---|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] posIncrements)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
int[] posIncrements)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
int[] posIncrements,
int[] posLengths,
Integer finalOffset)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
int[] posIncrements,
Integer finalOffset)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
Integer finalOffset)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
String[] types,
int[] posIncrements)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
String[] types,
int[] posIncrements,
int[] posLengths,
Integer finalOffset)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
int[] startOffsets,
int[] endOffsets,
String[] types,
int[] posIncrements,
Integer finalOffset)
|
static void |
BaseTokenStreamTestCase.assertTokenStreamContents(TokenStream ts,
String[] output,
String[] types)
|
Constructors in org.apache.lucene.analysis with parameters of type TokenStream | |
---|---|
ASCIIFoldingFilter(TokenStream input)
|
|
CachingTokenFilter(TokenStream input)
|
|
FilteringTokenFilter(boolean enablePositionIncrements,
TokenStream input)
|
|
ISOLatin1AccentFilter(TokenStream input)
Deprecated. |
|
KeywordMarkerFilter(TokenStream in,
CharArraySet keywordSet)
Creates a new KeywordMarkerFilter that marks the current token as a keyword via the KeywordAttribute if the token's term buffer is contained in the given set. |
|
KeywordMarkerFilter(TokenStream in,
Set<?> keywordSet)
Creates a new KeywordMarkerFilter that marks the current token as a keyword via the KeywordAttribute if the token's term buffer is contained in the given set. |
|
LengthFilter(boolean enablePositionIncrements,
TokenStream in,
int min,
int max)
Build a filter that removes words that are too long or too short from the text. |
|
LengthFilter(TokenStream in,
int min,
int max)
Deprecated. Use LengthFilter.LengthFilter(boolean, TokenStream, int, int) instead. |
|
LimitTokenCountFilter(TokenStream in,
int maxTokenCount)
Build a filter that only accepts tokens up to a maximum number. |
|
LowerCaseFilter(TokenStream in)
Deprecated. Use LowerCaseFilter.LowerCaseFilter(Version, TokenStream) instead. |
|
LowerCaseFilter(Version matchVersion,
TokenStream in)
Creates a new LowerCaseFilter that normalizes token text to lower case. |
|
MockFixedLengthPayloadFilter(Random random,
TokenStream in,
int length)
|
|
MockVariableLengthPayloadFilter(Random random,
TokenStream in)
|
|
PorterStemFilter(TokenStream in)
|
|
ReusableAnalyzerBase.TokenStreamComponents(Tokenizer source,
TokenStream result)
Creates a new ReusableAnalyzerBase.TokenStreamComponents instance. |
|
StopFilter(boolean enablePositionIncrements,
TokenStream in,
Set<?> stopWords)
Deprecated. use StopFilter.StopFilter(Version, TokenStream, Set) instead |
|
StopFilter(boolean enablePositionIncrements,
TokenStream input,
Set<?> stopWords,
boolean ignoreCase)
Deprecated. Use StopFilter.StopFilter(Version, TokenStream, Set) instead |
|
StopFilter(Version matchVersion,
TokenStream in,
Set<?> stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set. |
|
StopFilter(Version matchVersion,
TokenStream input,
Set<?> stopWords,
boolean ignoreCase)
Deprecated. Use StopFilter.StopFilter(Version, TokenStream, Set) instead |
|
TeeSinkTokenFilter(TokenStream input)
Instantiates a new TeeSinkTokenFilter. |
|
TokenFilter(TokenStream input)
Construct a token stream filtering the given input. |
|
TokenStreamToDot(String inputText,
TokenStream in,
PrintWriter out)
If inputText is non-null, and the TokenStream has offsets, we include the surface form in each arc's label. |
|
TypeTokenFilter(boolean enablePositionIncrements,
TokenStream input,
Set<String> stopTypes)
|
|
TypeTokenFilter(boolean enablePositionIncrements,
TokenStream input,
Set<String> stopTypes,
boolean useWhiteList)
|
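Taken together, the classes above form the analysis pipeline: a Tokenizer consumes a Reader, each TokenFilter wraps another TokenStream, and ReusableAnalyzerBase.TokenStreamComponents pairs the source Tokenizer with the final sink. A minimal sketch of building and consuming such a chain, assuming Lucene 3.x (Version.LUCENE_36); the field name and sample text are illustrative:

```java
import java.io.Reader;
import java.io.StringReader;

import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class AnalysisChainSketch {
  // A custom Analyzer assembled from components listed above.
  static final Analyzer ANALYZER = new ReusableAnalyzerBase() {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
      Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_36, reader);
      TokenStream result = new LowerCaseFilter(Version.LUCENE_36, source);
      result = new StopFilter(Version.LUCENE_36, result, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
      return new TokenStreamComponents(source, result);
    }
  };

  public static void main(String[] args) throws Exception {
    TokenStream ts = ANALYZER.reusableTokenStream("body", new StringReader("The Quick Fox"));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    ts.reset();                            // prepare the stream for consumption
    while (ts.incrementToken()) {          // advance to the next token
      System.out.println(term.toString()); // prints "quick" then "fox"
    }
    ts.end();
    ts.close();
  }
}
```

The reset()/incrementToken()/end()/close() sequence is the consumption contract shared by every TokenStream in this listing.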
Uses of TokenStream in org.apache.lucene.analysis.ar |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.ar | |
---|---|
class |
ArabicLetterTokenizer
Deprecated. (3.1) Use StandardTokenizer instead. |
class |
ArabicNormalizationFilter
A TokenFilter that applies ArabicNormalizer to normalize the orthography. |
class |
ArabicStemFilter
A TokenFilter that applies ArabicStemmer to stem Arabic words. |
Constructors in org.apache.lucene.analysis.ar with parameters of type TokenStream | |
---|---|
ArabicNormalizationFilter(TokenStream input)
|
|
ArabicStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.bg |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.bg | |
---|---|
class |
BulgarianStemFilter
A TokenFilter that applies BulgarianStemmer to stem Bulgarian
words. |
Constructors in org.apache.lucene.analysis.bg with parameters of type TokenStream | |
---|---|
BulgarianStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.br |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.br | |
---|---|
class |
BrazilianStemFilter
A TokenFilter that applies BrazilianStemmer . |
Constructors in org.apache.lucene.analysis.br with parameters of type TokenStream | |
---|---|
BrazilianStemFilter(TokenStream in)
Creates a new BrazilianStemFilter |
|
BrazilianStemFilter(TokenStream in,
Set<?> exclusiontable)
Deprecated. use KeywordAttribute with KeywordMarkerFilter instead. |
Uses of TokenStream in org.apache.lucene.analysis.cjk |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.cjk | |
---|---|
class |
CJKBigramFilter
Forms bigrams of CJK terms that are generated from StandardTokenizer or ICUTokenizer. |
class |
CJKTokenizer
Deprecated. Use StandardTokenizer, CJKWidthFilter, CJKBigramFilter, and LowerCaseFilter instead. |
class |
CJKWidthFilter
A TokenFilter that normalizes CJK width differences:
Folds fullwidth ASCII variants into the equivalent basic latin
Folds halfwidth Katakana variants into the equivalent kana
|
Constructors in org.apache.lucene.analysis.cjk with parameters of type TokenStream | |
---|---|
CJKBigramFilter(TokenStream in)
Calls CJKBigramFilter(HAN | HIRAGANA | KATAKANA | HANGUL) |
|
CJKBigramFilter(TokenStream in,
int flags)
Create a new CJKBigramFilter, specifying which writing systems should be bigrammed. |
|
CJKWidthFilter(TokenStream input)
|
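The CJKTokenizer deprecation note above names its replacement chain; a sketch of that wiring, assuming Lucene 3.6 (reader is a caller-supplied java.io.Reader):

```java
// Replacement for the deprecated CJKTokenizer, per the note above:
// StandardTokenizer -> CJKWidthFilter -> LowerCaseFilter -> CJKBigramFilter.
TokenStream ts = new StandardTokenizer(Version.LUCENE_36, reader);
ts = new CJKWidthFilter(ts);                  // fold fullwidth ASCII / halfwidth Katakana
ts = new LowerCaseFilter(Version.LUCENE_36, ts);
ts = new CJKBigramFilter(ts);                 // bigrams of HAN | HIRAGANA | KATAKANA | HANGUL
```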
Uses of TokenStream in org.apache.lucene.analysis.cn |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.cn | |
---|---|
class |
ChineseFilter
Deprecated. Use StopFilter instead, which has the same functionality.
This filter will be removed in Lucene 5.0 |
class |
ChineseTokenizer
Deprecated. Use StandardTokenizer instead, which has the same functionality.
This filter will be removed in Lucene 5.0 |
Constructors in org.apache.lucene.analysis.cn with parameters of type TokenStream | |
---|---|
ChineseFilter(TokenStream in)
Deprecated. |
Uses of TokenStream in org.apache.lucene.analysis.cn.smart |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.cn.smart | |
---|---|
class |
SentenceTokenizer
Tokenizes input text into sentences. |
class |
WordTokenFilter
A TokenFilter that breaks sentences into words. |
Methods in org.apache.lucene.analysis.cn.smart that return TokenStream | |
---|---|
TokenStream |
SmartChineseAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
SmartChineseAnalyzer.tokenStream(String fieldName,
Reader reader)
|
Constructors in org.apache.lucene.analysis.cn.smart with parameters of type TokenStream | |
---|---|
WordTokenFilter(TokenStream in)
Constructs a new WordTokenFilter. |
Uses of TokenStream in org.apache.lucene.analysis.compound |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.compound | |
---|---|
class |
CompoundWordTokenFilterBase
Base class for decomposition token filters. |
class |
DictionaryCompoundWordTokenFilter
A TokenFilter that decomposes compound words found in many Germanic languages. |
class |
HyphenationCompoundWordTokenFilter
A TokenFilter that decomposes compound words found in many Germanic languages. |
Uses of TokenStream in org.apache.lucene.analysis.cz |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.cz | |
---|---|
class |
CzechStemFilter
A TokenFilter that applies CzechStemmer to stem Czech words. |
Constructors in org.apache.lucene.analysis.cz with parameters of type TokenStream | |
---|---|
CzechStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.de |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.de | |
---|---|
class |
GermanLightStemFilter
A TokenFilter that applies GermanLightStemmer to stem German
words. |
class |
GermanMinimalStemFilter
A TokenFilter that applies GermanMinimalStemmer to stem German
words. |
class |
GermanNormalizationFilter
Normalizes German characters according to the heuristics of the German2 snowball algorithm. |
class |
GermanStemFilter
A TokenFilter that stems German words. |
Constructors in org.apache.lucene.analysis.de with parameters of type TokenStream | |
---|---|
GermanLightStemFilter(TokenStream input)
|
|
GermanMinimalStemFilter(TokenStream input)
|
|
GermanNormalizationFilter(TokenStream input)
|
|
GermanStemFilter(TokenStream in)
Creates a GermanStemFilter instance |
|
GermanStemFilter(TokenStream in,
Set<?> exclusionSet)
Deprecated. use KeywordAttribute with KeywordMarkerFilter instead. |
Uses of TokenStream in org.apache.lucene.analysis.el |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.el | |
---|---|
class |
GreekLowerCaseFilter
Normalizes token text to lower case, removes some Greek diacritics, and standardizes final sigma to sigma. |
class |
GreekStemFilter
A TokenFilter that applies GreekStemmer to stem Greek
words. |
Constructors in org.apache.lucene.analysis.el with parameters of type TokenStream | |
---|---|
GreekLowerCaseFilter(TokenStream in)
Deprecated. Use GreekLowerCaseFilter.GreekLowerCaseFilter(Version, TokenStream) instead. |
|
GreekLowerCaseFilter(Version matchVersion,
TokenStream in)
Create a GreekLowerCaseFilter that normalizes Greek token text. |
|
GreekStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.en |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.en | |
---|---|
class |
EnglishMinimalStemFilter
A TokenFilter that applies EnglishMinimalStemmer to stem
English words. |
class |
EnglishPossessiveFilter
TokenFilter that removes possessives (trailing 's) from words. |
class |
KStemFilter
A high-performance kstem filter for English. |
Constructors in org.apache.lucene.analysis.en with parameters of type TokenStream | |
---|---|
EnglishMinimalStemFilter(TokenStream input)
|
|
EnglishPossessiveFilter(TokenStream input)
Deprecated. Use EnglishPossessiveFilter.EnglishPossessiveFilter(Version, TokenStream) instead. |
|
EnglishPossessiveFilter(Version version,
TokenStream input)
|
|
KStemFilter(TokenStream in)
|
Uses of TokenStream in org.apache.lucene.analysis.es |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.es | |
---|---|
class |
SpanishLightStemFilter
A TokenFilter that applies SpanishLightStemmer to stem Spanish
words. |
Constructors in org.apache.lucene.analysis.es with parameters of type TokenStream | |
---|---|
SpanishLightStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.fa |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.fa | |
---|---|
class |
PersianNormalizationFilter
A TokenFilter that applies PersianNormalizer to normalize the
orthography. |
Constructors in org.apache.lucene.analysis.fa with parameters of type TokenStream | |
---|---|
PersianNormalizationFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.fi |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.fi | |
---|---|
class |
FinnishLightStemFilter
A TokenFilter that applies FinnishLightStemmer to stem Finnish
words. |
Constructors in org.apache.lucene.analysis.fi with parameters of type TokenStream | |
---|---|
FinnishLightStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.fr |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.fr | |
---|---|
class |
ElisionFilter
Removes elisions from a TokenStream . |
class |
FrenchLightStemFilter
A TokenFilter that applies FrenchLightStemmer to stem French
words. |
class |
FrenchMinimalStemFilter
A TokenFilter that applies FrenchMinimalStemmer to stem French
words. |
class |
FrenchStemFilter
Deprecated. Use SnowballFilter with
FrenchStemmer instead, which has the
same functionality. This filter will be removed in Lucene 5.0 |
Constructors in org.apache.lucene.analysis.fr with parameters of type TokenStream | |
---|---|
ElisionFilter(TokenStream input)
Deprecated. use ElisionFilter.ElisionFilter(Version, TokenStream) instead |
|
ElisionFilter(TokenStream input,
Set<?> articles)
Deprecated. use ElisionFilter.ElisionFilter(Version, TokenStream, Set) instead |
|
ElisionFilter(TokenStream input,
String[] articles)
Deprecated. use ElisionFilter.ElisionFilter(Version, TokenStream, Set) instead |
|
ElisionFilter(Version matchVersion,
TokenStream input)
Constructs an elision filter with standard stop words |
|
ElisionFilter(Version matchVersion,
TokenStream input,
Set<?> articles)
Constructs an elision filter with a Set of stop words |
|
FrenchLightStemFilter(TokenStream input)
|
|
FrenchMinimalStemFilter(TokenStream input)
|
|
FrenchStemFilter(TokenStream in)
Deprecated. |
|
FrenchStemFilter(TokenStream in,
Set<?> exclusiontable)
Deprecated. use KeywordAttribute with KeywordMarkerFilter instead. |
Uses of TokenStream in org.apache.lucene.analysis.ga |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.ga | |
---|---|
class |
IrishLowerCaseFilter
Normalises token text to lower case, handling t-prothesis and n-eclipsis (i.e., that 'nAthair' should become 'n-athair') |
Constructors in org.apache.lucene.analysis.ga with parameters of type TokenStream | |
---|---|
IrishLowerCaseFilter(TokenStream in)
Create an IrishLowerCaseFilter that normalises Irish token text. |
Uses of TokenStream in org.apache.lucene.analysis.gl |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.gl | |
---|---|
class |
GalicianMinimalStemFilter
A TokenFilter that applies GalicianMinimalStemmer to stem
Galician words. |
class |
GalicianStemFilter
A TokenFilter that applies GalicianStemmer to stem
Galician words. |
Constructors in org.apache.lucene.analysis.gl with parameters of type TokenStream | |
---|---|
GalicianMinimalStemFilter(TokenStream input)
|
|
GalicianStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.hi |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.hi | |
---|---|
class |
HindiNormalizationFilter
A TokenFilter that applies HindiNormalizer to normalize the
orthography. |
class |
HindiStemFilter
A TokenFilter that applies HindiStemmer to stem Hindi words. |
Constructors in org.apache.lucene.analysis.hi with parameters of type TokenStream | |
---|---|
HindiNormalizationFilter(TokenStream input)
|
|
HindiStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.hu |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.hu | |
---|---|
class |
HungarianLightStemFilter
A TokenFilter that applies HungarianLightStemmer to stem
Hungarian words. |
Constructors in org.apache.lucene.analysis.hu with parameters of type TokenStream | |
---|---|
HungarianLightStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.hunspell |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.hunspell | |
---|---|
class |
HunspellStemFilter
TokenFilter that uses hunspell affix rules and words to stem tokens. |
Constructors in org.apache.lucene.analysis.hunspell with parameters of type TokenStream | |
---|---|
HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary)
Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided HunspellDictionary |
|
HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary,
boolean dedup)
Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided HunspellDictionary |
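A sketch of the Hunspell chain, assuming Lucene 3.x; the .aff/.dic file paths are placeholders, and the stream-based HunspellDictionary constructor used here is an assumption of this sketch:

```java
// Load Hunspell affix and dictionary files, then stem tokens with them.
InputStream affix = new FileInputStream("en_US.aff");   // placeholder path
InputStream words = new FileInputStream("en_US.dic");   // placeholder path
HunspellDictionary dictionary = new HunspellDictionary(affix, words, Version.LUCENE_36);

TokenStream ts = new StandardTokenizer(Version.LUCENE_36, reader);
ts = new LowerCaseFilter(Version.LUCENE_36, ts);
ts = new HunspellStemFilter(ts, dictionary, true);      // true = dedup duplicate stems
```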
Uses of TokenStream in org.apache.lucene.analysis.icu |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.icu | |
---|---|
class |
ICUFoldingFilter
A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings. |
class |
ICUNormalizer2Filter
Normalize token text with ICU's Normalizer2 |
class |
ICUTransformFilter
A TokenFilter that transforms text with ICU. |
Constructors in org.apache.lucene.analysis.icu with parameters of type TokenStream | |
---|---|
ICUFoldingFilter(TokenStream input)
Create a new ICUFoldingFilter on the specified input |
|
ICUNormalizer2Filter(TokenStream input)
Create a new Normalizer2Filter that combines NFKC normalization, Case Folding, and removes Default Ignorables (NFKC_Casefold) |
|
ICUNormalizer2Filter(TokenStream input,
com.ibm.icu.text.Normalizer2 normalizer)
Create a new Normalizer2Filter with the specified Normalizer2 |
|
ICUTransformFilter(TokenStream input,
com.ibm.icu.text.Transliterator transform)
Create a new ICUTransformFilter that transforms text on the given stream. |
Uses of TokenStream in org.apache.lucene.analysis.icu.segmentation |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.icu.segmentation | |
---|---|
class |
ICUTokenizer
Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/) |
Uses of TokenStream in org.apache.lucene.analysis.id |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.id | |
---|---|
class |
IndonesianStemFilter
A TokenFilter that applies IndonesianStemmer to stem Indonesian words. |
Constructors in org.apache.lucene.analysis.id with parameters of type TokenStream | |
---|---|
IndonesianStemFilter(TokenStream input)
Calls IndonesianStemFilter(input, true) |
|
IndonesianStemFilter(TokenStream input,
boolean stemDerivational)
Create a new IndonesianStemFilter. |
Uses of TokenStream in org.apache.lucene.analysis.in |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.in | |
---|---|
class |
IndicNormalizationFilter
A TokenFilter that applies IndicNormalizer to normalize text
in Indian Languages. |
class |
IndicTokenizer
Deprecated. (3.6) Use StandardTokenizer instead. |
Constructors in org.apache.lucene.analysis.in with parameters of type TokenStream | |
---|---|
IndicNormalizationFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.it |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.it | |
---|---|
class |
ItalianLightStemFilter
A TokenFilter that applies ItalianLightStemmer to stem Italian
words. |
Constructors in org.apache.lucene.analysis.it with parameters of type TokenStream | |
---|---|
ItalianLightStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.ja |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.ja | |
---|---|
class |
JapaneseBaseFormFilter
Replaces term text with the BaseFormAttribute . |
class |
JapaneseKatakanaStemFilter
A TokenFilter that normalizes common katakana spelling variations
ending in a long sound character by removing this character (U+30FC). |
class |
JapanesePartOfSpeechStopFilter
Removes tokens that match a set of part-of-speech tags. |
class |
JapaneseReadingFormFilter
A TokenFilter that replaces the term
attribute with the reading of a token in either katakana or romaji form. |
class |
JapaneseTokenizer
Tokenizer for Japanese that uses morphological analysis. |
Constructors in org.apache.lucene.analysis.ja with parameters of type TokenStream | |
---|---|
JapaneseBaseFormFilter(TokenStream input)
|
|
JapaneseKatakanaStemFilter(TokenStream input)
|
|
JapaneseKatakanaStemFilter(TokenStream input,
int minimumLength)
|
|
JapanesePartOfSpeechStopFilter(boolean enablePositionIncrements,
TokenStream input,
Set<String> stopTags)
|
|
JapaneseReadingFormFilter(TokenStream input)
|
|
JapaneseReadingFormFilter(TokenStream input,
boolean useRomaji)
|
Uses of TokenStream in org.apache.lucene.analysis.lv |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.lv | |
---|---|
class |
LatvianStemFilter
A TokenFilter that applies LatvianStemmer to stem Latvian
words. |
Constructors in org.apache.lucene.analysis.lv with parameters of type TokenStream | |
---|---|
LatvianStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.miscellaneous |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.miscellaneous | |
---|---|
class |
EmptyTokenStream
An always exhausted token stream. |
class |
PrefixAndSuffixAwareTokenFilter
Links two PrefixAwareTokenFilter . |
class |
PrefixAwareTokenFilter
Joins two token streams and leaves the last token of the first stream available to be used when updating the token values in the second stream based on that token. |
class |
SingleTokenTokenStream
A TokenStream containing a single token. |
class |
StemmerOverrideFilter
Provides the ability to override any KeywordAttribute aware stemmer
with custom dictionary-based stemming. |
Methods in org.apache.lucene.analysis.miscellaneous that return TokenStream | |
---|---|
TokenStream |
PrefixAwareTokenFilter.getPrefix()
|
TokenStream |
PrefixAwareTokenFilter.getSuffix()
|
Methods in org.apache.lucene.analysis.miscellaneous with parameters of type TokenStream | |
---|---|
void |
PrefixAwareTokenFilter.setPrefix(TokenStream prefix)
|
void |
PrefixAwareTokenFilter.setSuffix(TokenStream suffix)
|
Constructors in org.apache.lucene.analysis.miscellaneous with parameters of type TokenStream | |
---|---|
PrefixAndSuffixAwareTokenFilter(TokenStream prefix,
TokenStream input,
TokenStream suffix)
|
|
PrefixAwareTokenFilter(TokenStream prefix,
TokenStream suffix)
|
|
StemmerOverrideFilter(Version matchVersion,
TokenStream input,
Map<?,String> dictionary)
Create a new StemmerOverrideFilter, performing dictionary-based stemming with the provided dictionary . |
Uses of TokenStream in org.apache.lucene.analysis.ngram |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.ngram | |
---|---|
class |
EdgeNGramTokenFilter
Tokenizes the given token into n-grams of given size(s). |
class |
EdgeNGramTokenizer
Tokenizes the input from an edge into n-grams of given size(s). |
class |
NGramTokenFilter
Tokenizes the input into n-grams of the given size(s). |
class |
NGramTokenizer
Tokenizes the input into n-grams of the given size(s). |
Constructors in org.apache.lucene.analysis.ngram with parameters of type TokenStream | |
---|---|
EdgeNGramTokenFilter(TokenStream input,
EdgeNGramTokenFilter.Side side,
int minGram,
int maxGram)
Creates EdgeNGramTokenFilter that can generate n-grams in the sizes of the given range |
|
EdgeNGramTokenFilter(TokenStream input,
String sideLabel,
int minGram,
int maxGram)
Creates EdgeNGramTokenFilter that can generate n-grams in the sizes of the given range |
|
NGramTokenFilter(TokenStream input)
Creates NGramTokenFilter with default min and max n-grams. |
|
NGramTokenFilter(TokenStream input,
int minGram,
int maxGram)
Creates NGramTokenFilter with given min and max n-grams. |
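For illustration, a sketch of both filters, assuming Lucene 3.x (reader and reader2 are caller-supplied Readers):

```java
// Interior n-grams: "fox" with min=2, max=3 yields fo, ox, fox.
TokenStream grams = new NGramTokenFilter(
    new WhitespaceTokenizer(Version.LUCENE_36, reader), 2, 3);

// Edge n-grams anchored at the front: "fox" yields fo, fox.
TokenStream edges = new EdgeNGramTokenFilter(
    new WhitespaceTokenizer(Version.LUCENE_36, reader2),
    EdgeNGramTokenFilter.Side.FRONT, 2, 3);
```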
Uses of TokenStream in org.apache.lucene.analysis.nl |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.nl | |
---|---|
class |
DutchStemFilter
Deprecated. Use SnowballFilter with
DutchStemmer instead, which has the
same functionality. This filter will be removed in Lucene 5.0 |
Constructors in org.apache.lucene.analysis.nl with parameters of type TokenStream | |
---|---|
DutchStemFilter(TokenStream _in)
Deprecated. |
|
DutchStemFilter(TokenStream _in,
Map<?,?> stemdictionary)
Deprecated. |
|
DutchStemFilter(TokenStream _in,
Set<?> exclusiontable)
Deprecated. use KeywordAttribute with KeywordMarkerFilter instead. |
|
DutchStemFilter(TokenStream _in,
Set<?> exclusiontable,
Map<?,?> stemdictionary)
Deprecated. use KeywordAttribute with KeywordMarkerFilter instead. |
Uses of TokenStream in org.apache.lucene.analysis.no |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.no | |
---|---|
class |
NorwegianLightStemFilter
A TokenFilter that applies NorwegianLightStemmer to stem Norwegian
words. |
class |
NorwegianMinimalStemFilter
A TokenFilter that applies NorwegianMinimalStemmer to stem Norwegian
words. |
Constructors in org.apache.lucene.analysis.no with parameters of type TokenStream | |
---|---|
NorwegianLightStemFilter(TokenStream input)
|
|
NorwegianMinimalStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.path |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.path | |
---|---|
class |
PathHierarchyTokenizer
Tokenizer for path-like hierarchies. |
class |
ReversePathHierarchyTokenizer
Tokenizer for domain-like hierarchies. |
Uses of TokenStream in org.apache.lucene.analysis.payloads |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.payloads | |
---|---|
class |
DelimitedPayloadTokenFilter
Characters before the delimiter are the "token", those after are the payload. |
class |
NumericPayloadTokenFilter
Assigns a payload to a token based on the Token.type() |
class |
TokenOffsetPayloadTokenFilter
Adds each token's start and end offsets (see Token.setStartOffset(int) and Token.setEndOffset(int)) as its payload; the first 4 bytes are the start offset and the last 4 the end offset. |
class |
TypeAsPayloadTokenFilter
Makes the Token.type() a payload. |
Constructors in org.apache.lucene.analysis.payloads with parameters of type TokenStream | |
---|---|
DelimitedPayloadTokenFilter(TokenStream input,
char delimiter,
PayloadEncoder encoder)
|
|
NumericPayloadTokenFilter(TokenStream input,
float payload,
String typeMatch)
|
|
TokenOffsetPayloadTokenFilter(TokenStream input)
|
|
TypeAsPayloadTokenFilter(TokenStream input)
|
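A sketch of attaching per-token payloads with DelimitedPayloadTokenFilter, assuming Lucene 3.x; the input text and the '|' delimiter are illustrative:

```java
// Tokens of the form "term|weight": text before '|' becomes the term,
// the float after it is encoded into the token's payload.
TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_36,
    new StringReader("jakarta|2.5 apache|1.0"));
ts = new DelimitedPayloadTokenFilter(ts, '|', new FloatEncoder());
```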
Uses of TokenStream in org.apache.lucene.analysis.phonetic |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.phonetic | |
---|---|
class |
BeiderMorseFilter
TokenFilter for Beider-Morse phonetic encoding. |
class |
DoubleMetaphoneFilter
Filter for DoubleMetaphone (supporting secondary codes) |
class |
PhoneticFilter
Create tokens for phonetic matches. |
Constructors in org.apache.lucene.analysis.phonetic with parameters of type TokenStream | |
---|---|
BeiderMorseFilter(TokenStream input,
org.apache.commons.codec.language.bm.PhoneticEngine engine)
Calls BeiderMorseFilter(input, engine, null) |
|
BeiderMorseFilter(TokenStream input,
org.apache.commons.codec.language.bm.PhoneticEngine engine,
org.apache.commons.codec.language.bm.Languages.LanguageSet languages)
Create a new BeiderMorseFilter |
|
DoubleMetaphoneFilter(TokenStream input,
int maxCodeLength,
boolean inject)
|
|
PhoneticFilter(TokenStream in,
org.apache.commons.codec.Encoder encoder,
boolean inject)
|
Uses of TokenStream in org.apache.lucene.analysis.position |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.position | |
---|---|
class |
PositionFilter
Set the positionIncrement of all tokens to the "positionIncrement", except the first return token which retains its original positionIncrement value. |
Constructors in org.apache.lucene.analysis.position with parameters of type TokenStream | |
---|---|
PositionFilter(TokenStream input)
Constructs a PositionFilter that assigns a position increment of zero to all but the first token from the given input stream. |
|
PositionFilter(TokenStream input,
int positionIncrement)
Constructs a PositionFilter that assigns the given position increment to all but the first token from the given input stream. |
Uses of TokenStream in org.apache.lucene.analysis.pt |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.pt | |
---|---|
class |
PortugueseLightStemFilter
A TokenFilter that applies PortugueseLightStemmer to stem
Portuguese words. |
class |
PortugueseMinimalStemFilter
A TokenFilter that applies PortugueseMinimalStemmer to stem
Portuguese words. |
class |
PortugueseStemFilter
A TokenFilter that applies PortugueseStemmer to stem
Portuguese words. |
Constructors in org.apache.lucene.analysis.pt with parameters of type TokenStream | |
---|---|
PortugueseLightStemFilter(TokenStream input)
|
|
PortugueseMinimalStemFilter(TokenStream input)
|
|
PortugueseStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.query |
---|
Methods in org.apache.lucene.analysis.query that return TokenStream | |
---|---|
TokenStream |
QueryAutoStopWordAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
QueryAutoStopWordAnalyzer.tokenStream(String fieldName,
Reader reader)
|
Uses of TokenStream in org.apache.lucene.analysis.reverse |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.reverse | |
---|---|
class |
ReverseStringFilter
Reverse token string, for example "country" => "yrtnuoc". |
Constructors in org.apache.lucene.analysis.reverse with parameters of type TokenStream | |
---|---|
ReverseStringFilter(TokenStream in)
Deprecated. use ReverseStringFilter.ReverseStringFilter(Version, TokenStream)
instead. This constructor will be removed in Lucene 4.0 |
|
ReverseStringFilter(TokenStream in,
char marker)
Deprecated. use ReverseStringFilter.ReverseStringFilter(Version, TokenStream, char)
instead. This constructor will be removed in Lucene 4.0 |
|
ReverseStringFilter(Version matchVersion,
TokenStream in)
Create a new ReverseStringFilter that reverses all tokens in the supplied TokenStream . |
|
ReverseStringFilter(Version matchVersion,
TokenStream in,
char marker)
Create a new ReverseStringFilter that reverses and marks all tokens in the supplied TokenStream . |
Uses of TokenStream in org.apache.lucene.analysis.ru |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.ru | |
---|---|
class |
RussianLetterTokenizer
Deprecated. Use StandardTokenizer instead, which has the same functionality.
This filter will be removed in Lucene 5.0 |
class |
RussianLightStemFilter
A TokenFilter that applies RussianLightStemmer to stem Russian
words. |
class |
RussianLowerCaseFilter
Deprecated. Use LowerCaseFilter instead, which has the same
functionality. This filter will be removed in Lucene 4.0 |
class |
RussianStemFilter
Deprecated. Use SnowballFilter with
RussianStemmer instead, which has the
same functionality. This filter will be removed in Lucene 4.0 |
Constructors in org.apache.lucene.analysis.ru with parameters of type TokenStream | |
---|---|
RussianLightStemFilter(TokenStream input)
|
|
RussianLowerCaseFilter(TokenStream in)
Deprecated. |
|
RussianStemFilter(TokenStream in)
Deprecated. |
Uses of TokenStream in org.apache.lucene.analysis.shingle |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.shingle | |
---|---|
class |
ShingleFilter
A ShingleFilter constructs shingles (token n-grams) from a token stream. |
class |
ShingleMatrixFilter
Deprecated. Will be removed in Lucene 4.0. This filter is unmaintained and might not behave correctly if used with custom Attributes, i.e. Attributes other than the ones located in org.apache.lucene.analysis.tokenattributes . It also uses
hardcoded payload encoders which makes it not easily adaptable to other use-cases. |
Methods in org.apache.lucene.analysis.shingle that return TokenStream | |
---|---|
TokenStream |
ShingleAnalyzerWrapper.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
ShingleAnalyzerWrapper.tokenStream(String fieldName,
Reader reader)
|
Constructors in org.apache.lucene.analysis.shingle with parameters of type TokenStream | |
---|---|
ShingleFilter(TokenStream input)
Construct a ShingleFilter with default shingle size: 2. |
|
ShingleFilter(TokenStream input,
int maxShingleSize)
Constructs a ShingleFilter with the specified shingle size from the TokenStream input |
|
ShingleFilter(TokenStream input,
int minShingleSize,
int maxShingleSize)
Constructs a ShingleFilter with the specified shingle size from the TokenStream input |
|
ShingleFilter(TokenStream input,
String tokenType)
Construct a ShingleFilter with the specified token type for shingle tokens and the default shingle size: 2 |
|
ShingleMatrixFilter(TokenStream input,
int minimumShingleSize,
int maximumShingleSize)
Deprecated. Creates a shingle filter using default settings. |
|
ShingleMatrixFilter(TokenStream input,
int minimumShingleSize,
int maximumShingleSize,
Character spacerCharacter)
Deprecated. Creates a shingle filter using default settings. |
|
ShingleMatrixFilter(TokenStream input,
int minimumShingleSize,
int maximumShingleSize,
Character spacerCharacter,
boolean ignoringSinglePrefixOrSuffixShingle)
Deprecated. Creates a shingle filter using the default ShingleMatrixFilter.TokenSettingsCodec . |
|
ShingleMatrixFilter(TokenStream input,
int minimumShingleSize,
int maximumShingleSize,
Character spacerCharacter,
boolean ignoringSinglePrefixOrSuffixShingle,
ShingleMatrixFilter.TokenSettingsCodec settingsCodec)
Deprecated. Creates a shingle filter with ad hoc parameter settings. |
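A sketch of the non-deprecated ShingleFilter, assuming Lucene 3.x:

```java
// With unigram output (the default), "please divide this" yields:
// please, "please divide", divide, "divide this", this
TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_36,
    new StringReader("please divide this"));
ts = new ShingleFilter(ts, 2, 2);   // min and max shingle size of 2
```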
Uses of TokenStream in org.apache.lucene.analysis.snowball |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.snowball | |
---|---|
class |
SnowballFilter
A filter that stems words using a Snowball-generated stemmer. |
Methods in org.apache.lucene.analysis.snowball that return TokenStream | |
---|---|
TokenStream |
SnowballAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
Deprecated. Returns a (possibly reused) StandardTokenizer filtered by a
StandardFilter , a LowerCaseFilter ,
a StopFilter , and a SnowballFilter |
TokenStream |
SnowballAnalyzer.tokenStream(String fieldName,
Reader reader)
Deprecated. Constructs a StandardTokenizer filtered by a StandardFilter , a LowerCaseFilter , a StopFilter ,
and a SnowballFilter |
Constructors in org.apache.lucene.analysis.snowball with parameters of type TokenStream | |
---|---|
SnowballFilter(TokenStream input,
SnowballProgram stemmer)
|
|
SnowballFilter(TokenStream in,
String name)
Construct the named stemming filter. |
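A sketch of the named-stemmer constructor, assuming Lucene 3.x (reader is a caller-supplied Reader):

```java
// Stem English tokens with the Snowball-generated "English" stemmer.
TokenStream ts = new StandardTokenizer(Version.LUCENE_36, reader);
ts = new LowerCaseFilter(Version.LUCENE_36, ts);    // Snowball stemmers expect lowercase input
ts = new SnowballFilter(ts, "English");             // named stemmer, per the constructor above
```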
Uses of TokenStream in org.apache.lucene.analysis.standard |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.standard | |
---|---|
class |
ClassicFilter
Normalizes tokens extracted with ClassicTokenizer . |
class |
ClassicTokenizer
A grammar-based tokenizer constructed with JFlex |
class |
StandardFilter
Normalizes tokens extracted with StandardTokenizer . |
class |
StandardTokenizer
A grammar-based tokenizer constructed with JFlex. |
class |
UAX29URLEmailTokenizer
This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. |
Constructors in org.apache.lucene.analysis.standard with parameters of type TokenStream | |
---|---|
ClassicFilter(TokenStream in)
Constructs a ClassicFilter filtering the given input. |
|
StandardFilter(TokenStream in)
Deprecated. Use StandardFilter.StandardFilter(Version, TokenStream) instead. |
|
StandardFilter(Version matchVersion,
TokenStream in)
|
Uses of TokenStream in org.apache.lucene.analysis.stempel |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.stempel | |
---|---|
class |
StempelFilter
Transforms the token stream as per the stemming algorithm. |
Constructors in org.apache.lucene.analysis.stempel with parameters of type TokenStream | |
---|---|
StempelFilter(TokenStream in,
StempelStemmer stemmer)
Create filter using the supplied stemming table. |
|
StempelFilter(TokenStream in,
StempelStemmer stemmer,
int minLength)
Create filter using the supplied stemming table. |
Uses of TokenStream in org.apache.lucene.analysis.sv |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.sv | |
---|---|
class |
SwedishLightStemFilter
A TokenFilter that applies SwedishLightStemmer to stem Swedish
words. |
Constructors in org.apache.lucene.analysis.sv with parameters of type TokenStream | |
---|---|
SwedishLightStemFilter(TokenStream input)
|
Uses of TokenStream in org.apache.lucene.analysis.synonym |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.synonym | |
---|---|
class |
SynonymFilter
Matches single or multi word synonyms in a token stream. |
Constructors in org.apache.lucene.analysis.synonym with parameters of type TokenStream | |
---|---|
SynonymFilter(TokenStream input,
SynonymMap synonyms,
boolean ignoreCase)
|
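A sketch of building a SynonymMap and applying the filter, assuming the FST-based SynonymMap.Builder API shipped alongside this SynonymFilter (reader is a caller-supplied Reader):

```java
// Map "sofa" to "couch"; includeOrig=true keeps the original token too.
SynonymMap.Builder builder = new SynonymMap.Builder(true);   // true = dedup entries
builder.add(new CharsRef("sofa"), new CharsRef("couch"), true);
SynonymMap map = builder.build();

TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_36, reader);
ts = new SynonymFilter(ts, map, true);                       // ignoreCase = true
```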
Uses of TokenStream in org.apache.lucene.analysis.th |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.th | |
---|---|
class |
ThaiWordFilter
TokenFilter that use BreakIterator to break each
Token that is Thai into separate Token(s) for each Thai word. |
Constructors in org.apache.lucene.analysis.th with parameters of type TokenStream | |
---|---|
ThaiWordFilter(TokenStream input)
Deprecated. Use the ctor with matchVersion instead! |
|
ThaiWordFilter(Version matchVersion,
TokenStream input)
Creates a new ThaiWordFilter with the specified match version. |
Uses of TokenStream in org.apache.lucene.analysis.tr |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.tr | |
---|---|
class |
TurkishLowerCaseFilter
Normalizes Turkish token text to lower case. |
Constructors in org.apache.lucene.analysis.tr with parameters of type TokenStream | |
---|---|
TurkishLowerCaseFilter(TokenStream in)
Creates a new TurkishLowerCaseFilter that normalizes Turkish token text to lower case. |
Uses of TokenStream in org.apache.lucene.analysis.wikipedia |
---|
Subclasses of TokenStream in org.apache.lucene.analysis.wikipedia | |
---|---|
class |
WikipediaTokenizer
Extension of StandardTokenizer that is aware of Wikipedia syntax. |
Uses of TokenStream in org.apache.lucene.collation |
---|
Subclasses of TokenStream in org.apache.lucene.collation | |
---|---|
class |
CollationKeyFilter
Converts each token into its CollationKey , and then
encodes the CollationKey with IndexableBinaryStringTools , to allow
it to be stored as an index term. |
class |
ICUCollationKeyFilter
Converts each token into its CollationKey , and
then encodes the CollationKey with IndexableBinaryStringTools , to
allow it to be stored as an index term. |
Methods in org.apache.lucene.collation that return TokenStream | |
---|---|
TokenStream |
ICUCollationKeyAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
CollationKeyAnalyzer.reusableTokenStream(String fieldName,
Reader reader)
|
TokenStream |
ICUCollationKeyAnalyzer.tokenStream(String fieldName,
Reader reader)
|
TokenStream |
CollationKeyAnalyzer.tokenStream(String fieldName,
Reader reader)
|
Constructors in org.apache.lucene.collation with parameters of type TokenStream | |
---|---|
CollationKeyFilter(TokenStream input,
Collator collator)
|
|
ICUCollationKeyFilter(TokenStream input,
com.ibm.icu.text.Collator collator)
|
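A sketch of locale-sensitive collation keys, assuming Lucene 3.x; the Danish locale is illustrative:

```java
// Each term becomes its binary CollationKey, encoded so it can be stored
// as an index term (useful for locale-correct sorting and range queries).
Collator collator = Collator.getInstance(new Locale("da", "DK"));
TokenStream ts = new KeywordTokenizer(reader);   // emit the whole input as one token
ts = new CollationKeyFilter(ts, collator);
```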
Uses of TokenStream in org.apache.lucene.document |
---|
Fields in org.apache.lucene.document declared as TokenStream | |
---|---|
protected TokenStream |
AbstractField.tokenStream
|
Methods in org.apache.lucene.document that return TokenStream | |
---|---|
TokenStream |
NumericField.tokenStreamValue()
Returns a NumericTokenStream for indexing the numeric value. |
TokenStream |
Field.tokenStreamValue()
The TokenStream for this field to be used when indexing, or null. |
TokenStream |
Fieldable.tokenStreamValue()
The TokenStream for this field to be used when indexing, or null. |
Methods in org.apache.lucene.document with parameters of type TokenStream | |
---|---|
void |
Field.setTokenStream(TokenStream tokenStream)
Expert: sets the token stream to be used for indexing and causes isIndexed() and isTokenized() to return true. |
Constructors in org.apache.lucene.document with parameters of type TokenStream | |
---|---|
Field(String name,
TokenStream tokenStream)
Create a tokenized and indexed field that is not stored. |
|
Field(String name,
TokenStream tokenStream,
Field.TermVector termVector)
Create a tokenized and indexed field that is not stored, optionally with storing term vectors. |
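A sketch of indexing a field from a pre-built TokenStream, assuming Lucene 3.x; analyzer and text are placeholders:

```java
// Create a tokenized, indexed, unstored field directly from a TokenStream.
TokenStream ts = analyzer.tokenStream("body", new StringReader(text));
Document doc = new Document();
doc.add(new Field("body", ts));
// Or additionally store term vectors:
// doc.add(new Field("body", ts, Field.TermVector.WITH_POSITIONS_OFFSETS));
```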
Uses of TokenStream in org.apache.lucene.facet.enhancements |
---|
Subclasses of TokenStream in org.apache.lucene.facet.enhancements | |
---|---|
class |
EnhancementsCategoryTokenizer
A tokenizer which adds to each category token payload according to the CategoryEnhancement s defined in the given
EnhancementsIndexingParams . |
Methods in org.apache.lucene.facet.enhancements that return TokenStream | |
---|---|
protected TokenStream |
EnhancementsDocumentBuilder.getParentsStream(CategoryAttributesStream categoryAttributesStream)
|
Methods in org.apache.lucene.facet.enhancements with parameters of type TokenStream | |
---|---|
protected CategoryListTokenizer |
EnhancementsDocumentBuilder.getCategoryListTokenizer(TokenStream categoryStream)
|
CategoryListTokenizer |
CategoryEnhancement.getCategoryListTokenizer(TokenStream tokenizer,
EnhancementsIndexingParams indexingParams,
TaxonomyWriter taxonomyWriter)
Get the CategoryListTokenizer which generates the category list for
this enhancement. |
protected CategoryTokenizer |
EnhancementsDocumentBuilder.getCategoryTokenizer(TokenStream categoryStream)
|
Constructors in org.apache.lucene.facet.enhancements with parameters of type TokenStream | |
---|---|
EnhancementsCategoryTokenizer(TokenStream input,
EnhancementsIndexingParams indexingParams)
Constructor. |
Uses of TokenStream in org.apache.lucene.facet.enhancements.association |
---|
Subclasses of TokenStream in org.apache.lucene.facet.enhancements.association | |
---|---|
class |
AssociationListTokenizer
Tokenizer for associations of a category |
Methods in org.apache.lucene.facet.enhancements.association with parameters of type TokenStream | |
---|---|
CategoryListTokenizer |
AssociationEnhancement.getCategoryListTokenizer(TokenStream tokenizer,
EnhancementsIndexingParams indexingParams,
TaxonomyWriter taxonomyWriter)
|
Constructors in org.apache.lucene.facet.enhancements.association with parameters of type TokenStream | |
---|---|
AssociationListTokenizer(TokenStream input,
EnhancementsIndexingParams indexingParams,
CategoryEnhancement enhancement)
|
Uses of TokenStream in org.apache.lucene.facet.index |
---|
Methods in org.apache.lucene.facet.index that return TokenStream | |
---|---|
protected TokenStream |
CategoryDocumentBuilder.getParentsStream(CategoryAttributesStream categoryAttributesStream)
Get a stream of categories which includes the parents, according to policies defined in indexing parameters. |
Methods in org.apache.lucene.facet.index with parameters of type TokenStream | |
---|---|
protected CategoryListTokenizer |
CategoryDocumentBuilder.getCategoryListTokenizer(TokenStream categoryStream)
Get a category list tokenizer (or a series of such tokenizers) to create the category list tokens. |
protected CategoryTokenizer |
CategoryDocumentBuilder.getCategoryTokenizer(TokenStream categoryStream)
Get a CategoryTokenizer to create the category tokens. |
protected CountingListTokenizer |
CategoryDocumentBuilder.getCountingListTokenizer(TokenStream categoryStream)
Get a CountingListTokenizer for creating counting list token. |
Uses of TokenStream in org.apache.lucene.facet.index.streaming |
---|
Subclasses of TokenStream in org.apache.lucene.facet.index.streaming | |
---|---|
class |
CategoryAttributesStream
An attribute stream built from an Iterable of
CategoryAttribute . |
class |
CategoryListTokenizer
A base class for category list tokenizers, which add category list tokens to category streams. |
class |
CategoryParentsStream
This class adds parents to a CategoryAttributesStream . |
class |
CategoryTokenizer
Basic class for setting the CharTermAttribute s and
PayloadAttribute s of category tokens. |
class |
CategoryTokenizerBase
A base class for all token filters which add term and payload attributes to tokens and are to be used in CategoryDocumentBuilder . |
class |
CountingListTokenizer
CategoryListTokenizer for facet counting |
Constructors in org.apache.lucene.facet.index.streaming with parameters of type TokenStream | |
---|---|
CategoryListTokenizer(TokenStream input,
FacetIndexingParams indexingParams)
|
|
CategoryTokenizer(TokenStream input,
FacetIndexingParams indexingParams)
|
|
CategoryTokenizerBase(TokenStream input,
FacetIndexingParams indexingParams)
Constructor. |
|
CountingListTokenizer(TokenStream input,
FacetIndexingParams indexingParams)
|
Uses of TokenStream in org.apache.lucene.index.memory |
---|
Methods in org.apache.lucene.index.memory that return TokenStream | |
---|---|
TokenStream |
MemoryIndex.keywordTokenStream(Collection<T> keywords)
Convenience method; Creates and returns a token stream that generates a token for each keyword in the given collection, "as is", without any transforming text analysis. |
Methods in org.apache.lucene.index.memory with parameters of type TokenStream | |
---|---|
void |
MemoryIndex.addField(String fieldName,
TokenStream stream)
Equivalent to addField(fieldName, stream, 1.0f) . |
void |
MemoryIndex.addField(String fieldName,
TokenStream stream,
float boost)
Iterates over the given token stream and adds the resulting terms to the index; Equivalent to adding a tokenized, indexed, termVectorStored, unstored, Lucene Field . |
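A sketch of the single-document workflow, assuming Lucene 3.x; analyzer and parser (a QueryParser) are placeholders:

```java
// Build a one-document in-memory index and score a query against it.
MemoryIndex index = new MemoryIndex();
index.addField("content", analyzer.tokenStream("content",
    new StringReader("Readings about Salmons and other select Alaska fishing Manuals")));
index.addField("author", analyzer.tokenStream("author",
    new StringReader("Tales of James")));
float score = index.search(parser.parse("+author:james +salmon~ +fish* manual~"));
// score > 0.0f means the document matches the query.
```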
Uses of TokenStream in org.apache.lucene.queryParser |
---|
Subclasses of TokenStream in org.apache.lucene.queryParser | |
---|---|
static class |
QueryParserTestBase.QPTestFilter
Filter which discards the token 'stop' and which expands the token 'phrase' into 'phrase1 phrase2' |
Methods in org.apache.lucene.queryParser that return TokenStream | |
---|---|
TokenStream |
QueryParserTestBase.QPTestAnalyzer.tokenStream(String fieldName,
Reader reader)
|
Constructors in org.apache.lucene.queryParser with parameters of type TokenStream | |
---|---|
QueryParserTestBase.QPTestFilter(TokenStream in)
|
Uses of TokenStream in org.apache.lucene.search.highlight |
---|
Subclasses of TokenStream in org.apache.lucene.search.highlight | |
---|---|
class |
OffsetLimitTokenFilter
This TokenFilter limits the number of tokens while indexing by adding up the current offset. |
class |
TokenStreamFromTermPositionVector
|
Methods in org.apache.lucene.search.highlight that return TokenStream | |
---|---|
static TokenStream |
TokenSources.getAnyTokenStream(IndexReader reader,
int docId,
String field,
Analyzer analyzer)
A convenience method that tries a number of approaches to getting a token stream. |
static TokenStream |
TokenSources.getAnyTokenStream(IndexReader reader,
int docId,
String field,
Document doc,
Analyzer analyzer)
A convenience method that tries to first get a TermPositionVector for the specified docId, then, falls back to using the passed in Document to retrieve the TokenStream. |
TokenStream |
WeightedSpanTermExtractor.getTokenStream()
|
static TokenStream |
TokenSources.getTokenStream(Document doc,
String field,
Analyzer analyzer)
|
static TokenStream |
TokenSources.getTokenStream(IndexReader reader,
int docId,
String field)
|
static TokenStream |
TokenSources.getTokenStream(IndexReader reader,
int docId,
String field,
Analyzer analyzer)
|
static TokenStream |
TokenSources.getTokenStream(String field,
String contents,
Analyzer analyzer)
|
static TokenStream |
TokenSources.getTokenStream(TermPositionVector tpv)
|
static TokenStream |
TokenSources.getTokenStream(TermPositionVector tpv,
boolean tokenPositionsGuaranteedContiguous)
Low level api. |
TokenStream |
QueryScorer.init(TokenStream tokenStream)
|
TokenStream |
QueryTermScorer.init(TokenStream tokenStream)
|
TokenStream |
Scorer.init(TokenStream tokenStream)
Called to init the Scorer with a TokenStream . |
Methods in org.apache.lucene.search.highlight with parameters of type TokenStream | |
---|---|
String |
Highlighter.getBestFragment(TokenStream tokenStream,
String text)
Highlights chosen terms in a text, extracting the most relevant section. |
String[] |
Highlighter.getBestFragments(TokenStream tokenStream,
String text,
int maxNumFragments)
Highlights chosen terms in a text, extracting the most relevant sections. |
String |
Highlighter.getBestFragments(TokenStream tokenStream,
String text,
int maxNumFragments,
String separator)
Highlights terms in the text , extracting the most relevant sections and concatenating the chosen fragments with a separator (typically "..."). |
TextFragment[] |
Highlighter.getBestTextFragments(TokenStream tokenStream,
String text,
boolean mergeContiguousFragments,
int maxNumFragments)
Low level api to get the most relevant (formatted) sections of the document. |
Map<String,WeightedSpanTerm> |
WeightedSpanTermExtractor.getWeightedSpanTerms(Query query,
TokenStream tokenStream)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream . |
Map<String,WeightedSpanTerm> |
WeightedSpanTermExtractor.getWeightedSpanTerms(Query query,
TokenStream tokenStream,
String fieldName)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream . |
Map<String,WeightedSpanTerm> |
WeightedSpanTermExtractor.getWeightedSpanTermsWithScores(Query query,
TokenStream tokenStream,
String fieldName,
IndexReader reader)
Creates a Map of WeightedSpanTerms from the given Query and TokenStream . |
TokenStream |
QueryScorer.init(TokenStream tokenStream)
|
TokenStream |
QueryTermScorer.init(TokenStream tokenStream)
|
TokenStream |
Scorer.init(TokenStream tokenStream)
Called to init the Scorer with a TokenStream . |
void |
NullFragmenter.start(String s,
TokenStream tokenStream)
|
void |
Fragmenter.start(String originalText,
TokenStream tokenStream)
Initializes the Fragmenter. |
void |
SimpleSpanFragmenter.start(String originalText,
TokenStream tokenStream)
|
void |
SimpleFragmenter.start(String originalText,
TokenStream stream)
|
Constructors in org.apache.lucene.search.highlight with parameters of type TokenStream | |
---|---|
OffsetLimitTokenFilter(TokenStream input,
int offsetLimit)
|
|
TokenGroup(TokenStream tokenStream)
|
|
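A sketch of the typical highlighting workflow, assuming Lucene 3.x; query, analyzer, and text are placeholders:

```java
// Produce the best "keyword in context" snippet for a query over some text.
QueryScorer scorer = new QueryScorer(query);
Highlighter highlighter = new Highlighter(scorer);                    // default HTML formatter
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 100)); // ~100-char fragments
TokenStream ts = analyzer.tokenStream("content", new StringReader(text));
String fragment = highlighter.getBestFragment(ts, text);
```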