java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
        org.apache.lucene.analysis.ngram.EdgeNGramTokenizer
public final class EdgeNGramTokenizer extends Tokenizer
Tokenizes the input from an edge into n-grams of the given size(s).
 This Tokenizer creates n-grams from the beginning edge or ending edge of an input token.
 maxGram cannot be larger than 1024 because of an implementation limitation.
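
To make the description above concrete, here is a minimal plain-Java sketch of edge n-gram generation. It is not Lucene's implementation: the class `EdgeNGramSketch`, its `Side` enum, and the `edgeNGrams` helper are hypothetical names introduced only for illustration, and the argument checks are assumptions about sensible validation rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; this is NOT the Lucene class.
final class EdgeNGramSketch {

    // Hypothetical mirror of the idea behind EdgeNGramTokenizer.Side:
    // the edge of the input that every n-gram is anchored to.
    enum Side { FRONT, BACK }

    // Returns the edge n-grams of sizes minGram..maxGram, smallest first.
    // Sizes longer than the input are skipped.
    static List<String> edgeNGrams(String input, Side side, int minGram, int maxGram) {
        // Assumed validation, chosen for the sketch.
        if (minGram < 1) {
            throw new IllegalArgumentException("minGram must be at least 1");
        }
        if (minGram > maxGram) {
            throw new IllegalArgumentException("minGram must not exceed maxGram");
        }
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, input.length());
        for (int size = minGram; size <= longest; size++) {
            grams.add(side == Side.FRONT
                    ? input.substring(0, size)                  // anchored at the start
                    : input.substring(input.length() - size));  // anchored at the end
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", Side.FRONT, 1, 3)); // [l, lu, luc]
        System.out.println(edgeNGrams("lucene", Side.BACK, 2, 4));  // [ne, ene, cene]
    }
}
```

The sketch treats the whole input string as a single token, which is how the description above reads. Front grams suit prefix matching such as search-as-you-type, while back grams suit suffix matching.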
 
| Nested Class Summary | |
|---|---|
static class | 
EdgeNGramTokenizer.Side
Specifies which side of the input the n-gram should be generated from  | 
| Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource | 
|---|
AttributeSource.AttributeFactory, AttributeSource.State | 
| Field Summary | |
|---|---|
static int | 
DEFAULT_MAX_GRAM_SIZE
 | 
static int | 
DEFAULT_MIN_GRAM_SIZE
 | 
static EdgeNGramTokenizer.Side | 
DEFAULT_SIDE
 | 
| Fields inherited from class org.apache.lucene.analysis.Tokenizer | 
|---|
input | 
| Constructor Summary | |
|---|---|
EdgeNGramTokenizer(AttributeSource.AttributeFactory factory,
                   Reader input,
                   EdgeNGramTokenizer.Side side,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
EdgeNGramTokenizer(AttributeSource.AttributeFactory factory,
                   Reader input,
                   String sideLabel,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
EdgeNGramTokenizer(AttributeSource source,
                   Reader input,
                   EdgeNGramTokenizer.Side side,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
EdgeNGramTokenizer(AttributeSource source,
                   Reader input,
                   String sideLabel,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
EdgeNGramTokenizer(Reader input,
                   EdgeNGramTokenizer.Side side,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
EdgeNGramTokenizer(Reader input,
                   String sideLabel,
                   int minGram,
                   int maxGram)
Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range  | 
|
| Method Summary | |
|---|---|
 void | 
end()
 | 
 boolean | 
incrementToken()
Advances the stream to the next token; returns false at the end of the stream.  | 
 void | 
reset()
 | 
| Methods inherited from class org.apache.lucene.analysis.Tokenizer | 
|---|
close, correctOffset, setReader | 
| Methods inherited from class org.apache.lucene.util.AttributeSource | 
|---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState | 
| Methods inherited from class java.lang.Object | 
|---|
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait | 
| Field Detail | 
|---|
public static final EdgeNGramTokenizer.Side DEFAULT_SIDE
public static final int DEFAULT_MAX_GRAM_SIZE
public static final int DEFAULT_MIN_GRAM_SIZE
| Constructor Detail | 
|---|
public EdgeNGramTokenizer(Reader input,
                          EdgeNGramTokenizer.Side side,
                          int minGram,
                          int maxGram)
Parameters:
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public EdgeNGramTokenizer(AttributeSource source,
                          Reader input,
                          EdgeNGramTokenizer.Side side,
                          int minGram,
                          int maxGram)
Parameters:
source - AttributeSource to use
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public EdgeNGramTokenizer(AttributeSource.AttributeFactory factory,
                          Reader input,
                          EdgeNGramTokenizer.Side side,
                          int minGram,
                          int maxGram)
Parameters:
factory - AttributeSource.AttributeFactory to use
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public EdgeNGramTokenizer(Reader input,
                          String sideLabel,
                          int minGram,
                          int maxGram)
Parameters:
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public EdgeNGramTokenizer(AttributeSource source,
                          Reader input,
                          String sideLabel,
                          int minGram,
                          int maxGram)
Parameters:
source - AttributeSource to use
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public EdgeNGramTokenizer(AttributeSource.AttributeFactory factory,
                          Reader input,
                          String sideLabel,
                          int minGram,
                          int maxGram)
Parameters:
factory - AttributeSource.AttributeFactory to use
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
| Method Detail | 
|---|
public boolean incrementToken()
                       throws IOException
incrementToken in class TokenStream
Throws:
IOException
public void end()
end in class TokenStream
public void reset()
           throws IOException
reset in class TokenStream
Throws:
IOException
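
Because incrementToken() is declared to return a boolean, callers advance the stream in a loop and stop when it returns false. The sketch below models that loop in plain Java. `EdgeGramCursor`, `term()`, and `drain` are hypothetical names invented for this illustration; the class is a stand-in and not part of Lucene, and it only emits front-edge grams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that imitates a boolean-returning incrementToken();
// it is NOT the Lucene TokenStream API.
final class EdgeGramCursor {
    private final String input;
    private final int minGram;
    private final int maxGram;
    private int nextSize;      // size of the gram the next call will produce
    private String term = "";  // the current gram, valid after a true result

    EdgeGramCursor(String input, int minGram, int maxGram) {
        this.input = input;
        this.minGram = minGram;
        this.maxGram = maxGram;
        this.nextSize = minGram;
    }

    // Rewinds the cursor so the grams can be produced again from the start.
    void reset() {
        nextSize = minGram;
        term = "";
    }

    // Advances to the next front-edge gram; returns false once exhausted.
    boolean incrementToken() {
        if (nextSize > maxGram || nextSize > input.length()) {
            return false;
        }
        term = input.substring(0, nextSize);
        nextSize++;
        return true;
    }

    String term() {
        return term;
    }

    // Typical consumer loop: reset, then pull grams until false is returned.
    static List<String> drain(EdgeGramCursor cursor) {
        List<String> out = new ArrayList<>();
        cursor.reset();
        while (cursor.incrementToken()) {
            out.add(cursor.term());
        }
        return out;
    }
}
```

With the real class, the same loop shape applies, with the current term read from the stream's attributes rather than from a `term()` accessor.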