java.lang.Object
  org.apache.lucene.analysis.standard.std31.UAX29URLEmailTokenizerImpl31
@Deprecated public final class UAX29URLEmailTokenizerImpl31
This class implements UAX29URLEmailTokenizer, except with a bug (https://issues.apache.org/jira/browse/LUCENE-3358) where Han and Hiragana characters would be split from combining characters:
Field Summary | |
---|---|
static int | EMAIL_TYPE Deprecated. |
static int | HANGUL_TYPE Deprecated. |
static int | HIRAGANA_TYPE Deprecated. |
static int | IDEOGRAPHIC_TYPE Deprecated. |
static int | KATAKANA_TYPE Deprecated. |
static int | NUMERIC_TYPE Deprecated. Numbers |
static int | SOUTH_EAST_ASIAN_TYPE Deprecated. Chars in class \p{Line_Break = Complex_Context} are from South East Asian scripts (Thai, Lao, Myanmar, Khmer, etc.). |
static int | URL_TYPE Deprecated. |
static int | WORD_TYPE Deprecated. Alphanumeric sequences |
static int | YYEOF Deprecated. This character denotes the end of file |
static int | YYINITIAL Deprecated. lexical states |
Constructor Summary | |
---|---|
UAX29URLEmailTokenizerImpl31(InputStream in) Deprecated. Creates a new scanner. | |
UAX29URLEmailTokenizerImpl31(Reader in) Deprecated. Creates a new scanner. There is also a java.io.InputStream version of this constructor. | |
Method Summary | |
---|---|
int | getNextToken() Deprecated. Resumes scanning until the next regular expression is matched, the end of input is encountered or an I/O-Error occurs. |
void | getText(CharTermAttribute t) Deprecated. Fills CharTermAttribute with the current token text. |
void | yybegin(int newState) Deprecated. Enters a new lexical state |
int | yychar() Deprecated. Returns the current position. |
char | yycharat(int pos) Deprecated. Returns the character at position pos from the matched text. |
void | yyclose() Deprecated. Closes the input stream. |
int | yylength() Deprecated. Returns the length of the matched text region. |
void | yypushback(int number) Deprecated. Pushes the specified amount of characters back into the input stream. |
void | yyreset(Reader reader) Deprecated. Resets the scanner to read from a new input stream. |
int | yystate() Deprecated. Returns the current lexical state. |
String | yytext() Deprecated. Returns the text matched by the current regular expression. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
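The scanner methods summarised above are typically driven in a loop: call getNextToken() repeatedly, read the matched text with yytext(), and stop once YYEOF is returned. The sketch below illustrates that loop only. `TokenScanner`, `WhitespaceScanner`, and the constant values are hypothetical stand-ins invented for this example so that it runs without Lucene on the classpath; the toy scanner splits on whitespace and does not implement the UAX#29 rules of the real class.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ScanLoopSketch {

    /** Hypothetical stand-in for a few of the scanner methods; not the Lucene class. */
    public interface TokenScanner {
        int YYEOF = -1;     // assumed sentinel value, for illustration only
        int WORD_TYPE = 0;  // assumed type code, for illustration only

        int getNextToken() throws IOException;
        String yytext();
        int yylength();
    }

    /** Toy scanner: every run of non-whitespace characters is one WORD_TYPE token. */
    public static final class WhitespaceScanner implements TokenScanner {
        private final Reader in;
        private final StringBuilder current = new StringBuilder();

        public WhitespaceScanner(Reader in) {
            this.in = in;
        }

        @Override
        public int getNextToken() throws IOException {
            current.setLength(0);
            int c;
            // Skip any whitespace before the next token.
            while ((c = in.read()) != -1 && Character.isWhitespace(c)) {
                // nothing to do
            }
            if (c == -1) {
                return YYEOF; // end of input reached
            }
            // Collect characters until whitespace or end of input.
            do {
                current.append((char) c);
            } while ((c = in.read()) != -1 && !Character.isWhitespace(c));
            return WORD_TYPE;
        }

        @Override
        public String yytext() {
            return current.toString();
        }

        @Override
        public int yylength() {
            return current.length();
        }
    }

    /** Drives the scanner until YYEOF and collects the matched text of each token. */
    public static List<String> tokens(String text) throws IOException {
        TokenScanner scanner = new WhitespaceScanner(new StringReader(text));
        List<String> out = new ArrayList<>();
        while (scanner.getNextToken() != TokenScanner.YYEOF) {
            out.add(scanner.yytext());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tokens("mail me at someone@example.com"));
    }
}
```

With the real class, the value returned by getNextToken() would be compared against the type constants listed in the Field Summary (for example EMAIL_TYPE or URL_TYPE) rather than the single toy type used here.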
Field Detail |
---|
public static final int YYEOF
public static final int YYINITIAL
public static final int WORD_TYPE
public static final int NUMERIC_TYPE
public static final int SOUTH_EAST_ASIAN_TYPE
See Unicode Line Breaking Algorithm: http://www.unicode.org/reports/tr14/#SA
public static final int IDEOGRAPHIC_TYPE
public static final int HIRAGANA_TYPE
public static final int KATAKANA_TYPE
public static final int HANGUL_TYPE
public static final int EMAIL_TYPE
public static final int URL_TYPE
Constructor Detail |
---|
public UAX29URLEmailTokenizerImpl31(Reader in)
in
- the java.io.Reader to read input from.
public UAX29URLEmailTokenizerImpl31(InputStream in)
in
- the java.io.InputStream to read input from.
Method Detail |
---|
public final int yychar()
StandardTokenizerInterface
yychar
in interface StandardTokenizerInterface
public final void getText(CharTermAttribute t)
getText
in interface StandardTokenizerInterface
public final void yyclose() throws IOException
IOException
public final void yyreset(Reader reader)
yyreset
in interface StandardTokenizerInterface
reader
- the new input stream
public final int yystate()
public final void yybegin(int newState)
newState
- the new lexical state
public final String yytext()
public final char yycharat(int pos)
pos
- the position of the character to fetch.
A value from 0 to yylength()-1.
public final int yylength()
yylength
in interface StandardTokenizerInterface
public void yypushback(int number)
number
- the number of characters to be read again.
This number must not be greater than yylength()!
public int getNextToken() throws IOException
getNextToken
in interface StandardTokenizerInterface
IOException
- if any I/O-Error occurs
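The constraints documented for yycharat(int) and yypushback(int) can be pictured as index arithmetic over the current match: positions run from 0 to yylength()-1, and pushing back shortens the match so the returned characters are read again by the next match. The sketch below is a toy model under that assumption. `PushbackSketch`, its fields, and its `match` helper are hypothetical names invented for illustration and are not the generated scanner's internals; the exceptions it throws are likewise a choice made for this example.

```java
public class PushbackSketch {
    private final char[] buffer;
    private int startRead;  // index of the first character of the current match
    private int markedPos;  // index just past the last character of the current match

    public PushbackSketch(String input) {
        this.buffer = input.toCharArray();
    }

    /** Toy "match": takes up to n characters starting where the previous match ended. */
    public void match(int n) {
        startRead = markedPos;
        markedPos = Math.min(buffer.length, startRead + n);
    }

    /** Length of the current match. */
    public int yylength() {
        return markedPos - startRead;
    }

    /** Text of the current match. */
    public String yytext() {
        return new String(buffer, startRead, yylength());
    }

    /** Character at pos within the current match; pos must be in 0..yylength()-1. */
    public char yycharat(int pos) {
        if (pos < 0 || pos >= yylength()) {
            throw new IndexOutOfBoundsException("pos must be in 0..yylength()-1");
        }
        return buffer[startRead + pos];
    }

    /** Returns the last `number` matched characters to the input to be read again. */
    public void yypushback(int number) {
        if (number < 0 || number > yylength()) {
            throw new IllegalArgumentException("number must not be greater than yylength()");
        }
        markedPos -= number;
    }

    public static void main(String[] args) {
        PushbackSketch s = new PushbackSketch("abcdef");
        s.match(4);
        System.out.println(s.yytext());
        s.yypushback(2);   // the match shrinks by two characters
        System.out.println(s.yytext());
        s.match(3);        // the pushed-back characters are read again
        System.out.println(s.yytext());
    }
}
```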