Uses of Interface org.apache.lucene.analysis.standard.StandardTokenizerInterface (Lucene 3.6.0 API)

Uses of Interface
org.apache.lucene.analysis.standard.StandardTokenizerInterface

Packages that use StandardTokenizerInterface
org.apache.lucene.analysis.standard	Standards-based analyzers implemented with JFlex.
org.apache.lucene.analysis.standard.std31	Backwards-compatible implementation to match `Version.LUCENE_31`.
org.apache.lucene.analysis.standard.std34	Backwards-compatible implementation to match `Version.LUCENE_34`.

Uses of StandardTokenizerInterface in org.apache.lucene.analysis.standard

Classes in org.apache.lucene.analysis.standard that implement StandardTokenizerInterface
`class`	`StandardTokenizerImpl` This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. Tokens produced are of the following types: `<ALPHANUM>`: a sequence of alphabetic and numeric characters; `<NUM>`: a number; `<SOUTHEAST_ASIAN>`: a sequence of characters from South and Southeast Asian languages, including Thai, Lao, Myanmar, and Khmer; `<IDEOGRAPHIC>`: a single CJKV ideographic character; `<HIRAGANA>`: a single hiragana character.
`class`	`UAX29URLEmailTokenizerImpl` This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
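
The token types listed for `StandardTokenizerImpl` can be illustrated with a small, self-contained sketch. It does not use Lucene and is not the JFlex-generated scanner: it is a deliberately coarse approximation built on the JDK's `Character` and `Character.UnicodeScript` classes, and it ignores the mid-word rules of Unicode Standard Annex #29 (for example, it would split `3.14` and `can't`). The class name `TokenTypeSketch` and the `term:<TYPE>` output format are assumptions made for illustration only.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Lucene API.
final class TokenTypeSketch {

    // Coarse character classes used to decide where tokens start and end.
    private enum CharClass { WORD, SOUTHEAST_ASIAN, IDEOGRAPHIC, HIRAGANA, SEPARATOR }

    private static CharClass classOf(int cp) {
        if (Character.isIdeographic(cp)) {
            return CharClass.IDEOGRAPHIC;
        }
        switch (UnicodeScript.of(cp)) {
            case HIRAGANA:
                return CharClass.HIRAGANA;
            case THAI:
            case LAO:
            case MYANMAR:
            case KHMER:
                return CharClass.SOUTHEAST_ASIAN;
            default:
                return Character.isLetterOrDigit(cp) ? CharClass.WORD : CharClass.SEPARATOR;
        }
    }

    /** Returns tokens formatted as "term:<TYPE>", skipping separators. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            CharClass cls = classOf(cp);
            int start = i;
            i += Character.charCount(cp);
            if (cls == CharClass.SEPARATOR) {
                continue; // whitespace and punctuation produce no token
            }
            // WORD and SOUTHEAST_ASIAN characters form runs;
            // IDEOGRAPHIC and HIRAGANA characters are emitted one at a time.
            if (cls == CharClass.WORD || cls == CharClass.SOUTHEAST_ASIAN) {
                while (i < text.length()) {
                    int next = text.codePointAt(i);
                    if (classOf(next) != cls) {
                        break;
                    }
                    i += Character.charCount(next);
                }
            }
            String term = text.substring(start, i);
            String type;
            if (cls == CharClass.WORD) {
                // A run with at least one letter is ALPHANUM; digits only is NUM.
                boolean hasLetter = term.codePoints().anyMatch(Character::isLetter);
                type = hasLetter ? "<ALPHANUM>" : "<NUM>";
            } else {
                type = "<" + cls.name() + ">";
            }
            tokens.add(term + ":" + type);
        }
        return tokens;
    }
}
```

Calling `TokenTypeSketch.tokenize("Lucene 36")` yields `Lucene:<ALPHANUM>` followed by `36:<NUM>`. Ideographic and hiragana input produces one token per character, while Thai, Lao, Myanmar, and Khmer text is kept together as a single `<SOUTHEAST_ASIAN>` token.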

Uses of StandardTokenizerInterface in org.apache.lucene.analysis.standard.std31

Classes in org.apache.lucene.analysis.standard.std31 that implement StandardTokenizerInterface
`class`	`StandardTokenizerImpl31` Deprecated. This class is only for exact backwards compatibility.
`class`	`UAX29URLEmailTokenizerImpl31` Deprecated. This class is only for exact backwards compatibility.

Uses of StandardTokenizerInterface in org.apache.lucene.analysis.standard.std34

Classes in org.apache.lucene.analysis.standard.std34 that implement StandardTokenizerInterface
`class`	`UAX29URLEmailTokenizerImpl34` Deprecated. This class is only for exact backwards compatibility.
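
The deprecated `std31` and `std34` classes exist so that a tokenizer can reproduce the exact behavior of an earlier release. One way such a selection could look is sketched below. The sketch is self-contained and does not call Lucene: `SketchVersion` and `scannerNameFor` are hypothetical stand-ins, and the version thresholds are assumptions inferred from the package names on this page, not taken from the Lucene source.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class VersionDispatchSketch {

    // Stand-in for a version constant; declared oldest to newest so that
    // compareTo() reflects release order.
    enum SketchVersion {
        LUCENE_30, LUCENE_31, LUCENE_34, LUCENE_36;

        boolean onOrAfter(SketchVersion other) {
            return compareTo(other) >= 0;
        }
    }

    /**
     * Returns the name of the scanner class that would be chosen for the
     * requested compatibility version. The thresholds are assumed.
     */
    static String scannerNameFor(SketchVersion matchVersion) {
        if (matchVersion.onOrAfter(SketchVersion.LUCENE_36)) {
            return "UAX29URLEmailTokenizerImpl";   // current behavior
        } else if (matchVersion.onOrAfter(SketchVersion.LUCENE_34)) {
            return "UAX29URLEmailTokenizerImpl34"; // behavior as of 3.4
        } else {
            return "UAX29URLEmailTokenizerImpl31"; // behavior as of 3.1
        }
    }
}
```

Because every candidate implements the same interface, the calling tokenizer can hold a single scanner reference and stay unaware of which compatibility variant was selected.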