org.apache.lucene.analysis.in
Class IndicTokenizer
java.lang.Object
  org.apache.lucene.util.AttributeSource
      org.apache.lucene.analysis.TokenStream
          org.apache.lucene.analysis.Tokenizer
              org.apache.lucene.analysis.CharTokenizer
                  org.apache.lucene.analysis.in.IndicTokenizer
- All Implemented Interfaces: 
 - Closeable
 
Deprecated. (3.6) Use StandardTokenizer instead.
@Deprecated
public final class IndicTokenizer
- extends CharTokenizer
 
Simple Tokenizer for text in Indian Languages.
 
 
 
| Fields inherited from class org.apache.lucene.analysis.Tokenizer |
| input |
 
 
Method Summary
| protected boolean | isTokenChar(int c) | Deprecated. Returns true iff a codepoint should be included in a token. |
 
 
 
 
| Methods inherited from class org.apache.lucene.util.AttributeSource | 
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString | 
 
 
IndicTokenizer
public IndicTokenizer(Version matchVersion,
                      AttributeSource.AttributeFactory factory,
                      Reader input)
- Deprecated. 
 
IndicTokenizer
public IndicTokenizer(Version matchVersion,
                      AttributeSource source,
                      Reader input)
- Deprecated. 
 
IndicTokenizer
public IndicTokenizer(Version matchVersion,
                      Reader input)
- Deprecated. 
 
isTokenChar
protected boolean isTokenChar(int c)
- Deprecated. 
- Description copied from class: CharTokenizer
- Returns true iff a codepoint should be included in a token. This tokenizer
 generates as tokens adjacent sequences of codepoints which satisfy this
 predicate. Codepoints for which this is false are used to define token
 boundaries and are not included in tokens.
 
 As of Lucene 3.1 the char-based API (CharTokenizer.isTokenChar(char) and
 CharTokenizer.normalize(char)) has been deprecated in favor of a Unicode 4.0
 compatible int-based API that supports codepoints instead of UTF-16 code
 units. Subclasses of CharTokenizer must not override the char-based
 methods if a Version >= 3.1 is passed to the constructor.
 
 
 NOTE: This method will be marked abstract in Lucene 4.0.
 
- Overrides:
 isTokenChar in class CharTokenizer
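
The isTokenChar contract described above (adjacent codepoints that satisfy the predicate form a token; all other codepoints act as boundaries and are dropped) can be illustrated without Lucene on the classpath. The following is a minimal sketch in plain Java, not the Lucene implementation. The class name CodepointRunSketch, the tokenize helper, and the predicate body are all hypothetical; in particular, treating letters, combining marks, and format characters as token characters is an assumption about what a tokenizer for Indic scripts might accept, since this page does not state the actual rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical stand-alone sketch; it does not use or reproduce Lucene classes.
final class CodepointRunSketch {

    // ASSUMPTION: an Indic-friendly predicate keeps letters plus the combining
    // marks and format characters that attach to them (vowel signs, virama,
    // zero-width joiners). The real IndicTokenizer rule is not given on this page.
    static boolean isTokenChar(int c) {
        int type = Character.getType(c);
        return Character.isLetter(c)
                || type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.FORMAT;
    }

    // Collects maximal runs of codepoints accepted by the predicate.
    // Iterates by codepoint, not by UTF-16 code unit, so supplementary
    // characters are passed to the predicate as a single int.
    static List<String> tokenize(String text, IntPredicate tokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (tokenChar.test(c)) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                // A rejected codepoint closes the current token and is discarded.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Devanagari words separated by a space and a comma, then a Latin word.
        String text = "\u0928\u092E\u0938\u094D\u0924\u0947 \u0926\u0941\u0928\u093F\u092F\u093E, hello";
        for (String token : tokenize(text, CodepointRunSketch::isTokenChar)) {
            System.out.println(token);
        }
    }
}
```

Passing the predicate as an IntPredicate mirrors the int-based API mentioned above: a char-based predicate would see each half of a surrogate pair separately and could split a supplementary character across tokens.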