org.apache.lucene.analysis.ar
Class ArabicLetterTokenizer
java.lang.Object
  
org.apache.lucene.util.AttributeSource
      
org.apache.lucene.analysis.TokenStream
          
org.apache.lucene.analysis.Tokenizer
              
org.apache.lucene.analysis.util.CharTokenizer
                  
org.apache.lucene.analysis.core.LetterTokenizer
                      
org.apache.lucene.analysis.ar.ArabicLetterTokenizer
- All Implemented Interfaces: 
 - Closeable
 
Deprecated. (3.1) Use StandardTokenizer instead.
@Deprecated
public class ArabicLetterTokenizer
- extends LetterTokenizer
 
Tokenizer that breaks text into runs of letters and diacritics.
 
 The standard LetterTokenizer fails on diacritics: it treats them as non-letters and ends the token at each one.
 Similar handling is necessary for Indic scripts, Hebrew, Thaana, etc.
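
 The difference can be illustrated without Lucene. The following is a minimal sketch using only JDK classes; the class DiacriticRuns and its methods are hypothetical names for this illustration and are not part of the Lucene API. It assumes that "letter" means Character.isLetter and that "diacritic" means the Unicode nonspacing mark category.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits text into runs of token characters.
class DiacriticRuns {

    // Letters only: the rule that breaks a word at every diacritic.
    static boolean isLetterOnly(int c) {
        return Character.isLetter(c);
    }

    // Letters plus nonspacing marks (Unicode general category Mn).
    static boolean isLetterOrMark(int c) {
        return Character.isLetter(c)
                || Character.getType(c) == Character.NON_SPACING_MARK;
    }

    // Collects maximal runs of accepted code points.
    static List<String> split(String text, boolean keepMarks) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            boolean tokenChar = keepMarks ? isLetterOrMark(c) : isLetterOnly(c);
            if (tokenChar) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Arabic kaf, teh, beh, each followed by a fatha (a nonspacing mark).
        String word = "\u0643\u064E\u062A\u064E\u0628\u064E";
        System.out.println(split(word, false)); // three one-letter tokens
        System.out.println(split(word, true));  // one token, the whole word
    }
}
```

 With the letters-only rule the vocalized word is cut into three fragments; with nonspacing marks accepted it stays a single token.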
 
 
 
 You must specify the required Version compatibility (the matchVersion constructor argument) when creating
 ArabicLetterTokenizer.
 
 
 
 
| Fields inherited from class org.apache.lucene.analysis.Tokenizer | 
input | 
 
 
| Method Summary | 
protected boolean isTokenChar(int c) | 
          Deprecated. Allows for Letter category or NonspacingMark category | 
 
 
 
| Methods inherited from class org.apache.lucene.util.AttributeSource | 
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState | 
 
 
ArabicLetterTokenizer
public ArabicLetterTokenizer(Version matchVersion,
                             Reader in)
- Deprecated. 
- Construct a new ArabicLetterTokenizer.
- Parameters:
 matchVersion - Lucene version to match (see above)
 in - the input to split up into tokens
  
ArabicLetterTokenizer
public ArabicLetterTokenizer(Version matchVersion,
                             AttributeSource source,
                             Reader in)
- Deprecated. 
- Construct a new ArabicLetterTokenizer using a given AttributeSource.
- Parameters:
 matchVersion - Lucene version to match (see above)
 source - the attribute source to use for this Tokenizer
 in - the input to split up into tokens
  
ArabicLetterTokenizer
public ArabicLetterTokenizer(Version matchVersion,
                             AttributeSource.AttributeFactory factory,
                             Reader in)
- Deprecated. 
- Construct a new ArabicLetterTokenizer using a given AttributeSource.AttributeFactory.
- Parameters:
 matchVersion - Lucene version to match (see above)
 factory - the attribute factory to use for this Tokenizer
 in - the input to split up into tokens
  
isTokenChar
protected boolean isTokenChar(int c)
- Deprecated. 
- Allows for Letter category or NonspacingMark category
- Overrides:
 isTokenChar in class LetterTokenizer
 
- See Also:
 LetterTokenizer.isTokenChar(int)
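
 The documented rule, "Letter category or NonspacingMark category", can be checked per code point with JDK calls alone. The sketch below is an assumption about how such a rule could be expressed, not a copy of the Lucene source; the class name TokenCharRule and the chosen sample characters are illustrative only.

```java
// Hypothetical stand-alone check, not Lucene code: mirrors the documented rule
// "Letter category or NonspacingMark category" using only java.lang.Character.
class TokenCharRule {

    static boolean isTokenChar(int c) {
        return Character.isLetter(c)
                || Character.getType(c) == Character.NON_SPACING_MARK;
    }

    public static void main(String[] args) {
        int[] samples = {
            'a',     // Latin letter
            0x0628,  // ARABIC LETTER BEH
            0x064E,  // ARABIC FATHA, a nonspacing mark
            0x05B8,  // HEBREW POINT QAMATS, a nonspacing mark
            '1',     // decimal digit
            ' ',     // space
            0x0903   // DEVANAGARI SIGN VISARGA, a spacing combining mark
        };
        for (int c : samples) {
            // Prints the code point and whether the rule accepts it.
            System.out.printf("U+%04X %s%n", c, isTokenChar(c));
        }
    }
}
```

 Under this sketch the letters and the two nonspacing marks are accepted, while the digit, the space, and the spacing combining mark are not, since only the nonspacing mark category is added to the letter test.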
 
  
          Copyright © 2000-2012 Apache Software Foundation.  All Rights Reserved.