org.apache.lucene.analysis
Class StopwordAnalyzerBase

java.lang.Object
  extended by org.apache.lucene.analysis.Analyzer
      extended by org.apache.lucene.analysis.ReusableAnalyzerBase
          extended by org.apache.lucene.analysis.StopwordAnalyzerBase
All Implemented Interfaces:
Closeable
Direct Known Subclasses:
ArabicAnalyzer, ArmenianAnalyzer, BasqueAnalyzer, BrazilianAnalyzer, BulgarianAnalyzer, CatalanAnalyzer, CJKAnalyzer, ClassicAnalyzer, DanishAnalyzer, EnglishAnalyzer, FinnishAnalyzer, FrenchAnalyzer, GalicianAnalyzer, GermanAnalyzer, GreekAnalyzer, HindiAnalyzer, HungarianAnalyzer, IndonesianAnalyzer, IrishAnalyzer, ItalianAnalyzer, JapaneseAnalyzer, LatvianAnalyzer, NorwegianAnalyzer, PersianAnalyzer, PolishAnalyzer, PortugueseAnalyzer, RomanianAnalyzer, RussianAnalyzer, SpanishAnalyzer, StandardAnalyzer, StopAnalyzer, SwedishAnalyzer, ThaiAnalyzer, TurkishAnalyzer, UAX29URLEmailAnalyzer

public abstract class StopwordAnalyzerBase
extends ReusableAnalyzerBase

Base class for Analyzers that need to make use of stopword sets.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
ReusableAnalyzerBase.TokenStreamComponents
 
Field Summary
protected  Version matchVersion
          The Lucene version for cross version compatibility
protected  CharArraySet stopwords
          An immutable stopword set
 
Constructor Summary
protected StopwordAnalyzerBase(Version version)
          Creates a new Analyzer with an empty stopword set
protected StopwordAnalyzerBase(Version version, Set<?> stopwords)
          Creates a new instance initialized with the given stopword set
 
Method Summary
 Set<?> getStopwordSet()
          Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment)
          Creates a CharArraySet from a file resource associated with a class.
protected static CharArraySet loadStopwordSet(File stopwords, Version matchVersion)
          Creates a CharArraySet from a file.
protected static CharArraySet loadStopwordSet(Reader stopwords, Version matchVersion)
          Creates a CharArraySet from a Reader.
 
Methods inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
createComponents, initReader, reusableTokenStream, tokenStream
 
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, setPreviousTokenStream
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

stopwords

protected final CharArraySet stopwords
An immutable stopword set


matchVersion

protected final Version matchVersion
The Lucene version for cross version compatibility

Constructor Detail

StopwordAnalyzerBase

protected StopwordAnalyzerBase(Version version,
                               Set<?> stopwords)
Creates a new instance initialized with the given stopword set

Parameters:
version - the Lucene version for cross version compatibility
stopwords - the analyzer's stopword set

StopwordAnalyzerBase

protected StopwordAnalyzerBase(Version version)
Creates a new Analyzer with an empty stopword set

Parameters:
version - the Lucene version for cross version compatibility

Method Detail

getStopwordSet

public Set<?> getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords

Returns:
the analyzer's stopword set or an empty set if the analyzer has no stopwords
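
The contract described so far (an immutable stopword set, an empty set when no stopwords are supplied, and a getter that always returns a set) can be illustrated with a small JDK-only sketch. StopwordHolderSketch is a hypothetical stand-in, not a Lucene class. It uses a plain Set<String> instead of CharArraySet and omits the Version argument. Treating a null argument as "no stopwords" is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, not part of Lucene: it only mirrors the documented contract.
class StopwordHolderSketch {
    // Fixed at construction time, like the protected final `stopwords` field.
    private final Set<String> stopwords;

    // Mirrors the single-argument constructor: the analyzer has no stopwords.
    StopwordHolderSketch() {
        this(Collections.<String>emptySet());
    }

    // Mirrors the constructor that takes a set: copy defensively, then wrap the
    // copy so later changes by the caller cannot leak into the analyzer.
    StopwordHolderSketch(Set<String> given) {
        if (given == null || given.isEmpty()) {
            // Assumption of this sketch: null is treated like an empty set.
            this.stopwords = Collections.emptySet();
        } else {
            this.stopwords = Collections.unmodifiableSet(new HashSet<String>(given));
        }
    }

    // Never returns null: callers get the stopword set or an empty set.
    Set<String> getStopwordSet() {
        return stopwords;
    }
}
```

Because the set is copied and wrapped, a caller that keeps a reference to the original set cannot change the holder's behaviour after construction, and any attempt to modify the returned set throws UnsupportedOperationException.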

loadStopwordSet

protected static CharArraySet loadStopwordSet(boolean ignoreCase,
                                              Class<? extends ReusableAnalyzerBase> aClass,
                                              String resource,
                                              String comment)
                                       throws IOException
Creates a CharArraySet from a file resource associated with a class (see Class.getResourceAsStream(String)).

Parameters:
ignoreCase - true if the set should ignore the case of the stopwords, otherwise false
aClass - a class that is associated with the given resource
resource - name of the resource file associated with the given class
comment - comment string to ignore in the stopword file
Returns:
a CharArraySet containing the distinct stopwords from the given file
Throws:
IOException - if loading the stopwords throws an IOException
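
To make the roles of the comment and ignoreCase parameters concrete, here is a JDK-only sketch of a word-list loader. CommentedWordlistSketch and its load method are hypothetical names, not Lucene API. The sketch assumes one stopword per line, assumes that a line is skipped when it starts with the comment string, and emulates ignoreCase by lower-casing entries at load time (CharArraySet handles case-insensitivity itself).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, not part of Lucene.
class CommentedWordlistSketch {
    static Set<String> load(Reader in, String comment, boolean ignoreCase) throws IOException {
        Set<String> words = new HashSet<String>();
        BufferedReader br = new BufferedReader(in);
        try {
            String line;
            while ((line = br.readLine()) != null) {
                // Assumption: a line is a comment if it starts with the comment string.
                if (comment != null && !comment.isEmpty() && line.startsWith(comment)) {
                    continue;
                }
                String word = line.trim();
                if (word.isEmpty()) {
                    continue; // skip blank lines
                }
                // Lower-casing stands in for a case-insensitive set; lookups
                // against the result must then be lower-cased as well.
                words.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
            }
        } finally {
            br.close();
        }
        return Collections.unmodifiableSet(words);
    }
}
```

With a real classpath resource, the Reader would typically wrap the stream returned by aClass.getResourceAsStream(resource) in an InputStreamReader with an explicit charset.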

loadStopwordSet

protected static CharArraySet loadStopwordSet(File stopwords,
                                              Version matchVersion)
                                       throws IOException
Creates a CharArraySet from a file.

Parameters:
stopwords - the stopwords file to load
matchVersion - the Lucene version for cross version compatibility
Returns:
a CharArraySet containing the distinct stopwords from the given file
Throws:
IOException - if loading the stopwords throws an IOException

loadStopwordSet

protected static CharArraySet loadStopwordSet(Reader stopwords,
                                              Version matchVersion)
                                       throws IOException
Creates a CharArraySet from a Reader.

Parameters:
stopwords - the stopwords reader to load
matchVersion - the Lucene version for cross version compatibility
Returns:
a CharArraySet containing the distinct stopwords from the given reader
Throws:
IOException - if loading the stopwords throws an IOException
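
The File and Reader overloads differ only in where the characters come from. The JDK-only sketch below shows one way to structure that relationship: the file variant opens a reader and delegates to the reader variant. FileAndReaderLoaderSketch is a hypothetical name, not Lucene API. Decoding the file as UTF-8, closing the reader after loading, and using java.nio.file.Path instead of File are assumptions of this sketch, and the Version argument is omitted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in, not part of Lucene.
class FileAndReaderLoaderSketch {
    // Reader variant: one word per line; duplicates collapse into one entry.
    // The reader is closed once loading finishes (assumption of this sketch).
    static Set<String> load(Reader stopwords) throws IOException {
        Set<String> words = new LinkedHashSet<String>();
        try (BufferedReader br = new BufferedReader(stopwords)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return Collections.unmodifiableSet(words);
    }

    // File variant: decode as UTF-8 (assumption) and delegate to the reader variant.
    static Set<String> load(Path stopwords) throws IOException {
        return load(Files.newBufferedReader(stopwords, StandardCharsets.UTF_8));
    }
}
```

Keeping the parsing logic in the reader variant means stopwords can come from a file, a classpath resource, or an in-memory string without duplicating the loop.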