org.apache.lucene.analysis.hunspell
Class HunspellStemmer

java.lang.Object
  extended by org.apache.lucene.analysis.hunspell.HunspellStemmer

public class HunspellStemmer
extends Object

HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word. It follows the algorithm of the original hunspell implementation, including recursive suffix stripping.
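
The idea of recursive suffix stripping can be illustrated with a small standalone sketch. It is not the Lucene implementation: the class name, the rule table, and the depth limit of 2 are all assumptions made for this example, and real affix rules also carry flags and conditions that are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy illustration only; none of these names belong to the Lucene API.
class SuffixStrippingSketch {
    // Hypothetical suffix rules: suffix to remove -> text to put back in its place.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("s", "");
        RULES.put("ing", "");
        RULES.put("ies", "y");
    }

    // Arbitrary recursion limit chosen for this sketch.
    static final int MAX_DEPTH = 2;

    // Collects every dictionary word reachable by removing up to MAX_DEPTH suffixes.
    static List<String> stem(String word, Set<String> dictionary, int depth) {
        List<String> stems = new ArrayList<>();
        if (dictionary.contains(word)) {
            stems.add(word);
        }
        if (depth >= MAX_DEPTH) {
            return stems;
        }
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            String suffix = rule.getKey();
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                String candidate =
                    word.substring(0, word.length() - suffix.length()) + rule.getValue();
                // Recurse: the candidate may itself carry another suffix.
                stems.addAll(stem(candidate, dictionary, depth + 1));
            }
        }
        return stems;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("walk", "pony"));
        System.out.println(stem("walkings", dictionary, 0)); // prints [walk]
        System.out.println(stem("ponies", dictionary, 0));   // prints [pony]
    }
}
```

In the first call two suffixes are removed in turn ("s", then "ing") before a dictionary word is reached, which is the recursive part of the technique.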


Nested Class Summary
static class HunspellStemmer.Stem
          Stem represents all information known about a stem of a word.
 
Constructor Summary
HunspellStemmer(HunspellDictionary dictionary)
          Constructs a new HunspellStemmer which will use the provided HunspellDictionary to create its stems
 
Method Summary
 List<HunspellStemmer.Stem> applyAffix(char[] strippedWord, int length, HunspellAffix affix, int recursionDepth)
          Applies the affix rule to the given word, producing a list of stems if any are found
static void main(String[] args)
          HunspellStemmer entry point.
 List<HunspellStemmer.Stem> stem(char[] word, int length)
          Find the stem(s) of the provided word
 List<HunspellStemmer.Stem> stem(String word)
          Find the stem(s) of the provided word
 List<HunspellStemmer.Stem> uniqueStems(char[] word, int length)
          Find the unique stem(s) of the provided word
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HunspellStemmer

public HunspellStemmer(HunspellDictionary dictionary)
Constructs a new HunspellStemmer which will use the provided HunspellDictionary to create its stems

Parameters:
dictionary - HunspellDictionary that will be used to create the stems
Method Detail

stem

public List<HunspellStemmer.Stem> stem(String word)
Find the stem(s) of the provided word

Parameters:
word - Word to find the stems for
Returns:
List of stems for the word

stem

public List<HunspellStemmer.Stem> stem(char[] word,
                                       int length)
Find the stem(s) of the provided word

Parameters:
word - Word to find the stems for
length - Number of characters of the word array that make up the word
Returns:
List of stems for the word

uniqueStems

public List<HunspellStemmer.Stem> uniqueStems(char[] word,
                                              int length)
Find the unique stem(s) of the provided word

Parameters:
word - Word to find the stems for
length - Number of characters of the word array that make up the word
Returns:
List of unique stems for the word
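
Different affix rules can lead to the same stem, so a plain list of stems may contain repeats. A minimal sketch of removing them while keeping first-seen order, using plain strings instead of HunspellStemmer.Stem; the class and method names are hypothetical, and the actual comparison used by uniqueStems (for example with respect to case) may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Toy illustration only; not part of the Lucene API.
class UniqueStemsSketch {
    // Drops repeated stems and keeps the order in which they first appeared.
    static List<String> unique(List<String> stems) {
        return new ArrayList<>(new LinkedHashSet<>(stems));
    }

    public static void main(String[] args) {
        List<String> stems = Arrays.asList("walk", "walk", "walker");
        System.out.println(unique(stems)); // prints [walk, walker]
    }
}
```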

applyAffix

public List<HunspellStemmer.Stem> applyAffix(char[] strippedWord,
                                             int length,
                                             HunspellAffix affix,
                                             int recursionDepth)
Applies the affix rule to the given word, producing a list of stems if any are found

Parameters:
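strippedWord - Word from which the affix has been removed and to which the strip has been added
length - Number of characters of the strippedWord array that make up the word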
affix - HunspellAffix representing the affix rule itself
recursionDepth - Level of recursion this stemming step is at
Returns:
List of stems for the word, or an empty list if none are found
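
What "the affix has been removed and the strip added" means can be shown with a standalone sketch of undoing a single suffix rule. A hunspell suffix rule names the text to strip from a stem, the text to append, and a condition the stem must satisfy. The SuffixRule class and the undo method below are hypothetical stand-ins for illustration, not HunspellAffix or applyAffix, and the dictionary lookup and recursion that would follow are omitted.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Toy illustration only; SuffixRule is a hypothetical stand-in, not HunspellAffix.
class ApplySuffixSketch {
    static final class SuffixRule {
        final String strip;      // text removed from the stem when the suffix is added
        final String append;     // text added to the stem
        final Pattern condition; // pattern the stem must satisfy for the rule to apply

        SuffixRule(String strip, String append, String condition) {
            this.strip = strip;
            this.append = append;
            this.condition = Pattern.compile(condition);
        }
    }

    // Reverses the rule: removes the appended text, restores the stripped text,
    // and returns the candidate stem only if it meets the rule's condition.
    static Optional<String> undo(String word, SuffixRule rule) {
        if (word.length() <= rule.append.length() || !word.endsWith(rule.append)) {
            return Optional.empty();
        }
        String stripped =
            word.substring(0, word.length() - rule.append.length()) + rule.strip;
        return rule.condition.matcher(stripped).find()
            ? Optional.of(stripped)
            : Optional.empty();
    }

    public static void main(String[] args) {
        // Rule in the spirit of "SFX <flag> y ies [^aeiou]y".
        SuffixRule rule = new SuffixRule("y", "ies", "[^aeiou]y$");
        System.out.println(undo("ponies", rule)); // prints Optional[pony]
        System.out.println(undo("toies", rule));  // prints Optional.empty
    }
}
```

The second call yields nothing because the candidate "toy" ends in a vowel followed by "y", which the rule's condition excludes.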

main

public static void main(String[] args)
                 throws IOException,
                        ParseException
HunspellStemmer entry point. Accepts two arguments: location of affix file and location of dic file

Parameters:
args - Program arguments. Should contain location of affix file and location of dic file
Throws:
IOException - Can be thrown while reading from the files
ParseException - Can be thrown while parsing the files