java.lang.Object
  org.apache.lucene.search.spell.SpellChecker
public class SpellChecker
Spell Checker class (main class), initially inspired by the David Spencer code.
 
Example Usage:
  SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
  // To index a field of a user index (config: an IndexWriterConfig; false: skip the full merge):
  spellchecker.indexDictionary(new LuceneDictionary(my_lucene_reader, a_field), config, false);
  // To index a file containing words (otherConfig: a second IndexWriterConfig):
  spellchecker.indexDictionary(new PlainTextDictionary(new File("myfile.txt")), otherConfig, false);
  String[] suggestions = spellchecker.suggestSimilar("misspelt", 5);
 
| Field Summary | |
|---|---|
| static float | DEFAULT_ACCURACY - The default minimum score to use, if not specified by calling setAccuracy(float). |
| static String | F_WORD - Field name for each word in the ngram index. |
| Constructor Summary | |
|---|---|
| SpellChecker(Directory spellIndex) | Use the given directory as a spell checker index with a LevensteinDistance as the default StringDistance. |
| SpellChecker(Directory spellIndex, StringDistance sd) | Use the given directory as a spell checker index. |
| SpellChecker(Directory spellIndex, StringDistance sd, Comparator<SuggestWord> comparator) | Use the given directory as a spell checker index with the given StringDistance measure and the given Comparator for sorting the results. |
| Method Summary | |
|---|---|
| void | clearIndex() - Removes all terms from the spell check index. |
| void | close() - Close the IndexSearcher used by this SpellChecker. |
| boolean | exist(String word) - Check whether the word exists in the index. |
| float | getAccuracy() - The accuracy (minimum score) to be used, unless overridden in suggestSimilar(String, int, IndexReader, String, SuggestMode, float), to decide whether a suggestion is included or not. |
| Comparator<SuggestWord> | getComparator() - Gets the comparator in use for ranking suggestions. |
| StringDistance | getStringDistance() - Returns the StringDistance instance used by this SpellChecker instance. |
| void | indexDictionary(Dictionary dict, IndexWriterConfig config, boolean fullMerge) - Indexes the data from the given Dictionary. |
| void | setAccuracy(float acc) - Sets the accuracy (minimum score), 0 < acc < 1; the default is DEFAULT_ACCURACY. |
| void | setComparator(Comparator<SuggestWord> comparator) - Sets the Comparator for the SuggestWordQueue. |
| void | setSpellIndex(Directory spellIndexDir) - Use a different index as the spell checker index or re-open the existing index if spellIndexDir is the same value as given in the constructor. |
| void | setStringDistance(StringDistance sd) - Sets the StringDistance implementation for this SpellChecker instance. |
| String[] | suggestSimilar(String word, int numSug) - Suggest similar words. |
| String[] | suggestSimilar(String word, int numSug, float accuracy) - Suggest similar words. |
| String[] | suggestSimilar(String word, int numSug, IndexReader ir, String field, SuggestMode suggestMode) - Calls suggestSimilar(word, numSug, ir, field, suggestMode, this.accuracy). |
| String[] | suggestSimilar(String word, int numSug, IndexReader ir, String field, SuggestMode suggestMode, float accuracy) - Suggest similar words (optionally restricted to a field of an index). |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail | 
|---|
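public static final float DEFAULT_ACCURACY
The default minimum score to use, if not specified by calling setAccuracy(float).
public static final String F_WORD
Field name for each word in the ngram index.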
| Constructor Detail | 
|---|
public SpellChecker(Directory spellIndex,
                    StringDistance sd)
             throws IOException
Use the given directory as a spell checker index.
Parameters:
spellIndex - the spell index directory
sd - the StringDistance measurement to use
Throws:
IOException - if the SpellChecker cannot open the directory
public SpellChecker(Directory spellIndex)
             throws IOException
Use the given directory as a spell checker index with a LevensteinDistance as the default StringDistance. The
 directory is created if it doesn't exist yet.
Parameters:
spellIndex - the spell index directory
Throws:
IOException - if the SpellChecker cannot open the directory
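
A Levenshtein-based StringDistance presumably turns an edit distance into a score that can be compared against the accuracy threshold. The following is a minimal sketch of that idea in plain Java, not the library's code: the class LevenshteinSketch, its method names, and the normalization 1 - distance / max(length) are assumptions made for illustration.

```java
/** Hypothetical stand-in for a string distance measure; not a Lucene class. */
final class LevenshteinSketch {

    /** Classic dynamic-programming edit distance (insert, delete, substitute). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // cost of building b's prefix from the empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // cost of deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization: 1.0 for identical strings, lower as they diverge. */
    static float similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0f; // two empty strings are treated as identical
        }
        return 1.0f - (float) editDistance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(similarity("misspelt", "misspelled"));
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

With a score in this range, a higher value means a closer match, which is the direction a minimum-score threshold needs.
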
public SpellChecker(Directory spellIndex,
                    StringDistance sd,
                    Comparator<SuggestWord> comparator)
             throws IOException
Use the given directory as a spell checker index with the given StringDistance measure
 and the given Comparator for sorting the results.
Parameters:
spellIndex - The spelling index
sd - The distance
comparator - The comparator
Throws:
IOException - if there is a problem opening the index
| Method Detail | 
|---|
public void setSpellIndex(Directory spellIndexDir)
                   throws IOException
Use a different index as the spell checker index or re-open the existing index if spellIndexDir is the same value
 as given in the constructor.
Parameters:
spellIndexDir - the spell directory to use
Throws:
AlreadyClosedException - if the SpellChecker is already closed
IOException - if the SpellChecker cannot open the directory
public void setComparator(Comparator<SuggestWord> comparator)
Sets the Comparator for the SuggestWordQueue.
Parameters:
comparator - the comparator
public Comparator<SuggestWord> getComparator()
Gets the comparator in use for ranking suggestions.
See Also:
setComparator(Comparator)
public void setStringDistance(StringDistance sd)
Sets the StringDistance implementation for this
 SpellChecker instance.
Parameters:
sd - the StringDistance implementation for this
 SpellChecker instance
public StringDistance getStringDistance()
Returns the StringDistance instance used by this
 SpellChecker instance.
Returns:
the StringDistance instance used by this
 SpellChecker instance.
public void setAccuracy(float acc)
Sets the accuracy (minimum score), 0 < acc < 1; the default is DEFAULT_ACCURACY.
Parameters:
acc - The new accuracy
public float getAccuracy()
The accuracy (minimum score) to be used, unless overridden in suggestSimilar(String, int, IndexReader, String, SuggestMode, float), to
 decide whether a suggestion is included or not.
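
The accuracy and comparator settings above interact: the accuracy decides which candidates qualify, and the comparator decides their order. The sketch below illustrates that with plain Java. The Scored class, its freq field, the BY_SCORE_THEN_FREQ ordering, and the select method are hypothetical and only stand in for SuggestWord and the internal queue; the actual ranking logic is not described on this page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class AccuracyFilterSketch {

    /** Hypothetical scored candidate, standing in for a suggestion entry. */
    static final class Scored {
        final String word;
        final float score; // similarity to the queried word, higher is closer
        final int freq;    // assumed popularity figure, used only as a tie-breaker

        Scored(String word, float score, int freq) {
            this.word = word;
            this.score = score;
            this.freq = freq;
        }
    }

    /** Assumed ordering: best score first, then most frequent first. */
    static final Comparator<Scored> BY_SCORE_THEN_FREQ =
        Comparator.<Scored>comparingDouble(s -> s.score).reversed()
            .thenComparing(Comparator.<Scored>comparingInt(s -> s.freq).reversed());

    /** Keeps candidates whose score reaches the minimum, sorts them, returns at most numSug words. */
    static List<String> select(List<Scored> candidates, float accuracy,
                               Comparator<Scored> order, int numSug) {
        List<Scored> kept = new ArrayList<>();
        for (Scored c : candidates) {
            if (c.score >= accuracy) { // the accuracy is a minimum score
                kept.add(c);
            }
        }
        kept.sort(order);
        List<String> words = new ArrayList<>();
        for (int i = 0; i < kept.size() && i < numSug; i++) {
            words.add(kept.get(i).word);
        }
        return words;
    }
}
```

Swapping in a different Comparator changes only the order of the qualifying candidates, never which candidates pass the threshold.
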
public String[] suggestSimilar(String word,
                               int numSug)
                        throws IOException
Suggest similar words.
The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching word from the hits that Lucene found, so one usually has to request several suggestions in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one; set this value to at least 5 for a good suggestion.
Parameters:
word - the word you want a spell check done on
numSug - the number of suggested words
Throws:
IOException - if the underlying index throws an IOException
AlreadyClosedException - if the SpellChecker is already closed
See Also:
suggestSimilar(String, int, IndexReader, String, SuggestMode, float)
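
The advice about numSug can be illustrated with a small two-stage sketch: candidates are first fetched by one similarity and then re-ranked by edit distance. Everything here is an assumption for illustration: shared-bigram counting stands in for the Lucene similarity, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TwoStageSuggestSketch {

    /** Distinct two-character substrings of a word. */
    static Set<String> bigrams(String w) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= w.length(); i++) {
            grams.add(w.substring(i, i + 2));
        }
        return grams;
    }

    /** Stage-1 similarity (assumed): number of bigrams the two words share. */
    static int overlap(String a, String b) {
        Set<String> shared = bigrams(a);
        shared.retainAll(bigrams(b));
        return shared.size();
    }

    /** Stage-2 measure: Levenshtein edit distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Fetches the top `fetch` words by bigram overlap, then returns the one closest by edit distance. */
    static String best(String word, List<String> dictionary, int fetch) {
        List<String> byOverlap = new ArrayList<>(dictionary);
        byOverlap.sort(Comparator.comparingInt((String c) -> overlap(word, c)).reversed());
        List<String> fetched = new ArrayList<>(byOverlap.subList(0, Math.min(fetch, byOverlap.size())));
        fetched.sort(Comparator.comparingInt((String c) -> editDistance(word, c)));
        return fetched.isEmpty() ? null : fetched.get(0);
    }
}
```

For the query "hause" against the words "because", "house", and "horse", the first stage prefers "because" (three shared bigrams) while the edit distance prefers "house" (one substitution), so fetching a single candidate misses the closer word and fetching several recovers it.
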
public String[] suggestSimilar(String word,
                               int numSug,
                               float accuracy)
                        throws IOException
Suggest similar words.
The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching word from the hits that Lucene found, so one usually has to request several suggestions in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one; set this value to at least 5 for a good suggestion.
Parameters:
word - the word you want a spell check done on
numSug - the number of suggested words
accuracy - The minimum score a suggestion must have in order to qualify for inclusion in the results
Throws:
IOException - if the underlying index throws an IOException
AlreadyClosedException - if the SpellChecker is already closed
See Also:
suggestSimilar(String, int, IndexReader, String, SuggestMode, float)
public String[] suggestSimilar(String word,
                               int numSug,
                               IndexReader ir,
                               String field,
                               SuggestMode suggestMode)
                        throws IOException
Calls suggestSimilar(word, numSug, ir, field, suggestMode, this.accuracy)
Throws:
IOException
public String[] suggestSimilar(String word,
                               int numSug,
                               IndexReader ir,
                               String field,
                               SuggestMode suggestMode,
                               float accuracy)
                        throws IOException
Suggest similar words (optionally restricted to a field of an index).
The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching word from the hits that Lucene found, so one usually has to request several suggestions in order to get the true best match.
That is, if numSug == 1, don't count on that suggestion being the best one; set this value to at least 5 for a good suggestion.
Parameters:
word - the word you want a spell check done on
numSug - the number of suggested words
ir - the IndexReader of the user index (can be null; see the field parameter)
field - the field of the user index: if field is not null, the suggested
 words are restricted to the words present in this field.
suggestMode - (NOTE: if ir == null and/or field == null, then this is overridden with SuggestMode.SUGGEST_ALWAYS)
accuracy - The minimum score a suggestion must have in order to qualify for inclusion in the results
Throws:
IOException - if the underlying index throws an IOException
AlreadyClosedException - if the SpellChecker is already closed
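
The two rules stated for the ir, field, and suggestMode parameters can be sketched in plain Java as follows. This is an illustration under assumptions, not the library's code: the Mode enum is a local stand-in for SuggestMode, the constant SUGGEST_MORE_POPULAR is not mentioned on this page and serves only as an example of a non-default mode, and the field's vocabulary is modeled as a plain Set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class SuggestModeSketch {

    /** Local stand-in for SuggestMode; only SUGGEST_ALWAYS is named in the text above. */
    enum Mode { SUGGEST_ALWAYS, SUGGEST_MORE_POPULAR }

    /** Without both a reader and a field, the requested mode is overridden. */
    static Mode effectiveMode(Object reader, String field, Mode requested) {
        if (reader == null || field == null) {
            return Mode.SUGGEST_ALWAYS;
        }
        return requested;
    }

    /** Restricts suggestions to words present in the field; a null vocabulary means no restriction. */
    static List<String> restrictToField(List<String> suggestions, Set<String> fieldTerms) {
        List<String> kept = new ArrayList<>();
        for (String s : suggestions) {
            if (fieldTerms == null || fieldTerms.contains(s)) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```
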
public void clearIndex()
                throws IOException
Removes all terms from the spell check index.
Throws:
IOException - If there is a low-level I/O error.
AlreadyClosedException - if the SpellChecker is already closed
public boolean exist(String word)
              throws IOException
Check whether the word exists in the index.
Parameters:
word - word to check
Throws:
IOException - If there is a low-level I/O error.
AlreadyClosedException - if the SpellChecker is already closed
public final void indexDictionary(Dictionary dict,
                                  IndexWriterConfig config,
                                  boolean fullMerge)
                           throws IOException
Indexes the data from the given Dictionary.
Parameters:
dict - Dictionary to index
config - IndexWriterConfig to use
fullMerge - whether or not the spellcheck index should be fully merged
Throws:
AlreadyClosedException - if the SpellChecker is already closed
IOException - If there is a low-level I/O error.
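
This page says only that the spell index is an n-gram index with one entry per word (see F_WORD). As a rough illustration of what indexing a dictionary into n-grams can look like, here is a minimal in-memory sketch. The fixed gram size, the map-based layout, and all names are assumptions and do not describe the actual index format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class NGramIndexSketch {
    private final int n;
    // Maps each n-gram to the words that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    NGramIndexSketch(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /** All substrings of length n, in order; empty if the word is shorter than n. */
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    /** Adds one dictionary word under each of its n-grams. */
    void add(String word) {
        for (String gram : ngrams(word, n)) {
            postings.computeIfAbsent(gram, g -> new TreeSet<>()).add(word);
        }
    }

    /** Words containing the given n-gram, or an empty set if none were indexed. */
    Set<String> wordsWith(String gram) {
        return postings.getOrDefault(gram, Collections.<String>emptySet());
    }
}
```

Looking up the n-grams of a misspelt word in such a structure yields candidate words that share fragments with it, which is the set a distance measure can then re-rank.
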
public void close()
           throws IOException
Close the IndexSearcher used by this SpellChecker.
Specified by:
close in interface Closeable
Throws:
IOException - if the close operation causes an IOException
AlreadyClosedException - if the SpellChecker is already closed
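
Most methods above, including close() itself, are documented to fail once the SpellChecker has been closed. A common way to get that behavior is a guard called at the start of each operation. The sketch below shows the pattern in plain Java; the class is hypothetical and IllegalStateException stands in for AlreadyClosedException.

```java
import java.io.Closeable;
import java.util.HashSet;
import java.util.Set;

final class ClosableCheckerSketch implements Closeable {
    private final Set<String> words = new HashSet<>();
    private boolean closed;

    /** Rejects any use after close(); stand-in for an AlreadyClosedException check. */
    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("SpellChecker has been closed");
        }
    }

    void add(String word) {
        ensureOpen();
        words.add(word);
    }

    boolean exist(String word) {
        ensureOpen();
        return words.contains(word);
    }

    /** Closing twice is reported as an error, mirroring the documented behavior. */
    @Override
    public void close() {
        ensureOpen();
        closed = true;
    }

    public static void main(String[] args) {
        // try-with-resources closes the checker exactly once, even on exceptions.
        try (ClosableCheckerSketch checker = new ClosableCheckerSketch()) {
            checker.add("lucene");
            System.out.println(checker.exist("lucene"));
        }
    }
}
```

Because a second close() is an error under this contract, letting a single owner close the instance (for example through try-with-resources) avoids accidental double closes.
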