java.lang.Object
  org.apache.lucene.analysis.util.WordlistLoader
public class WordlistLoader
Loader for text files that represent a list of stopwords.
See also: IOUtils to obtain Reader instances

| Method Summary | |
|---|---|
| static List<String> | getLines(InputStream stream, Charset charset): Reads the given stream and returns the non-comment lines containing data, using the given character encoding. |
| static CharArraySet | getSnowballWordSet(Reader reader, CharArraySet result): Reads stopwords from a stopword list in Snowball format. |
| static CharArraySet | getSnowballWordSet(Reader reader, Version matchVersion): Reads stopwords from a stopword list in Snowball format. |
| static CharArrayMap<String> | getStemDict(Reader reader, CharArrayMap<String> result): Reads a stem dictionary. |
| static CharArraySet | getWordSet(Reader reader, CharArraySet result): Reads lines from a Reader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
| static CharArraySet | getWordSet(Reader reader, String comment, CharArraySet result): Reads lines from a Reader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
| static CharArraySet | getWordSet(Reader reader, String comment, Version matchVersion): Reads lines from a Reader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). |
| static CharArraySet | getWordSet(Reader reader, Version matchVersion): Reads lines from a Reader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Method Detail |
|---|
public static CharArraySet getWordSet(Reader reader,
CharArraySet result)
throws IOException
Parameters:
reader - Reader containing the wordlist
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException
public static CharArraySet getWordSet(Reader reader,
Version matchVersion)
throws IOException
Parameters:
reader - Reader containing the wordlist
matchVersion - the Lucene Version
Returns:
CharArraySet with the reader's words
Throws:
IOException
public static CharArraySet getWordSet(Reader reader,
String comment,
Version matchVersion)
throws IOException
Parameters:
reader - Reader containing the wordlist
comment - the string representing a comment
matchVersion - the Lucene Version
Throws:
IOException
public static CharArraySet getWordSet(Reader reader,
String comment,
CharArraySet result)
throws IOException
Parameters:
reader - Reader containing the wordlist
comment - the string representing a comment
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException
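
To make the comment handling concrete, here is a minimal plain-Java sketch of the behavior described for getWordSet(Reader, String, CharArraySet). It is not Lucene's implementation: the class WordSetSketch and the method readWordSet are hypothetical names, and a java.util.Set stands in for CharArraySet. The sketch assumes the caller closes the reader.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-Java illustration only; WordSetSketch and readWordSet are hypothetical names.
final class WordSetSketch {

    // Adds every non-comment line of the reader to result, with leading and
    // trailing whitespace removed, and returns the same set that was passed in.
    static Set<String> readWordSet(Reader reader, String comment, Set<String> result) {
        new BufferedReader(reader).lines()
            .filter(line -> !line.startsWith(comment)) // skip lines that begin with the comment string
            .map(String::trim)                         // omit leading and trailing whitespace
            .forEach(result::add);
        return result;
    }

    public static void main(String[] args) {
        String wordlist = "# English stopwords\n  the \nand\n# more\nof\n";
        Set<String> words = readWordSet(new StringReader(wordlist), "#", new LinkedHashSet<>());
        System.out.println(words); // prints [the, and, of]
    }
}
```

Passing the result set in, as the documented signature does, lets a caller merge several wordlists into one set.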
public static CharArraySet getSnowballWordSet(Reader reader,
CharArraySet result)
throws IOException
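The Snowball format is the following: lines may contain multiple words separated by whitespace, the comment character is the vertical line (|), and lines may contain trailing comments.
Parameters:
reader - Reader containing a Snowball stopword list
result - the CharArraySet to fill with the reader's words
Returns:
CharArraySet with the reader's words
Throws:
IOException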
public static CharArraySet getSnowballWordSet(Reader reader,
Version matchVersion)
throws IOException
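The Snowball format is the following: lines may contain multiple words separated by whitespace, the comment character is the vertical line (|), and lines may contain trailing comments.
Parameters:
reader - Reader containing a Snowball stopword list
matchVersion - the Lucene Version
Returns:
CharArraySet with the reader's words
Throws:
IOException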
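
As an illustration of Snowball-format parsing, the following plain-Java sketch assumes the usual Snowball conventions: several whitespace-separated words per line, and a vertical line (|) that starts a comment running to the end of the line. It is not Lucene's code: SnowballSketch and readSnowballWords are hypothetical names, and a java.util.Set stands in for CharArraySet.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-Java illustration only; SnowballSketch and readSnowballWords are hypothetical names.
final class SnowballSketch {

    // Collects every word found before an optional '|' comment on each line.
    static Set<String> readSnowballWords(Reader reader, Set<String> result) {
        new BufferedReader(reader).lines().forEach(line -> {
            int bar = line.indexOf('|');
            // Keep only the part before the comment character, if there is one.
            String data = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (!data.isEmpty()) {
                for (String word : data.split("\\s+")) {
                    result.add(word);
                }
            }
        });
        return result;
    }

    public static void main(String[] args) {
        String list = " | An English stop word list\ni me my | pronouns\n\nwe our\n";
        Set<String> words = readSnowballWords(new StringReader(list), new LinkedHashSet<>());
        System.out.println(words); // prints [i, me, my, we, our]
    }
}
```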
public static CharArrayMap<String> getStemDict(Reader reader,
CharArrayMap<String> result)
throws IOException
Each line of the stem dictionary contains word\tstem (i.e. two tab-separated words).
Throws:
IOException - If there is a low-level I/O error.
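
The word\tstem line layout can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Lucene's implementation: StemDictSketch and readStemDict are hypothetical names, a java.util.Map stands in for CharArrayMap<String>, and lines without a tab are silently skipped, a choice made for this sketch only.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only; StemDictSketch and readStemDict are hypothetical names.
final class StemDictSketch {

    // Splits each line at the first tab and stores word -> stem in result.
    static Map<String, String> readStemDict(Reader reader, Map<String, String> result) {
        new BufferedReader(reader).lines().forEach(line -> {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) { // assumption of this sketch: ignore lines without a tab
                result.put(parts[0], parts[1]);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        String dict = "running\trun\nmice\tmouse\n";
        Map<String, String> stems = readStemDict(new StringReader(dict), new LinkedHashMap<>());
        System.out.println(stems); // prints {running=run, mice=mouse}
    }
}
```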
public static List<String> getLines(InputStream stream,
Charset charset)
throws IOException
A comment line is any line that starts with the character "#".
Throws:
IOException - If there is a low-level I/O error.
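
The "#" comment rule for getLines can be sketched in plain Java like this. It is an illustration rather than Lucene's implementation: LinesSketch and readLines are hypothetical names, and trimming each line and dropping empty ones are assumptions about what "lines containing data" means.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java illustration only; LinesSketch and readLines are hypothetical names.
final class LinesSketch {

    // Decodes the stream with the given charset and returns the data lines.
    static List<String> readLines(InputStream stream, Charset charset) {
        return new BufferedReader(new InputStreamReader(stream, charset)).lines()
            .filter(line -> !line.startsWith("#")) // a comment line starts with '#'
            .map(String::trim)                     // assumption: surrounding whitespace is dropped
            .filter(line -> !line.isEmpty())       // assumption: empty lines carry no data
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        byte[] bytes = "# header\nfoo\n\n  bar  \n".getBytes(StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println(lines); // prints [foo, bar]
    }
}
```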