org.apache.lucene.analysis.ja.dict
Class BinaryDictionary

java.lang.Object
  extended by org.apache.lucene.analysis.ja.dict.BinaryDictionary
All Implemented Interfaces:
Dictionary
Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary

public abstract class BinaryDictionary
extends Object
implements Dictionary

Base class for a binary-encoded in-memory dictionary.


Field Summary
static String DICT_FILENAME_SUFFIX
           
static String DICT_HEADER
           
static int HAS_BASEFORM
          Flag indicating that the entry has base form data.
static int HAS_PRONUNCIATION
          Flag indicating that the entry has pronunciation data.
static int HAS_READING
          Flag indicating that the entry has reading data.
static String POSDICT_FILENAME_SUFFIX
           
static String POSDICT_HEADER
           
static String TARGETMAP_FILENAME_SUFFIX
           
static String TARGETMAP_HEADER
           
static int VERSION
           
 
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
 
Constructor Summary
protected BinaryDictionary()
           
 
Method Summary
 String getBaseForm(int wordId, char[] surfaceForm, int off, int len)
          Get base form of word
static InputStream getClassResource(Class<?> clazz, String suffix)
           
 String getInflectionForm(int wordId)
          Get inflection form of tokens
 String getInflectionType(int wordId)
          Get inflection type of tokens
 int getLeftId(int wordId)
          Get left id of specified word
 String getPartOfSpeech(int wordId)
          Get Part-Of-Speech of tokens
 String getPronunciation(int wordId, char[] surface, int off, int len)
          Get pronunciation of tokens
 String getReading(int wordId, char[] surface, int off, int len)
          Get reading of tokens
protected  InputStream getResource(String suffix)
           
 int getRightId(int wordId)
          Get right id of specified word
 int getWordCost(int wordId)
          Get word cost of specified word
 void lookupWordIds(int sourceId, IntsRef ref)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DICT_FILENAME_SUFFIX

public static final String DICT_FILENAME_SUFFIX
See Also:
Constant Field Values

TARGETMAP_FILENAME_SUFFIX

public static final String TARGETMAP_FILENAME_SUFFIX
See Also:
Constant Field Values

POSDICT_FILENAME_SUFFIX

public static final String POSDICT_FILENAME_SUFFIX
See Also:
Constant Field Values

DICT_HEADER

public static final String DICT_HEADER
See Also:
Constant Field Values

TARGETMAP_HEADER

public static final String TARGETMAP_HEADER
See Also:
Constant Field Values

POSDICT_HEADER

public static final String POSDICT_HEADER
See Also:
Constant Field Values

VERSION

public static final int VERSION
See Also:
Constant Field Values

HAS_BASEFORM

public static final int HAS_BASEFORM
Flag indicating that the entry has base form data. Otherwise the word is not inflected (its base form is the same as its surface form).

See Also:
Constant Field Values

HAS_READING

public static final int HAS_READING
Flag indicating that the entry has reading data. Otherwise the reading is the surface form converted to katakana.

See Also:
Constant Field Values

HAS_PRONUNCIATION

public static final int HAS_PRONUNCIATION
Flag indicating that the entry has pronunciation data. Otherwise the pronunciation is the same as the reading.

See Also:
Constant Field Values
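
The fallback rules described for HAS_BASEFORM, HAS_READING and HAS_PRONUNCIATION can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the class name FlagFallbackSketch, the helper methods, and the bit values assigned to the three flags are assumptions made for the example (the real values are listed under Constant Field Values), and the katakana conversion only handles the basic hiragana range.

```java
final class FlagFallbackSketch {
    // Hypothetical bit values, chosen only for this example.
    static final int HAS_BASEFORM = 1;
    static final int HAS_READING = 2;
    static final int HAS_PRONUNCIATION = 4;

    /** Converts hiragana (U+3041..U+3096) to katakana by shifting each char by 0x60. */
    static String toKatakana(char[] text, int off, int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = off; i < off + len; i++) {
            char c = text[i];
            sb.append(c >= '\u3041' && c <= '\u3096' ? (char) (c + 0x60) : c);
        }
        return sb.toString();
    }

    /** Stored base form if the flag is set; otherwise null (the word is not inflected). */
    static String baseForm(int flags, String stored) {
        return (flags & HAS_BASEFORM) != 0 ? stored : null;
    }

    /** Stored reading if the flag is set; otherwise the surface form in katakana. */
    static String reading(int flags, String stored, char[] surface, int off, int len) {
        return (flags & HAS_READING) != 0 ? stored : toKatakana(surface, off, len);
    }

    /** Stored pronunciation if the flag is set; otherwise falls back to the reading. */
    static String pronunciation(int flags, String storedPronunciation, String storedReading,
                                char[] surface, int off, int len) {
        return (flags & HAS_PRONUNCIATION) != 0
                ? storedPronunciation
                : reading(flags, storedReading, surface, off, len);
    }

    public static void main(String[] args) {
        // Hiragana "sushi" written with escapes so the source stays ASCII-only.
        char[] surface = "\u3059\u3057".toCharArray();
        // No flags set: reading and pronunciation both derive from the surface form.
        System.out.println(reading(0, null, surface, 0, surface.length));
        System.out.println(pronunciation(0, null, null, surface, 0, surface.length));
        System.out.println(baseForm(0, null));
    }
}
```

The chain mirrors the field descriptions: pronunciation defers to reading, and reading defers to the surface form, so an entry with no flags set needs no stored strings at all.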
Constructor Detail

BinaryDictionary

protected BinaryDictionary()
                    throws IOException
Throws:
IOException
Method Detail

getResource

protected final InputStream getResource(String suffix)
                                 throws IOException
Throws:
IOException

getClassResource

public static final InputStream getClassResource(Class<?> clazz,
                                                 String suffix)
                                          throws IOException
Throws:
IOException
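
The page gives no description for getClassResource, so the following is only a sketch of one plausible reading of its signature: the resource name is built from the class's simple name plus the suffix, looked up next to the class, and a missing resource is reported as an IOException. The class ResourceLookupSketch, its method names, and the suffix "$example.dat" are hypothetical; only Class.getSimpleName and Class.getResourceAsStream are standard Java calls.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

final class ResourceLookupSketch {
    /** Assumed naming scheme: simple class name followed by the suffix. */
    static String resourceName(Class<?> clazz, String suffix) {
        return clazz.getSimpleName() + suffix;
    }

    /** Opens the resource relative to clazz, or throws if it is not on the classpath. */
    static InputStream open(Class<?> clazz, String suffix) throws IOException {
        String name = resourceName(clazz, suffix);
        InputStream in = clazz.getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Not on classpath: " + name);
        }
        return in;
    }

    public static void main(String[] args) {
        // "$example.dat" is a made-up suffix, so this lookup is expected to fail.
        try (InputStream in = open(ResourceLookupSketch.class, "$example.dat")) {
            System.out.println("opened resource");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing with an exception rather than returning null keeps the declared throws IOException meaningful for callers such as the constructor.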

lookupWordIds

public void lookupWordIds(int sourceId,
                          IntsRef ref)

getLeftId

public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word

Specified by:
getLeftId in interface Dictionary
Returns:
left id

getRightId

public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word

Specified by:
getRightId in interface Dictionary
Returns:
right id

getWordCost

public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word

Specified by:
getWordCost in interface Dictionary
Returns:
word cost

getBaseForm

public String getBaseForm(int wordId,
                          char[] surfaceForm,
                          int off,
                          int len)
Description copied from interface: Dictionary
Get base form of word

Specified by:
getBaseForm in interface Dictionary
Parameters:
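wordId - word ID of token
surfaceForm - character array containing the surface form
off - offset of the surface form in the array
len - length of the surface form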
Returns:
Base form (only different for inflected words, otherwise null)

getReading

public String getReading(int wordId,
                         char[] surface,
                         int off,
                         int len)
Description copied from interface: Dictionary
Get reading of tokens

Specified by:
getReading in interface Dictionary
Parameters:
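wordId - word ID of token
surface - character array containing the surface form
off - offset of the surface form in the array
len - length of the surface form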
Returns:
Reading of the token

getPartOfSpeech

public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens

Specified by:
getPartOfSpeech in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
Part-Of-Speech of the token

getPronunciation

public String getPronunciation(int wordId,
                               char[] surface,
                               int off,
                               int len)
Description copied from interface: Dictionary
Get pronunciation of tokens

Specified by:
getPronunciation in interface Dictionary
Parameters:
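wordId - word ID of token
surface - character array containing the surface form
off - offset of the surface form in the array
len - length of the surface form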
Returns:
Pronunciation of the token

getInflectionType

public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens

Specified by:
getInflectionType in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
inflection type, or null

getInflectionForm

public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens

Specified by:
getInflectionForm in interface Dictionary
Parameters:
wordId - word ID of token
Returns:
inflection form, or null