org.apache.lucene.util.fst
Class FST<T>

java.lang.Object
  extended by org.apache.lucene.util.fst.FST<T>

public final class FST<T>
extends Object

Represents an finite state machine (FST), using a compact byte[] format.

The format is similar to what's used by Morfologik (http://sourceforge.net/projects/morfologik).

NOTE: the FST cannot be larger than ~2.1 GB because it uses int to address the byte[].

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
static class FST.Arc<T>
          Represents a single arc.
static class FST.BytesReader
          Reads the bytes from this FST.
static class FST.INPUT_TYPE
          Specifies allowed range of each int input label for this FST.
 
Field Summary
 int arcCount
           
 int arcWithOutputCount
           
static int END_LABEL
           
 FST.INPUT_TYPE inputType
           
 int nodeCount
           
 Outputs<T> outputs
           
 
Constructor Summary
FST(DataInput in, Outputs<T> outputs)
          Load a previously saved FST.
 
Method Summary
 FST.Arc<T> findTargetArc(int labelToMatch, FST.Arc<T> follow, FST.Arc<T> arc, FST.BytesReader in)
          Finds an arc leaving the incoming arc, replacing the arc in place.
 int getArcCount()
           
 int getArcWithOutputCount()
           
 FST.BytesReader getBytesReader(int pos)
           
 T getEmptyOutput()
           
 FST.Arc<T> getFirstArc(FST.Arc<T> arc)
          Fills virtual 'start' arc, ie, an empty incoming arc to the FST's start node
 FST.INPUT_TYPE getInputType()
           
 int getNodeCount()
           
 FST<T> pack(int minInCountDeref, int maxDerefNodes)
          Expert: creates an FST by packing this one.
static
<T> FST<T>
read(File file, Outputs<T> outputs)
          Reads an automaton from a file.
 FST.Arc<T> readFirstRealTargetArc(int node, FST.Arc<T> arc, FST.BytesReader in)
           
 FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow, FST.Arc<T> arc)
          Follow the follow arc and read the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
 FST.Arc<T> readLastTargetArc(FST.Arc<T> follow, FST.Arc<T> arc)
          Follows the follow arc and reads the last arc of its target; this changes the provided arc (2nd arg) in-place and returns it.
 FST.Arc<T> readNextArc(FST.Arc<T> arc)
          In-place read; returns the arc.
 int readNextArcLabel(FST.Arc<T> arc)
          Peeks at next arc's label; does not alter arc.
 FST.Arc<T> readNextRealArc(FST.Arc<T> arc, FST.BytesReader in)
          Never returns null, but you should never call this if arc.isLast() is true.
 void save(DataOutput out)
           
 void save(File file)
          Writes an automaton to a file.
 void setAllowArrayArcs(boolean v)
           
 int sizeInBytes()
          Returns bytes used to represent the FST
static
<T> boolean
targetHasArcs(FST.Arc<T> arc)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

inputType

public final FST.INPUT_TYPE inputType

outputs

public final Outputs<T> outputs

nodeCount

public int nodeCount

arcCount

public int arcCount

arcWithOutputCount

public int arcWithOutputCount

END_LABEL

public static final int END_LABEL
See Also:
Constant Field Values
Constructor Detail

FST

public FST(DataInput in,
           Outputs<T> outputs)
    throws IOException
Load a previously saved FST.

Throws:
IOException
Method Detail

getInputType

public FST.INPUT_TYPE getInputType()

sizeInBytes

public int sizeInBytes()
Returns bytes used to represent the FST


getEmptyOutput

public T getEmptyOutput()

save

public void save(DataOutput out)
          throws IOException
Throws:
IOException

save

public void save(File file)
          throws IOException
Writes an automaton to a file.

Throws:
IOException

read

public static <T> FST<T> read(File file,
                              Outputs<T> outputs)
                   throws IOException
Reads an automaton from a file.

Throws:
IOException

targetHasArcs

public static <T> boolean targetHasArcs(FST.Arc<T> arc)

getFirstArc

public FST.Arc<T> getFirstArc(FST.Arc<T> arc)
Fills virtual 'start' arc, ie, an empty incoming arc to the FST's start node


readLastTargetArc

public FST.Arc<T> readLastTargetArc(FST.Arc<T> follow,
                                    FST.Arc<T> arc)
                             throws IOException
Follows the follow arc and reads the last arc of its target; this changes the provided arc (2nd arg) in-place and returns it.

Returns:
Returns the second argument (arc).
Throws:
IOException

readFirstTargetArc

public FST.Arc<T> readFirstTargetArc(FST.Arc<T> follow,
                                     FST.Arc<T> arc)
                              throws IOException
Follow the follow arc and read the first arc of its target; this changes the provided arc (2nd arg) in-place and returns it.

Returns:
Returns the second argument (arc).
Throws:
IOException

readFirstRealTargetArc

public FST.Arc<T> readFirstRealTargetArc(int node,
                                         FST.Arc<T> arc,
                                         FST.BytesReader in)
                                  throws IOException
Throws:
IOException

readNextArc

public FST.Arc<T> readNextArc(FST.Arc<T> arc)
                       throws IOException
In-place read; returns the arc.

Throws:
IOException

readNextArcLabel

public int readNextArcLabel(FST.Arc<T> arc)
                     throws IOException
Peeks at next arc's label; does not alter arc. Do not call this if arc.isLast()!

Throws:
IOException

readNextRealArc

public FST.Arc<T> readNextRealArc(FST.Arc<T> arc,
                                  FST.BytesReader in)
                           throws IOException
Never returns null, but you should never call this if arc.isLast() is true.

Throws:
IOException

findTargetArc

public FST.Arc<T> findTargetArc(int labelToMatch,
                                FST.Arc<T> follow,
                                FST.Arc<T> arc,
                                FST.BytesReader in)
                         throws IOException
Finds an arc leaving the incoming arc, replacing the arc in place. This returns null if the arc was not found, else the incoming arc.

Throws:
IOException

getNodeCount

public int getNodeCount()

getArcCount

public int getArcCount()

getArcWithOutputCount

public int getArcWithOutputCount()

setAllowArrayArcs

public void setAllowArrayArcs(boolean v)

getBytesReader

public final FST.BytesReader getBytesReader(int pos)

pack

public FST<T> pack(int minInCountDeref,
                   int maxDerefNodes)
            throws IOException
Expert: creates an FST by packing this one. This process requires substantial additional RAM (currently ~8 bytes per node), but then should produce a smaller FST.

Throws:
IOException