org.apache.lucene.facet.taxonomy.writercache.cl2o
Class CompactLabelToOrdinal

java.lang.Object
  extended by org.apache.lucene.facet.taxonomy.writercache.cl2o.LabelToOrdinal
      extended by org.apache.lucene.facet.taxonomy.writercache.cl2o.CompactLabelToOrdinal

public class CompactLabelToOrdinal
extends LabelToOrdinal

This is a very efficient LabelToOrdinal implementation that uses a CharBlockArray to store all labels and a configurable number of HashArrays to reference the labels.

Since the HashArrays don't handle collisions, a CollisionMap is used to store the colliding labels.

This data structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds loadFactor * LabelToOrdinal.getMaxOrdinal(). Growing also includes reinserting all colliding labels into the HashArrays to possibly reduce the number of collisions. For setting the loadFactor see CompactLabelToOrdinal(int, float, int).
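
The growth rule above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Lucene implementation: the class `GrowthRuleSketch`, its fields, and the use of plain `String` labels in place of `CategoryPath` are all hypothetical. Every hash array here has the same size and uses the same hash function, and each label is assumed to be added only once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the growth rule; hypothetical, not the Lucene code. */
public class GrowthRuleSketch {
    private final int slotsPerArray;
    private final float loadFactor;
    // One label array and one ordinal array per "hash array".
    private final List<String[]> labelSlots = new ArrayList<>();
    private final List<int[]> ordinalSlots = new ArrayList<>();
    // Labels that found no free slot in any hash array.
    private Map<String, Integer> collisions = new HashMap<>();
    private int maxOrdinal = 0;

    public GrowthRuleSketch(int slotsPerArray, float loadFactor) {
        this.slotsPerArray = slotsPerArray;
        this.loadFactor = loadFactor;
        addHashArray();
    }

    private void addHashArray() {
        labelSlots.add(new String[slotsPerArray]);
        ordinalSlots.add(new int[slotsPerArray]);
    }

    /** Tries each hash array in turn; returns false if the slot is taken in all of them. */
    private boolean tryHashArrays(String label, int ordinal) {
        int slot = Math.floorMod(label.hashCode(), slotsPerArray);
        for (int i = 0; i < labelSlots.size(); i++) {
            if (labelSlots.get(i)[slot] == null) {
                labelSlots.get(i)[slot] = label;
                ordinalSlots.get(i)[slot] = ordinal;
                return true;
            }
        }
        return false;
    }

    /** Adds a label (assumed new) and grows when collisions exceed loadFactor * maxOrdinal. */
    public void add(String label, int ordinal) {
        maxOrdinal = Math.max(maxOrdinal, ordinal);
        if (!tryHashArrays(label, ordinal)) {
            collisions.put(label, ordinal);
        }
        if (collisions.size() > loadFactor * maxOrdinal) {
            grow();
        }
    }

    /** Adds one hash array and reinserts all colliding labels. */
    private void grow() {
        addHashArray();
        Map<String, Integer> old = collisions;
        collisions = new HashMap<>();
        for (Map.Entry<String, Integer> e : old.entrySet()) {
            if (!tryHashArrays(e.getKey(), e.getValue())) {
                collisions.put(e.getKey(), e.getValue());
            }
        }
    }

    public int hashArrayCount() { return labelSlots.size(); }

    public int collisionCount() { return collisions.size(); }
}
```

In this toy model, reinsertion after growth moves at most one colliding label per slot into the new array, which is why growth only "possibly" reduces the number of collisions.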

This data structure has a much lower memory footprint (~30%) compared to a Java HashMap. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. Ingestion was also ~50% faster than with a HashMap for 3M unique labels.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Field Summary
static float DefaultLoadFactor
           
 
Fields inherited from class org.apache.lucene.facet.taxonomy.writercache.cl2o.LabelToOrdinal
counter, InvalidOrdinal
 
Constructor Summary
CompactLabelToOrdinal(int initialCapacity, float loadFactor, int numHashArrays)
           
 
Method Summary
 void addLabel(CategoryPath label, int ordinal)
          Adds a new label if it is not yet in the table.
 void addLabel(CategoryPath label, int prefixLen, int ordinal)
          Adds a new label if it is not yet in the table.
 int getOrdinal(CategoryPath label)
           
 int getOrdinal(CategoryPath label, int prefixLen)
           
 int sizeOfMap()
           
 
Methods inherited from class org.apache.lucene.facet.taxonomy.writercache.cl2o.LabelToOrdinal
getMaxOrdinal, getNextOrdinal
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DefaultLoadFactor

public static final float DefaultLoadFactor
See Also:
Constant Field Values
Constructor Detail

CompactLabelToOrdinal

public CompactLabelToOrdinal(int initialCapacity,
                             float loadFactor,
                             int numHashArrays)
Method Detail

sizeOfMap

public int sizeOfMap()

addLabel

public void addLabel(CategoryPath label,
                     int ordinal)
Description copied from class: LabelToOrdinal
Adds a new label if it is not yet in the table. Throws an IllegalArgumentException if the same label was previously added to this table with a different ordinal.

Specified by:
addLabel in class LabelToOrdinal
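
The contract described above (adding a label again with the same ordinal is a no-op, while adding it with a different ordinal is rejected) can be sketched with a plain HashMap stand-in. The class `AddLabelContractSketch` and its `String` labels are hypothetical and only model the documented behavior; they say nothing about how CompactLabelToOrdinal stores labels internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in that models only the documented addLabel contract. */
public class AddLabelContractSketch {
    private final Map<String, Integer> table = new HashMap<>();

    /**
     * Adds the label if it is not yet in the table. Re-adding with the same
     * ordinal does nothing; a different ordinal triggers an exception.
     */
    public void addLabel(String label, int ordinal) {
        Integer existing = table.putIfAbsent(label, ordinal);
        if (existing != null && existing != ordinal) {
            throw new IllegalArgumentException(
                "Label '" + label + "' already has ordinal " + existing
                    + ", cannot assign " + ordinal);
        }
    }

    public int size() {
        return table.size();
    }
}
```

A caller that may see the same label more than once can therefore call addLabel repeatedly, provided it always passes the ordinal it used the first time.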

addLabel

public void addLabel(CategoryPath label,
                     int prefixLen,
                     int ordinal)
Description copied from class: LabelToOrdinal
Adds a new label if it is not yet in the table. Throws an IllegalArgumentException if the same label was previously added to this table with a different ordinal.

Specified by:
addLabel in class LabelToOrdinal

getOrdinal

public int getOrdinal(CategoryPath label)
Specified by:
getOrdinal in class LabelToOrdinal
Returns:
the ordinal assigned to the given label, or LabelToOrdinal.InvalidOrdinal if the label cannot be found in this table.
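
Because a missing label is reported through a sentinel return value rather than an exception, callers need to check the result before using it. The following is a minimal sketch of that pattern. `OrdinalLookupSketch`, its `String` labels, and the value chosen for `INVALID_ORDINAL` are assumptions made for illustration; real code should compare against LabelToOrdinal.InvalidOrdinal instead of a literal.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in showing the sentinel-check pattern for lookups. */
public class OrdinalLookupSketch {
    // Placeholder value for this sketch only; it is not claimed to match
    // the actual value of LabelToOrdinal.InvalidOrdinal.
    public static final int INVALID_ORDINAL = -1;

    private final Map<String, Integer> table = new HashMap<>();

    public void addLabel(String label, int ordinal) {
        table.put(label, ordinal);
    }

    /** Returns the stored ordinal, or INVALID_ORDINAL if the label is unknown. */
    public int getOrdinal(String label) {
        return table.getOrDefault(label, INVALID_ORDINAL);
    }

    /** Example caller: checks the sentinel before using the ordinal. */
    public static String describe(OrdinalLookupSketch t, String label) {
        int ordinal = t.getOrdinal(label);
        return ordinal == INVALID_ORDINAL
            ? label + " is unknown"
            : label + " -> " + ordinal;
    }
}
```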

getOrdinal

public int getOrdinal(CategoryPath label,
                      int prefixLen)
Specified by:
getOrdinal in class LabelToOrdinal
Returns:
the ordinal assigned to the given label, or LabelToOrdinal.InvalidOrdinal if the label cannot be found in this table.