org.apache.lucene.search
Class DefaultSimilarity

java.lang.Object
  extended by org.apache.lucene.search.Similarity
      extended by org.apache.lucene.search.DefaultSimilarity
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
SweetSpotSimilarity

public class DefaultSimilarity
extends Similarity

Expert: Default scoring implementation.

See Also:
Serialized Form

Field Summary
protected  boolean discountOverlaps
           
 
Fields inherited from class org.apache.lucene.search.Similarity
NO_DOC_ID_PROVIDED
 
Constructor Summary
DefaultSimilarity()
           
 
Method Summary
 float computeNorm(String field, FieldInvertState state)
          Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if discountOverlaps is false, and FieldInvertState.getLength() - FieldInvertState.getNumOverlap() otherwise.
 float coord(int overlap, int maxOverlap)
          Implemented as overlap / maxOverlap.
 boolean getDiscountOverlaps()
          Returns whether overlap tokens are ignored when computing norms.
 float idf(int docFreq, int numDocs)
          Implemented as log(numDocs/(docFreq+1)) + 1.
 float queryNorm(float sumOfSquaredWeights)
          Implemented as 1/sqrt(sumOfSquaredWeights).
 void setDiscountOverlaps(boolean v)
          Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing norms.
 float sloppyFreq(int distance)
          Implemented as 1 / (distance + 1).
 float tf(float freq)
          Implemented as sqrt(freq).
 
Methods inherited from class org.apache.lucene.search.Similarity
decodeNorm, decodeNormValue, encodeNorm, encodeNormValue, getDefault, getNormDecoder, idfExplain, idfExplain, idfExplain, lengthNorm, scorePayload, setDefault, tf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

discountOverlaps

protected boolean discountOverlaps
Constructor Detail

DefaultSimilarity

public DefaultSimilarity()
Method Detail

computeNorm

public float computeNorm(String field,
                         FieldInvertState state)
Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if discountOverlaps is false, and FieldInvertState.getLength() - FieldInvertState.getNumOverlap() otherwise.

Specified by:
computeNorm in class Similarity
Parameters:
field - field name
state - current processing state for this field
Returns:
the calculated float norm
WARNING: This API is experimental and might change in incompatible ways in the next release.
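
The norm computation described above can be illustrated with a minimal standalone sketch. It does not call Lucene: the class name and the plain parameters (standing in for FieldInvertState.getBoost(), getLength() and getNumOverlap()) are hypothetical, and lengthNorm is assumed here to be 1/sqrt(numTerms), which this page does not state.

```java
// Standalone sketch of the norm formula; not the Lucene implementation.
final class ComputeNormSketch {

    // Assumption: length normalization is 1/sqrt(numTerms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // boost, length and numOverlap stand in for the values that
    // FieldInvertState.getBoost(), getLength() and getNumOverlap() would supply.
    static float computeNorm(float boost, int length, int numOverlap,
                             boolean discountOverlaps) {
        // Overlap tokens are subtracted only when discountOverlaps is true.
        int numTerms = discountOverlaps ? length - numOverlap : length;
        return boost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // A field with 20 tokens, 4 of which are overlaps (for example, injected synonyms).
        System.out.println(computeNorm(1.0f, 20, 4, true));  // based on 16 terms
        System.out.println(computeNorm(1.0f, 20, 4, false)); // based on 20 terms
    }
}
```

With discounting enabled, the overlap tokens do not make the field look longer, so its norm is higher than when they are counted.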

queryNorm

public float queryNorm(float sumOfSquaredWeights)
Implemented as 1/sqrt(sumOfSquaredWeights).

Specified by:
queryNorm in class Similarity
Parameters:
sumOfSquaredWeights - the sum of the squares of query term weights
Returns:
a normalization factor for query weights
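
As a rough illustration of the formula above, the following standalone sketch (hypothetical class name, no Lucene calls) computes the factor for two example term weights and checks that the scaled weights have unit Euclidean length.

```java
// Standalone sketch of 1/sqrt(sumOfSquaredWeights); not the Lucene implementation.
final class QueryNormSketch {

    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        // Two illustrative query term weights.
        float w1 = 3.0f;
        float w2 = 4.0f;
        float norm = queryNorm(w1 * w1 + w2 * w2); // 1/sqrt(25)

        // After scaling, the squared weights sum to approximately 1.
        float scaledSum = (w1 * norm) * (w1 * norm) + (w2 * norm) * (w2 * norm);
        System.out.println(norm);
        System.out.println(scaledSum);
    }
}
```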

tf

public float tf(float freq)
Implemented as sqrt(freq).

Specified by:
tf in class Similarity
Parameters:
freq - the frequency of a term within a document
Returns:
a score factor based on a term's within-document frequency
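
The square root dampens the effect of repeated terms. A minimal standalone sketch (hypothetical class name, no Lucene calls) makes the diminishing returns visible:

```java
// Standalone sketch of sqrt(freq); not the Lucene implementation.
final class TfSketch {

    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        // Quadrupling the frequency only doubles the factor.
        for (float freq : new float[] {1.0f, 4.0f, 16.0f}) {
            System.out.println("freq=" + freq + " tf=" + tf(freq));
        }
    }
}
```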

sloppyFreq

public float sloppyFreq(int distance)
Implemented as 1 / (distance + 1).

Specified by:
sloppyFreq in class Similarity
Parameters:
distance - the edit distance of this sloppy phrase match
Returns:
the frequency increment for this match
See Also:
PhraseQuery.setSlop(int)
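
A minimal standalone sketch of this formula follows (hypothetical class name, no Lucene calls). It assumes the division is carried out in floating point; with integer operands, 1 / (distance + 1) would truncate to 0 for any distance above 0.

```java
// Standalone sketch of 1 / (distance + 1); not the Lucene implementation.
final class SloppyFreqSketch {

    static float sloppyFreq(int distance) {
        // 1.0f forces floating-point division.
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        // An exact match (distance 0) contributes a full increment;
        // looser matches contribute progressively less.
        for (int distance = 0; distance <= 3; distance++) {
            System.out.println("distance=" + distance + " increment=" + sloppyFreq(distance));
        }
    }
}
```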

idf

public float idf(int docFreq,
                 int numDocs)
Implemented as log(numDocs/(docFreq+1)) + 1.

Specified by:
idf in class Similarity
Parameters:
docFreq - the number of documents which contain the term
numDocs - the total number of documents in the collection
Returns:
a score factor based on the term's document frequency
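
The formula can be sketched in standalone form as below (hypothetical class name, no Lucene calls). The sketch assumes a natural logarithm and floating-point division, neither of which is spelled out on this page.

```java
// Standalone sketch of log(numDocs/(docFreq+1)) + 1; not the Lucene implementation.
final class IdfSketch {

    static float idf(int docFreq, int numDocs) {
        // Assumptions: natural log, and a cast to double to avoid integer division.
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        // Rare terms receive a larger factor than common ones.
        for (int docFreq : new int[] {9, 99, 999}) {
            System.out.println("docFreq=" + docFreq + " idf=" + idf(docFreq, numDocs));
        }
    }
}
```

The +1 in the denominator avoids division by zero when docFreq is 0.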

coord

public float coord(int overlap,
                   int maxOverlap)
Implemented as overlap / maxOverlap.

Specified by:
coord in class Similarity
Parameters:
overlap - the number of query terms matched in the document
maxOverlap - the total number of terms in the query
Returns:
a score factor based on term overlap with the query
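
A minimal standalone sketch of this ratio follows (hypothetical class name, no Lucene calls). It assumes the division is done in floating point, since dividing two int values directly would truncate every partial match to 0.

```java
// Standalone sketch of overlap / maxOverlap; not the Lucene implementation.
final class CoordSketch {

    static float coord(int overlap, int maxOverlap) {
        // Cast one operand so the division is not truncated.
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // A document matching 2 of 3 query terms scores lower than one matching all 3.
        System.out.println(coord(2, 3));
        System.out.println(coord(3, 3));
    }
}
```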

setDiscountOverlaps

public void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing norms. The default is true, meaning overlap tokens are not counted when computing norms.

See Also:
computeNorm(java.lang.String, org.apache.lucene.index.FieldInvertState)
WARNING: This API is experimental and might change in incompatible ways in the next release.

getDiscountOverlaps

public boolean getDiscountOverlaps()
Returns whether overlap tokens are ignored when computing norms.

See Also:
setDiscountOverlaps(boolean)