DefaultSimilarity (Lucene 3.6.0 API)

org.apache.lucene.search
Class DefaultSimilarity

java.lang.Object
  org.apache.lucene.search.Similarity
      org.apache.lucene.search.DefaultSimilarity

All Implemented Interfaces:: Serializable

Direct Known Subclasses:: SweetSpotSimilarity

public class DefaultSimilarity
extends Similarity

Expert: Default scoring implementation.

See Also:: Serialized Form

Field Summary
`protected boolean`	`discountOverlaps`

Fields inherited from class org.apache.lucene.search.Similarity
`NO_DOC_ID_PROVIDED`

Constructor Summary
`DefaultSimilarity()`

Method Summary
`float`	`computeNorm(String field, FieldInvertState state)` Implemented as `state.getBoost()*lengthNorm(numTerms)`, where `numTerms` is `FieldInvertState.getLength()` if overlap discounting is disabled (see `setDiscountOverlaps(boolean)`), and `FieldInvertState.getLength()` - `FieldInvertState.getNumOverlap()` otherwise.
`float`	`coord(int overlap, int maxOverlap)` Implemented as `overlap / maxOverlap`.
`boolean`	`getDiscountOverlaps()`
`float`	`idf(int docFreq, int numDocs)` Implemented as `log(numDocs/(docFreq+1)) + 1`.
`float`	`queryNorm(float sumOfSquaredWeights)` Implemented as `1/sqrt(sumOfSquaredWeights)`.
`void`	`setDiscountOverlaps(boolean v)` Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm.
`float`	`sloppyFreq(int distance)` Implemented as `1 / (distance + 1)`.
`float`	`tf(float freq)` Implemented as `sqrt(freq)`.

Methods inherited from class org.apache.lucene.search.Similarity
`decodeNorm, decodeNormValue, encodeNorm, encodeNormValue, getDefault, getNormDecoder, idfExplain, idfExplain, idfExplain, lengthNorm, scorePayload, setDefault, tf`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

discountOverlaps

protected boolean discountOverlaps

Constructor Detail

DefaultSimilarity

public DefaultSimilarity()

Method Detail

computeNorm

public float computeNorm(String field,
                         FieldInvertState state)

Implemented as state.getBoost()*lengthNorm(numTerms). Here numTerms is FieldInvertState.getLength() if overlap discounting is disabled (see setDiscountOverlaps(boolean)); otherwise it is FieldInvertState.getLength() - FieldInvertState.getNumOverlap().

Specified by:: computeNorm in class Similarity

Parameters:: field - field name; state - current processing state for this field
Returns:: the calculated float norm
WARNING: This API is experimental and might change in incompatible ways in the next release.
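
The norm formula above can be sketched in plain Java. This is an illustrative re-implementation, not the Lucene source: the class name `NormSketch` is hypothetical, the parameters stand in for the values read from `FieldInvertState`, and `lengthNorm` is assumed to be `1/sqrt(numTerms)`, which this page does not state.

```java
/** Illustrative re-implementation of the documented norm formula; not the Lucene source. */
final class NormSketch {

    // Assumption: lengthNorm(numTerms) = 1/sqrt(numTerms). This page does not define lengthNorm.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // boost, length and numOverlap stand in for state.getBoost(), state.getLength()
    // and state.getNumOverlap().
    static float computeNorm(float boost, int length, int numOverlap, boolean discountOverlaps) {
        // With discounting enabled, overlap tokens are subtracted from the field length.
        int numTerms = discountOverlaps ? length - numOverlap : length;
        return boost * lengthNorm(numTerms);
    }
}
```

Under these assumptions, a field with 6 tokens, 2 of them overlaps, gets a larger norm when overlaps are discounted, because the effective length drops from 6 to 4.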

queryNorm

public float queryNorm(float sumOfSquaredWeights)

Implemented as 1/sqrt(sumOfSquaredWeights).

Specified by:: queryNorm in class Similarity

Parameters:: sumOfSquaredWeights - the sum of the squares of query term weights
Returns:: a normalization factor for query weights
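
A minimal sketch of the normalization above, written without any Lucene dependency. The class name `QueryNormSketch` and the helper `sumOfSquares` are hypothetical and exist only for illustration.

```java
/** Illustrative sketch of 1/sqrt(sumOfSquaredWeights); not the Lucene source. */
final class QueryNormSketch {

    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Hypothetical helper: sums the squares of the raw query term weights.
    static float sumOfSquares(float[] weights) {
        float sum = 0f;
        for (float w : weights) {
            sum += w * w;
        }
        return sum;
    }
}
```

For raw weights 3 and 4 the sum of squares is 25, so the factor is 0.2. Multiplying each weight by that factor gives 0.6 and 0.8, whose squares sum to 1.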

tf

public float tf(float freq)

Implemented as sqrt(freq).

Specified by:: tf in class Similarity

Parameters:: freq - the frequency of a term within a document
Returns:: a score factor based on a term's within-document frequency
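
The square root makes the factor grow more slowly than the raw frequency. A small stand-alone sketch (the class name `TfSketch` is hypothetical):

```java
/** Illustrative sketch of tf(freq) = sqrt(freq); not the Lucene source. */
final class TfSketch {

    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}
```

Here quadrupling the frequency from 4 to 16 only doubles the factor, from 2 to 4.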

sloppyFreq

public float sloppyFreq(int distance)

Implemented as 1 / (distance + 1).

Specified by:: sloppyFreq in class Similarity

Parameters:: distance - the edit distance of this sloppy phrase match
Returns:: the frequency increment for this match
See Also:: PhraseQuery.setSlop(int)
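
A minimal sketch of the formula above. The division is done in floating point here, since integer division would yield 0 for every distance above 0. The helper `phraseFreq`, which sums the increments of several matches, is a hypothetical illustration of how such increments could add up to a fractional frequency; it is not described on this page.

```java
/** Illustrative sketch of sloppyFreq(distance) = 1 / (distance + 1); not the Lucene source. */
final class SloppyFreqSketch {

    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Hypothetical helper: accumulates the increments of several sloppy matches.
    static float phraseFreq(int[] distances) {
        float freq = 0f;
        for (int d : distances) {
            freq += sloppyFreq(d);
        }
        return freq;
    }
}
```

An exact match (distance 0) contributes 1, while matches at distances 1 and 3 contribute 0.5 and 0.25.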

idf

public float idf(int docFreq,
                 int numDocs)

Implemented as log(numDocs/(docFreq+1)) + 1.

Specified by:: idf in class Similarity

Parameters:: docFreq - the number of documents which contain the term; numDocs - the total number of documents in the collection
Returns:: a score factor based on the term's document frequency
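
The formula above can be sketched as follows. Two assumptions are made that this page does not spell out: `log` is the natural logarithm (`Math.log`), and the division is carried out in floating point. The class name `IdfSketch` is hypothetical.

```java
/** Illustrative sketch of idf = log(numDocs/(docFreq+1)) + 1; not the Lucene source. */
final class IdfSketch {

    static float idf(int docFreq, int numDocs) {
        // Assumptions: natural logarithm and floating-point division.
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}
```

Under these assumptions a term that appears in few documents scores higher than a common one, and a term with `docFreq + 1 == numDocs` scores exactly 1.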

coord

public float coord(int overlap,
                   int maxOverlap)

Implemented as overlap / maxOverlap.

Specified by:: coord in class Similarity

Parameters:: overlap - the number of query terms matched in the document; maxOverlap - the total number of terms in the query
Returns:: a score factor based on term overlap with the query
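
A minimal sketch of the ratio above, assuming the division is evaluated in floating point. With two `int` operands, plain Java integer division would return 0 whenever only part of the query matches. The class name `CoordSketch` is hypothetical.

```java
/** Illustrative sketch of coord = overlap / maxOverlap; not the Lucene source. */
final class CoordSketch {

    static float coord(int overlap, int maxOverlap) {
        // Cast before dividing so that 1 of 4 terms yields 0.25 rather than 0.
        return overlap / (float) maxOverlap;
    }
}
```

A document that matches all query terms gets a factor of 1, and one that matches 1 of 4 terms gets 0.25.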

setDiscountOverlaps

public void setDiscountOverlaps(boolean v)

Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. The default is true, so overlap tokens are not counted when computing norms.

See Also:: computeNorm(java.lang.String, org.apache.lucene.index.FieldInvertState)
WARNING: This API is experimental and might change in incompatible ways in the next release.
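
To illustrate what counts as an overlap token, the sketch below takes the position increments of a field's tokens and derives the term count with and without discounting. It is a stand-alone illustration: the class `OverlapSketch` and its methods are hypothetical and do not use the Lucene token stream API.

```java
/** Illustrative sketch of overlap discounting; not the Lucene source. */
final class OverlapSketch {

    // Counts tokens whose position increment is 0, for example a synonym
    // stacked on the same position as the original token.
    static int numOverlap(int[] positionIncrements) {
        int overlaps = 0;
        for (int inc : positionIncrements) {
            if (inc == 0) {
                overlaps++;
            }
        }
        return overlaps;
    }

    // Term count that would feed the norm computation.
    static int numTerms(int[] positionIncrements, boolean discountOverlaps) {
        int length = positionIncrements.length;
        return discountOverlaps ? length - numOverlap(positionIncrements) : length;
    }
}
```

For the increments `{1, 0, 1}`, such as "quick" followed by a synonym "fast" at the same position and then "fox", the count is 2 with discounting and 3 without.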

getDiscountOverlaps

public boolean getDiscountOverlaps()

See Also:: setDiscountOverlaps(boolean)