FuzzyQuery (Lucene 3.6.0 API)

org.apache.lucene.search
Class FuzzyQuery

java.lang.Object
  org.apache.lucene.search.Query
      org.apache.lucene.search.MultiTermQuery
          org.apache.lucene.search.FuzzyQuery

All Implemented Interfaces: Serializable, Cloneable

public class FuzzyQuery
extends MultiTermQuery

Implements the fuzzy search query. The similarity measurement is based on the Levenshtein (edit distance) algorithm.
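As a rough illustration of the edit distance that the similarity measurement is based on, the following is a minimal, self-contained sketch of the classic two-row Levenshtein computation. The class and method names (`EditDistanceSketch`, `levenshtein`) are hypothetical and are not part of the Lucene API; Lucene's own implementation may differ in its details.

```java
// Illustrative sketch only: NOT Lucene's internal implementation.
// EditDistanceSketch and levenshtein are hypothetical names.
final class EditDistanceSketch {

    /** Returns the number of single-character insertions, deletions
     *  and substitutions needed to turn a into b. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty string: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("lucene", "lucent"));  // prints 1
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```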

Warning: this query does not scale well with its default prefix length of 0. In that case, *every* term is enumerated and an edit distance is computed for each one.

This query uses MultiTermQuery.TopTermsScoringBooleanQueryRewrite by default, so terms are collected and scored according to their edit distance, and only the top-scoring terms are used to build the BooleanQuery. Changing the rewrite mode for fuzzy queries is not recommended.
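The idea of keeping only the top terms can be sketched in plain Java with a bounded min-heap. This is a simplified model under stated assumptions, not the actual rewrite code: the class `TopTermsSketch`, the method `topTerms`, and the precomputed score map are all hypothetical. The `Math.min` clamp mirrors the documented behavior that maxClauseCount is used when `maxExpansions` exceeds it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Illustrative sketch only: hypothetical names, not the Lucene rewrite implementation.
final class TopTermsSketch {

    /** Returns the highest-scoring terms, best first, keeping at most
     *  min(maxExpansions, maxClauseCount) of them. */
    static List<String> topTerms(Map<String, Float> scored, int maxExpansions, int maxClauseCount) {
        int limit = Math.min(maxExpansions, maxClauseCount);
        // Min-heap on score: the weakest retained term sits at the head.
        PriorityQueue<Map.Entry<String, Float>> heap =
                new PriorityQueue<>((x, y) -> Float.compare(x.getValue(), y.getValue()));
        for (Map.Entry<String, Float> entry : scored.entrySet()) {
            heap.offer(entry);
            if (heap.size() > limit) {
                heap.poll(); // evict the current lowest-scoring term
            }
        }
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey()); // ascending by score
        }
        Collections.reverse(result); // best first
        return result;
    }

    public static void main(String[] args) {
        Map<String, Float> scored = new LinkedHashMap<>();
        scored.put("luke", 0.55f);
        scored.put("lucene", 1.0f);
        scored.put("lucent", 0.83f);
        scored.put("lucerne", 0.7f);
        System.out.println(topTerms(scored, Integer.MAX_VALUE, 2)); // prints [lucene, lucent]
    }
}
```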

See Also: Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.search.MultiTermQuery
`MultiTermQuery.ConstantScoreAutoRewrite, MultiTermQuery.RewriteMethod, MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite, MultiTermQuery.TopTermsScoringBooleanQueryRewrite`

Field Summary
`static int`	`defaultMaxExpansions`
`static float`	`defaultMinSimilarity`
`static int`	`defaultPrefixLength`
`protected Term`	`term`

Fields inherited from class org.apache.lucene.search.MultiTermQuery
`CONSTANT_SCORE_AUTO_REWRITE_DEFAULT, CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE, CONSTANT_SCORE_FILTER_REWRITE, rewriteMethod, SCORING_BOOLEAN_QUERY_REWRITE`

Constructor Summary
`FuzzyQuery(Term term)` Calls `FuzzyQuery(term, 0.5f, 0, Integer.MAX_VALUE)`.
`FuzzyQuery(Term term, float minimumSimilarity)` Calls `FuzzyQuery(term, minimumSimilarity, 0, Integer.MAX_VALUE)`.
`FuzzyQuery(Term term, float minimumSimilarity, int prefixLength)` Calls `FuzzyQuery(term, minimumSimilarity, prefixLength, Integer.MAX_VALUE)`.
`FuzzyQuery(Term term, float minimumSimilarity, int prefixLength, int maxExpansions)` Create a new FuzzyQuery that will match terms with a similarity of at least `minimumSimilarity` to `term`.

Method Summary
`boolean`	`equals(Object obj)`
`protected FilteredTermEnum`	`getEnum(IndexReader reader)` Construct the enumeration to be used, expanding the pattern term.
`float`	`getMinSimilarity()` Returns the minimum similarity that is required for this query to match.
`int`	`getPrefixLength()` Returns the non-fuzzy prefix length.
`Term`	`getTerm()` Returns the pattern term.
`int`	`hashCode()`
`String`	`toString(String field)` Prints a query to a string, with `field` assumed to be the default field and omitted.

Methods inherited from class org.apache.lucene.search.MultiTermQuery
`clearTotalNumberOfTerms, getRewriteMethod, getTotalNumberOfTerms, incTotalNumberOfTerms, rewrite, setRewriteMethod`

Methods inherited from class org.apache.lucene.search.Query
`clone, combine, createWeight, extractTerms, getBoost, getSimilarity, mergeBooleanQueries, setBoost, toString, weight`

Methods inherited from class java.lang.Object
`finalize, getClass, notify, notifyAll, wait, wait, wait`

Field Detail

defaultMinSimilarity

public static final float defaultMinSimilarity

See Also: Constant Field Values

defaultPrefixLength

public static final int defaultPrefixLength

See Also: Constant Field Values

defaultMaxExpansions

public static final int defaultMaxExpansions

See Also: Constant Field Values

term

protected Term term

Constructor Detail

FuzzyQuery

public FuzzyQuery(Term term,
                  float minimumSimilarity,
                  int prefixLength,
                  int maxExpansions)

Create a new FuzzyQuery that will match terms with a similarity of at least minimumSimilarity to term. If a prefixLength > 0 is specified, a common prefix of that length is also required.

Parameters:
    term - the term to search for
    minimumSimilarity - a value between 0 and 1 that sets the required similarity between the query term and the matching terms. For example, with a minimumSimilarity of 0.5, a term of the same length as the query term is considered similar to it if the edit distance between the two terms is less than length(term)*0.5
    prefixLength - length of the common (non-fuzzy) prefix
    maxExpansions - the maximum number of terms to match. If this number is greater than BooleanQuery.getMaxClauseCount() when the query is rewritten, then the maxClauseCount will be used instead.
Throws:
    IllegalArgumentException - if minimumSimilarity is >= 1 or < 0, or if prefixLength < 0
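The minimumSimilarity rule can be worked through with a small standalone sketch. It assumes a similarity of `1 - distance / min(length)`, which is consistent with the documented example (at 0.5, equal-length terms match when the edit distance is less than half the length) but is an assumption about the exact formula. All names here (`MinSimilaritySketch`, `similarity`, `matches`) are hypothetical, and the handling of empty strings is a choice made for this sketch only.

```java
// Illustrative sketch only: hypothetical names, assumed similarity formula.
final class MinSimilaritySketch {

    /** Plain Levenshtein distance (two-row dynamic programming). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed formula: 1 - distance / length of the shorter term. */
    static float similarity(String query, String candidate) {
        int shorter = Math.min(query.length(), candidate.length());
        if (shorter == 0) {
            // Sketch-only choice to avoid dividing by zero.
            return query.equals(candidate) ? 1.0f : 0.0f;
        }
        return 1.0f - (float) levenshtein(query, candidate) / shorter;
    }

    /** Validates minimumSimilarity as documented, then applies the threshold. */
    static boolean matches(String query, String candidate, float minimumSimilarity) {
        if (minimumSimilarity >= 1.0f || minimumSimilarity < 0.0f) {
            throw new IllegalArgumentException("minimumSimilarity must be >= 0 and < 1");
        }
        return similarity(query, candidate) > minimumSimilarity;
    }

    public static void main(String[] args) {
        // Length 6, distance 2: 2 < 6 * 0.5, so the candidate matches.
        System.out.println(matches("abcdef", "abcdxy", 0.5f)); // prints true
        // Length 6, distance 3: 3 is not less than 6 * 0.5, so it does not.
        System.out.println(matches("abcdef", "abcxyz", 0.5f)); // prints false
    }
}
```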

FuzzyQuery

public FuzzyQuery(Term term,
                  float minimumSimilarity,
                  int prefixLength)

Calls FuzzyQuery(term, minimumSimilarity, prefixLength, Integer.MAX_VALUE).

FuzzyQuery

public FuzzyQuery(Term term,
                  float minimumSimilarity)

Calls FuzzyQuery(term, minimumSimilarity, 0, Integer.MAX_VALUE).

FuzzyQuery

public FuzzyQuery(Term term)

Calls FuzzyQuery(term, 0.5f, 0, Integer.MAX_VALUE).

Method Detail

getMinSimilarity

public float getMinSimilarity()

Returns the minimum similarity that is required for this query to match.

Returns: float value between 0.0 and 1.0

getPrefixLength

public int getPrefixLength()

Returns the non-fuzzy prefix length. This is the number of characters at the start of a term that must be identical (not fuzzy) to the query term if the query is to match that term.
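The prefix requirement amounts to an exact comparison of the leading characters, which is what lets a non-zero prefix length rule out most candidate terms before any edit distance is computed. The sketch below illustrates that check in isolation. `PrefixFilterSketch` and `sharesPrefix` are hypothetical names, and clamping the prefix length to the query term's length is an assumption of this sketch.

```java
// Illustrative sketch only: hypothetical names, not part of the Lucene API.
final class PrefixFilterSketch {

    /** Returns true if candidate starts with the first prefixLength
     *  characters of queryTerm (compared exactly, not fuzzily). */
    static boolean sharesPrefix(String queryTerm, String candidate, int prefixLength) {
        if (prefixLength < 0) {
            throw new IllegalArgumentException("prefixLength must be >= 0");
        }
        // Assumption: a prefix longer than the query term is clamped to its length.
        int effective = Math.min(prefixLength, queryTerm.length());
        return candidate.startsWith(queryTerm.substring(0, effective));
    }

    public static void main(String[] args) {
        System.out.println(sharesPrefix("lucene", "lucent", 3)); // prints true
        System.out.println(sharesPrefix("lucene", "pucene", 1)); // prints false
        System.out.println(sharesPrefix("lucene", "pucene", 0)); // prints true
    }
}
```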

getEnum

protected FilteredTermEnum getEnum(IndexReader reader)
                            throws IOException

Description copied from class: MultiTermQuery

Construct the enumeration to be used, expanding the pattern term.

Specified by: getEnum in class MultiTermQuery

Throws: IOException

getTerm

public Term getTerm()

Returns the pattern term.

toString

public String toString(String field)

Description copied from class: Query

Prints a query to a string, with field assumed to be the default field and omitted.

The representation used is one that is supposed to be readable by QueryParser. However, there are the following limitations:

If the query was created by the parser, the printed representation may not be exactly what was parsed. For example, characters that need to be escaped will be represented without the required backslash.
Some of the more complicated queries (e.g. span queries) don't have a representation that can be parsed by QueryParser.

Specified by: toString in class Query

hashCode

public int hashCode()

Overrides: hashCode in class MultiTermQuery

equals

public boolean equals(Object obj)

Overrides: equals in class MultiTermQuery
