org.apache.lucene.queryParser.complexPhrase
Class ComplexPhraseQueryParser

java.lang.Object
  extended by org.apache.lucene.queryParser.QueryParser
      extended by org.apache.lucene.queryParser.complexPhrase.ComplexPhraseQueryParser
All Implemented Interfaces:
QueryParserConstants

public class ComplexPhraseQueryParser
extends QueryParser

A QueryParser that permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*".

Performs potentially multiple passes over the query text to parse any nested logic in PhraseQueries.
- The first pass takes any PhraseQuery content between quotes and stores it for the subsequent pass. All other query content is parsed as normal.
- The second pass parses any stored PhraseQuery content, checking that all embedded clauses refer to the same field and can therefore be rewritten as Span queries. All PhraseQuery clauses are expressed as ComplexPhraseQuery objects.

This could arguably be done in one pass using a new QueryParser, but here I am working within the constraints of the existing parser as a base class. Currently, all phrase content is simply fed through an analyzer to select phrase terms; any "special" syntax such as * or ~ is not given special status.
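The two passes described above can be illustrated with a small, library-free sketch. This is a simplified illustration under stated assumptions, not the parser's actual implementation: the class and method names (TwoPassSketch, extractPhrases, usesSingleField) are hypothetical, escaped quotes are ignored, and the field check only looks for explicit "field:" prefixes.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Simplified illustration of the two-pass idea. All names are hypothetical;
 * this is not how ComplexPhraseQueryParser is implemented internally.
 */
public class TwoPassSketch {

    /** Pass 1: collect the content found between each pair of double quotes. */
    public static List<String> extractPhrases(String queryText) {
        List<String> phrases = new ArrayList<>();
        int open = -1; // index of the currently open quote, or -1 if none
        for (int i = 0; i < queryText.length(); i++) {
            if (queryText.charAt(i) != '"') {
                continue;
            }
            if (open < 0) {
                open = i; // opening quote (escaped quotes are not handled here)
            } else {
                phrases.add(queryText.substring(open + 1, i));
                open = -1; // closing quote
            }
        }
        return phrases;
    }

    /**
     * Pass 2 check: every explicit "field:" prefix inside the stored phrase
     * must name the phrase's own field. Tokens without a prefix are accepted.
     */
    public static boolean usesSingleField(String phraseContent, String field) {
        for (String token : phraseContent.split("[\\s()]+")) {
            int colon = token.indexOf(':');
            if (colon > 0 && !token.substring(0, colon).equals(field)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String query = "title:x AND \"(john jon jonathan~) peters*\"";
        for (String phrase : extractPhrases(query)) {
            System.out.println(phrase + " -> single field: "
                    + usesSingleField(phrase, "name"));
        }
    }
}
```

Storing the phrase content first and validating it later mirrors the ordering described above: the outer query is handled as usual, and only the stored phrase content needs the same-field check before it can be treated as a span-style query.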


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.queryParser.QueryParser
QueryParser.Operator
 
Field Summary
 
Fields inherited from class org.apache.lucene.queryParser.QueryParser
AND_OPERATOR, jj_nt, OR_OPERATOR, token, token_source
 
Fields inherited from interface org.apache.lucene.queryParser.QueryParserConstants
_ESCAPED_CHAR, _NUM_CHAR, _QUOTED_CHAR, _TERM_CHAR, _TERM_START_CHAR, _WHITESPACE, AND, Boost, CARAT, COLON, DEFAULT, EOF, FUZZY_SLOP, LPAREN, MINUS, NOT, NUMBER, OR, PLUS, PREFIXTERM, QUOTED, RangeEx, RANGEEX_END, RANGEEX_GOOP, RANGEEX_QUOTED, RANGEEX_START, RANGEEX_TO, RangeIn, RANGEIN_END, RANGEIN_GOOP, RANGEIN_QUOTED, RANGEIN_START, RANGEIN_TO, RPAREN, STAR, TERM, tokenImage, WILDTERM
 
Constructor Summary
ComplexPhraseQueryParser(Version matchVersion, String f, Analyzer a)
           
 
Method Summary
protected  Query getFieldQuery(String field, String queryText, int slop)
          Base implementation delegates to QueryParser.getFieldQuery(String,String,boolean).
protected  Query getFuzzyQuery(String field, String termStr, float minSimilarity)
          Factory method for generating a query (similar to QueryParser.getWildcardQuery(java.lang.String, java.lang.String)).
protected  Query getRangeQuery(String field, String part1, String part2, boolean inclusive)
           
protected  Query getWildcardQuery(String field, String termStr)
          Factory method for generating a query.
protected  Query newRangeQuery(String field, String part1, String part2, boolean inclusive)
          Builds a new TermRangeQuery instance
protected  Query newTermQuery(Term term)
          Builds a new TermQuery instance
 Query parse(String query)
          Parses a query string, returning a Query.
 
Methods inherited from class org.apache.lucene.queryParser.QueryParser
addClause, Clause, Conjunction, disable_tracing, enable_tracing, escape, generateParseException, getAllowLeadingWildcard, getAnalyzer, getAutoGeneratePhraseQueries, getBooleanQuery, getBooleanQuery, getDateResolution, getDefaultOperator, getEnablePositionIncrements, getField, getFieldQuery, getFieldQuery, getFuzzyMinSim, getFuzzyPrefixLength, getLocale, getLowercaseExpandedTerms, getMultiTermRewriteMethod, getNextToken, getPhraseSlop, getPrefixQuery, getRangeCollator, getToken, main, Modifiers, newBooleanClause, newBooleanQuery, newFuzzyQuery, newMatchAllDocsQuery, newMultiPhraseQuery, newPhraseQuery, newPrefixQuery, newWildcardQuery, Query, ReInit, ReInit, setAllowLeadingWildcard, setAutoGeneratePhraseQueries, setDateResolution, setDateResolution, setDefaultOperator, setEnablePositionIncrements, setFuzzyMinSim, setFuzzyPrefixLength, setLocale, setLowercaseExpandedTerms, setMultiTermRewriteMethod, setPhraseSlop, setRangeCollator, Term, TopLevelQuery
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ComplexPhraseQueryParser

public ComplexPhraseQueryParser(Version matchVersion,
                                String f,
                                Analyzer a)
Method Detail

getFieldQuery

protected Query getFieldQuery(String field,
                              String queryText,
                              int slop)
Description copied from class: QueryParser
Base implementation delegates to QueryParser.getFieldQuery(String,String,boolean). This method may be overridden, for example, to return a SpanNearQuery instead of a PhraseQuery.

Overrides:
getFieldQuery in class QueryParser

parse

public Query parse(String query)
            throws ParseException
Description copied from class: QueryParser
Parses a query string, returning a Query.

Overrides:
parse in class QueryParser
Parameters:
query - the query string to be parsed.
Throws:
ParseException - if the parsing fails

newTermQuery

protected Query newTermQuery(Term term)
Description copied from class: QueryParser
Builds a new TermQuery instance

Overrides:
newTermQuery in class QueryParser
Parameters:
term - the term to build the query for
Returns:
new TermQuery instance

getWildcardQuery

protected Query getWildcardQuery(String field,
                                 String termStr)
                          throws ParseException
Description copied from class: QueryParser
Factory method for generating a query. Called when the parser parses an input term token that contains one or more wildcard characters (? and *), but is not a prefix term token (one that has just a single * character at the end).

Depending on settings, the term may be lower-cased automatically. It will not go through the default Analyzer, however, since normal Analyzers are unlikely to work properly with wildcard templates.

Can be overridden by extending classes to provide custom handling for wildcard queries, which may be necessary due to missing analyzer calls.

Overrides:
getWildcardQuery in class QueryParser
Parameters:
field - Name of the field query will use.
termStr - Term token that contains one or more wildcard characters (? or *), but is not a simple prefix term
Returns:
Resulting Query built for the term
Throws:
ParseException - may be thrown by an overriding method to disallow wildcard queries
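The distinction drawn above between a wildcard term and a prefix term can be sketched as a small classification routine. This is an illustration of the rule as worded in the description, assuming no escaped characters; the names TermKindSketch, Kind and classify are hypothetical and are not part of the QueryParser API, and a lone "*" is treated here as a wildcard term by assumption.

```java
/**
 * Illustrative classification of a term token, following the wording of the
 * getWildcardQuery description. Names are hypothetical, not Lucene API.
 */
public class TermKindSketch {

    public enum Kind { PLAIN, PREFIX, WILDCARD }

    public static Kind classify(String term) {
        int firstStar = term.indexOf('*');
        boolean hasQuestionMark = term.indexOf('?') >= 0;
        if (firstStar < 0 && !hasQuestionMark) {
            return Kind.PLAIN; // no wildcard characters at all
        }
        // Prefix term: exactly one '*', at the very end, and no '?' anywhere.
        // Assumption: at least one character must precede the '*'.
        boolean singleTrailingStar = term.length() > 1
                && firstStar == term.length() - 1
                && !hasQuestionMark;
        return singleTrailingStar ? Kind.PREFIX : Kind.WILDCARD;
    }

    public static void main(String[] args) {
        String[] samples = { "peters", "peters*", "pet?rs", "p*t*" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

Under this rule, only terms classified as WILDCARD would reach a method such as getWildcardQuery; PREFIX terms would be handled by the prefix query path instead.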

getRangeQuery

protected Query getRangeQuery(String field,
                              String part1,
                              String part2,
                              boolean inclusive)
                       throws ParseException
Overrides:
getRangeQuery in class QueryParser
Throws:
ParseException - may be thrown by an overriding method to disallow range queries

newRangeQuery

protected Query newRangeQuery(String field,
                              String part1,
                              String part2,
                              boolean inclusive)
Description copied from class: QueryParser
Builds a new TermRangeQuery instance

Overrides:
newRangeQuery in class QueryParser
Parameters:
field - the field name
part1 - lower bound of the range (min)
part2 - upper bound of the range (max)
inclusive - true if range is inclusive
Returns:
new TermRangeQuery instance

getFuzzyQuery

protected Query getFuzzyQuery(String field,
                              String termStr,
                              float minSimilarity)
                       throws ParseException
Description copied from class: QueryParser
Factory method for generating a query (similar to QueryParser.getWildcardQuery(java.lang.String, java.lang.String)). Called when the parser parses an input term token that has the fuzzy suffix (~) appended.

Overrides:
getFuzzyQuery in class QueryParser
Parameters:
field - Name of the field query will use.
termStr - Term token to use for building the term for the query
minSimilarity - minimum similarity required for a fuzzy match
Returns:
Resulting Query built for the term
Throws:
ParseException - may be thrown by an overriding method to disallow fuzzy queries