org.apache.lucene.search.highlight
Class WeightedSpanTermExtractor

java.lang.Object
  extended by org.apache.lucene.search.highlight.WeightedSpanTermExtractor

public class WeightedSpanTermExtractor
extends Object

Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.


Nested Class Summary
protected static class WeightedSpanTermExtractor.PositionCheckingMap<K>
          This class makes sure that if both position sensitive and insensitive versions of the same term are added, the position insensitive one wins.
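
The rule described above can be illustrated with a small, self-contained sketch. It is not the library's source: SketchTerm and PositionCheckingSketch are hypothetical stand-ins for WeightedSpanTerm and the nested map class, and the only assumption is the documented behaviour that the position-insensitive version of a term wins regardless of insertion order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WeightedSpanTerm; only the flag that matters here.
class SketchTerm {
    final String text;
    boolean positionSensitive;

    SketchTerm(String text, boolean positionSensitive) {
        this.text = text;
        this.positionSensitive = positionSensitive;
    }
}

// Sketch of a map in which a position-insensitive entry is never
// downgraded by a later position-sensitive entry for the same key.
class PositionCheckingSketch<K> extends HashMap<K, SketchTerm> {
    @Override
    public SketchTerm put(K key, SketchTerm value) {
        SketchTerm previous = super.put(key, value);
        if (previous != null && !previous.positionSensitive) {
            // The earlier entry matched at any position, so the new one must too.
            value.positionSensitive = false;
        }
        return previous;
    }

    @Override
    public void putAll(Map<? extends K, ? extends SketchTerm> other) {
        // Route every entry through put() so the rule above always applies.
        for (Map.Entry<? extends K, ? extends SketchTerm> e : other.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }
}
```

Adding a position-sensitive term after an insensitive one clears the flag on the new entry. Adding them in the opposite order simply replaces the sensitive entry, so the stored term is position insensitive in both cases.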
 
Constructor Summary
WeightedSpanTermExtractor()
           
WeightedSpanTermExtractor(String defaultField)
           
 
Method Summary
protected  void collectSpanQueryFields(SpanQuery spanQuery, Set<String> fieldNames)
           
protected  void extract(Query query, Map<String,WeightedSpanTerm> terms)
          Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
protected  void extractUnknownQuery(Query query, Map<String,WeightedSpanTerm> terms)
           
protected  void extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms, SpanQuery spanQuery)
          Fills a Map with WeightedSpanTerms using the terms from the supplied SpanQuery.
protected  void extractWeightedTerms(Map<String,WeightedSpanTerm> terms, Query query)
          Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
protected  boolean fieldNameComparator(String fieldNameToCheck)
          Necessary to implement matches for queries against defaultField
 boolean getExpandMultiTermQuery()
           
protected  IndexReader getReaderForField(String field)
           
 TokenStream getTokenStream()
           
 Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, TokenStream tokenStream)
          Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
 Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query, TokenStream tokenStream, String fieldName)
          Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
 Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query, TokenStream tokenStream, String fieldName, IndexReader reader)
          Creates a Map of WeightedSpanTerms from the given Query and TokenStream.
 boolean isCachedTokenStream()
           
protected  boolean mustRewriteQuery(SpanQuery spanQuery)
           
 void setExpandMultiTermQuery(boolean expandMultiTermQuery)
           
protected  void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)
           
 void setWrapIfNotCachingTokenFilter(boolean wrap)
          By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and do not want it to be wrapped, set this to false.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WeightedSpanTermExtractor

public WeightedSpanTermExtractor()

WeightedSpanTermExtractor

public WeightedSpanTermExtractor(String defaultField)

Method Detail

extract

protected void extract(Query query,
                       Map<String,WeightedSpanTerm> terms)
                throws IOException
Fills a Map with WeightedSpanTerms using the terms from the supplied Query.

Parameters:
query - Query to extract Terms from
terms - Map to place created WeightedSpanTerms in
Throws:
IOException

extractUnknownQuery

protected void extractUnknownQuery(Query query,
                                   Map<String,WeightedSpanTerm> terms)
                            throws IOException
Throws:
IOException

extractWeightedSpanTerms

protected void extractWeightedSpanTerms(Map<String,WeightedSpanTerm> terms,
                                        SpanQuery spanQuery)
                                 throws IOException
Fills a Map with WeightedSpanTerms using the terms from the supplied SpanQuery.

Parameters:
terms - Map to place created WeightedSpanTerms in
spanQuery - SpanQuery to extract Terms from
Throws:
IOException

extractWeightedTerms

protected void extractWeightedTerms(Map<String,WeightedSpanTerm> terms,
                                    Query query)
                             throws IOException
Fills a Map with WeightedSpanTerms using the terms from the supplied Query.

Parameters:
terms - Map to place created WeightedSpanTerms in
query - Query to extract Terms from
Throws:
IOException

fieldNameComparator

protected boolean fieldNameComparator(String fieldNameToCheck)
Necessary to implement matches for queries against defaultField
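
The kind of check this method performs can be sketched as follows. The exact logic is not given on this page, so the sketch assumes a field name is accepted when no field restriction is set, when it equals the field being highlighted, or when it equals defaultField. FieldFilterSketch and accepts are hypothetical names, not part of the library.

```java
// Hypothetical helper illustrating a field check that also honours a default field.
class FieldFilterSketch {
    private final String fieldName;    // field being highlighted; null means no restriction
    private final String defaultField; // field assumed for queries that named none; may be null

    FieldFilterSketch(String fieldName, String defaultField) {
        this.fieldName = fieldName;
        this.defaultField = defaultField;
    }

    // Assumed rule: accept everything when unrestricted, otherwise accept
    // the requested field and the default field.
    boolean accepts(String fieldNameToCheck) {
        return fieldName == null
            || fieldName.equals(fieldNameToCheck)
            || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }
}
```

Under these assumptions, a filter built for field "body" with default field "contents" accepts both names and rejects any other field.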


getReaderForField

protected IndexReader getReaderForField(String field)
                                 throws IOException
Throws:
IOException

getWeightedSpanTerms

public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query,
                                                         TokenStream tokenStream)
                                                  throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

Parameters:
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
Returns:
Map containing WeightedSpanTerms
Throws:
IOException

getWeightedSpanTerms

public Map<String,WeightedSpanTerm> getWeightedSpanTerms(Query query,
                                                         TokenStream tokenStream,
                                                         String fieldName)
                                                  throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

Parameters:
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
Returns:
Map containing WeightedSpanTerms
Throws:
IOException

getWeightedSpanTermsWithScores

public Map<String,WeightedSpanTerm> getWeightedSpanTermsWithScores(Query query,
                                                                   TokenStream tokenStream,
                                                                   String fieldName,
                                                                   IndexReader reader)
                                                            throws IOException
Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).

Parameters:
query - Query that caused the hit
tokenStream - TokenStream of the text to be highlighted
fieldName - restricts the Terms used based on field name
reader - IndexReader to use for scoring
Returns:
Map of WeightedSpanTerms with quasi tf/idf scores
Throws:
IOException
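
The "quasi tf/idf" weighting mentioned above can be approximated with a short sketch. The page does not state the formula, so this assumes the classic inverse document frequency form ln(numDocs / (docFreq + 1)) + 1 applied as a multiplier to each term's query weight. QuasiIdfSketch and its method names are hypothetical.

```java
// Hypothetical sketch of idf-style term weighting for gradient highlighting.
class QuasiIdfSketch {
    // Assumed formula: ln(numDocs / (docFreq + 1)) + 1.
    static float idf(int docFreq, int numDocs) {
        // A document frequency can include deleted documents, so cap it at numDocs.
        int df = Math.min(docFreq, numDocs);
        return (float) (Math.log(numDocs / (double) (df + 1)) + 1.0);
    }

    // Scales a term's query weight so that rarer terms receive larger weights.
    static float weight(float queryWeight, int docFreq, int numDocs) {
        return queryWeight * idf(docFreq, numDocs);
    }
}
```

With this form, a term that occurs in few documents receives a larger weight than a common one, which is what lets a gradient highlighter render rare terms more prominently.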

collectSpanQueryFields

protected void collectSpanQueryFields(SpanQuery spanQuery,
                                      Set<String> fieldNames)

mustRewriteQuery

protected boolean mustRewriteQuery(SpanQuery spanQuery)

getExpandMultiTermQuery

public boolean getExpandMultiTermQuery()

setExpandMultiTermQuery

public void setExpandMultiTermQuery(boolean expandMultiTermQuery)

isCachedTokenStream

public boolean isCachedTokenStream()

getTokenStream

public TokenStream getTokenStream()

setWrapIfNotCachingTokenFilter

public void setWrapIfNotCachingTokenFilter(boolean wrap)
By default, TokenStreams that are not of the type CachingTokenFilter are wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and do not want it to be wrapped, set this to false.

Parameters:
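wrap - true to wrap TokenStreams that are not a CachingTokenFilter in a CachingTokenFilter; false to leave them unwrapped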
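
Why a caching wrapper makes reset cheap can be shown with a generic analogy. This is not Lucene's CachingTokenFilter: CachingIteratorSketch is a hypothetical class that records items from a one-shot source the first time they are read, so that later passes replay them from memory instead of re-reading the source.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical analogy: wraps a one-shot iterator so it can be replayed.
class CachingIteratorSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final List<T> cache = new ArrayList<>();
    private int position = 0;

    CachingIteratorSketch(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return position < cache.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (position < cache.size()) {
            // Already seen: replay from the cache.
            return cache.get(position++);
        }
        // First visit: pull from the source and remember the item.
        T item = source.next();
        cache.add(item);
        position++;
        return item;
    }

    // Rewinds to the start without touching the underlying source.
    void reset() {
        position = 0;
    }
}
```

A stream that already caches its own tokens gains nothing from a second layer like this, which is the case the wrap flag lets callers opt out of.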

setMaxDocCharsToAnalyze

protected final void setMaxDocCharsToAnalyze(int maxDocCharsToAnalyze)