org.apache.nutch.microformats.reltag
Class RelTagIndexingFilter

java.lang.Object
  extended by org.apache.nutch.microformats.reltag.RelTagIndexingFilter
All Implemented Interfaces:
Configurable, IndexingFilter, FieldPluggable, Pluggable

public class RelTagIndexingFilter
extends Object
implements IndexingFilter

An IndexingFilter that adds tag field(s) to the document.

Author:
Jérôme Charron
See Also:
http://www.microformats.org/wiki/rel-tag
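
Example (illustrative sketch, not taken from the Nutch sources): under the rel-tag convention referenced above, the tag carried by a rel="tag" link is the last path segment of its href. The class name RelTagSketch and the method tagFromHref are hypothetical names introduced for this example; how RelTagIndexingFilter actually obtains its tag values is not documented on this page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

// Hypothetical helper, not part of Nutch: derives a tag from a rel="tag" href.
final class RelTagSketch {

    /** Returns the decoded last path segment of href, or null if there is none. */
    static String tagFromHref(String href) {
        try {
            String path = new URI(href).getRawPath();
            if (path == null) {
                return null; // opaque URI such as mailto:
            }
            // Ignore trailing slashes so ".../tag/tech/" still yields "tech".
            int end = path.length();
            while (end > 0 && path.charAt(end - 1) == '/') {
                end--;
            }
            if (end == 0) {
                return null; // empty path or "/" only
            }
            int start = path.lastIndexOf('/', end - 1) + 1;
            // URLDecoder turns %XX escapes into characters and '+' into a space.
            return URLDecoder.decode(path.substring(start, end), "UTF-8");
        } catch (URISyntaxException | UnsupportedEncodingException
                 | IllegalArgumentException e) {
            return null; // unparsable href: no tag can be derived
        }
    }

    public static void main(String[] args) {
        System.out.println(tagFromHref("http://technorati.com/tag/tech"));
        System.out.println(tagFromHref("http://example.com/tags/open+source/"));
    }
}
```

A filter in this style would presumably add each derived value to the document under a tag field; the exact field name and extraction rules used by this class are assumptions here.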

Field Summary
 
Fields inherited from interface org.apache.nutch.indexer.IndexingFilter
X_POINT_ID
 
Constructor Summary
RelTagIndexingFilter()
           
 
Method Summary
 NutchDocument filter(NutchDocument doc, String url, WebPage page)
          Adds fields or otherwise modifies the document that will be indexed for a parse.
 Configuration getConf()
           
 Collection<WebPage.Field> getFields()
           
 void setConf(Configuration conf)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RelTagIndexingFilter

public RelTagIndexingFilter()
Method Detail

getFields

public Collection<WebPage.Field> getFields()
Specified by:
getFields in interface FieldPluggable

setConf

public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Specified by:
getConf in interface Configurable

filter

public NutchDocument filter(NutchDocument doc,
                            String url,
                            WebPage page)
                     throws IndexingException
Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse. Unwanted documents can be removed from indexing by returning a null value.

Specified by:
filter in interface IndexingFilter
Parameters:
doc - document instance for collecting fields
url - page url
page - page data (WebPage) for the url
Returns:
modified (or a new) document instance, or null (meaning the document should be discarded)
Throws:
IndexingException
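
Example (illustrative sketch of the return-value contract described above): a filter either returns a document to keep indexing it, or returns null to discard it. The types Doc and Filter below are simplified, hypothetical stand-ins for NutchDocument and IndexingFilter so that the example compiles without Nutch on the classpath; the list of tags stands in for data that would normally come from the WebPage. The field name "tag" and the DROP_UNTAGGED policy are assumptions made for illustration; RelTagIndexingFilter is not documented to discard documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FilterContractSketch {

    /** Hypothetical stand-in for NutchDocument: field name to list of values. */
    static final class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Hypothetical stand-in for IndexingFilter.filter(doc, url, page). */
    interface Filter {
        Doc filter(Doc doc, String url, List<String> tags);
    }

    /** Adds one "tag" value per tag and returns the same document. */
    static final Filter ADD_TAGS = (doc, url, tags) -> {
        for (String tag : tags) {
            doc.add("tag", tag);
        }
        return doc;
    };

    /** Illustrative policy only: returns null to discard untagged documents. */
    static final Filter DROP_UNTAGGED = (doc, url, tags) -> tags.isEmpty() ? null : doc;

    /** Applies filters in order and stops as soon as one returns null. */
    static Doc runChain(List<Filter> chain, Doc doc, String url, List<String> tags) {
        for (Filter f : chain) {
            doc = f.filter(doc, url, tags);
            if (doc == null) {
                return null; // discarded: later filters are not called
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> chain = Arrays.asList(DROP_UNTAGGED, ADD_TAGS);
        Doc kept = runChain(chain, new Doc(), "http://example.com/a", Arrays.asList("tech"));
        Doc dropped = runChain(chain, new Doc(), "http://example.com/b", new ArrayList<>());
        System.out.println(kept == null ? "discarded" : kept.fields.toString());
        System.out.println(dropped == null ? "discarded" : dropped.fields.toString());
    }
}
```

Callers of a real filter chain should likewise check for null before using the returned document.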


Copyright © 2012 The Apache Software Foundation