org.apache.nutch.microformats.reltag
Class RelTagIndexingFilter
java.lang.Object
org.apache.nutch.microformats.reltag.RelTagIndexingFilter
- All Implemented Interfaces:
- Configurable, IndexingFilter, FieldPluggable, Pluggable
public class RelTagIndexingFilter
- extends Object
- implements IndexingFilter
An IndexingFilter
that adds tag
field(s) to the document.
- Author:
- Jérôme Charron
- See Also:
-
http://www.microformats.org/wiki/rel-tag
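As an illustration of where tag values come from, the rel-tag specification linked above defines the tag as the last path segment of the link's href. The following is a minimal sketch of that rule only, not the plugin's actual parsing code; the class name `RelTagSketch` and the method `tagFromHref` are hypothetical, and the `URLDecoder.decode(String, Charset)` overload assumes Java 10 or later.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the Nutch API.
public class RelTagSketch {

    /**
     * Returns the tag named by a rel-tag href (its last path segment,
     * URL-decoded), or null when the href has no usable path segment.
     */
    static String tagFromHref(String href) {
        String path = URI.create(href).getRawPath();
        if (path == null) {
            return null; // opaque URI such as mailto:
        }
        // Ignore trailing slashes so ".../tag/tech/" still yields "tech".
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--;
        }
        if (end == 0) {
            return null; // empty path or "/" only
        }
        int start = path.lastIndexOf('/', end - 1) + 1;
        String segment = path.substring(start, end);
        // Decodes %XX escapes and treats '+' as a space.
        return URLDecoder.decode(segment, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(tagFromHref("http://technorati.com/tag/tech"));
        System.out.println(tagFromHref("http://example.com/tags/open+source/"));
    }
}
```

Each value obtained this way would correspond to one tag field on the indexed document.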
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RelTagIndexingFilter
public RelTagIndexingFilter()
getFields
public Collection<WebPage.Field> getFields()
- Specified by:
getFields
in interface FieldPluggable
setConf
public void setConf(Configuration conf)
- Specified by:
setConf
in interface Configurable
getConf
public Configuration getConf()
- Specified by:
getConf
in interface Configurable
filter
public NutchDocument filter(NutchDocument doc,
String url,
WebPage page)
throws IndexingException
- Description copied from interface:
IndexingFilter
- Adds fields or otherwise modifies the document that will be indexed for a
parse. Unwanted documents can be removed from indexing by returning a null value.
- Specified by:
filter
in interface IndexingFilter
- Parameters:
doc
- document instance for collecting fields
url
- page url
- Returns:
- modified (or a new) document instance, or null (meaning the document
should be discarded)
- Throws:
IndexingException
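The return contract described for filter (a modified document, or null to discard it) can be sketched without any Nutch dependency. Everything in this example is a hypothetical stand-in: `Doc` replaces NutchDocument, `Filter` replaces IndexingFilter (the WebPage argument and IndexingException are omitted), the field name "tag" is assumed from the class description, and `runChain` only shows how a caller might honor a null result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; none of these types are part of the Nutch API.
public class FilterContractSketch {

    /** Stand-in for NutchDocument: field name mapped to its values. */
    static class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Stand-in for IndexingFilter: returns the document, or null to discard it. */
    interface Filter {
        Doc filter(Doc doc, String url);
    }

    /** A filter that adds one "tag" field value per tag (field name is an assumption). */
    static Filter tagger(List<String> tags) {
        return (doc, url) -> {
            for (String tag : tags) {
                doc.add("tag", tag);
            }
            return doc;
        };
    }

    /** A filter that discards any document whose URL does not start with "http". */
    static Filter httpOnly() {
        return (doc, url) -> url.startsWith("http") ? doc : null;
    }

    /** Applies filters in order and stops as soon as one returns null. */
    static Doc runChain(List<Filter> chain, Doc doc, String url) {
        for (Filter f : chain) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null; // document is dropped from indexing
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> chain = Arrays.asList(httpOnly(), tagger(Arrays.asList("tech", "nutch")));
        Doc kept = runChain(chain, new Doc(), "http://example.com/post");
        System.out.println(kept.fields);
        Doc dropped = runChain(chain, new Doc(), "ftp://example.com/file");
        System.out.println(dropped); // null: the first filter discarded it
    }
}
```

Returning the same instance after adding fields and returning a fresh instance are both allowed by the contract; callers should therefore always use the returned reference rather than the one they passed in.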
Copyright © 2012 The Apache Software Foundation