org.apache.nutch.parse
Interface Parser

All Superinterfaces:
Configurable, FieldPluggable, Pluggable
All Known Implementing Classes:
ExtParser, FeedParser, HtmlParser, JSParseFilter, SWFParser, TikaParser, ZipParser

public interface Parser
extends FieldPluggable, Configurable

A parser for content generated by a Protocol implementation. This interface is implemented by extensions. Nutch's core contains no page parsing code.


Field Summary
static String X_POINT_ID
          The name of the extension point.
 
Method Summary
 Parse getParse(String url, WebPage page)
           Parses the content held in the given WebPage instance.
 
Methods inherited from interface org.apache.nutch.plugin.FieldPluggable
getFields
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

X_POINT_ID

static final String X_POINT_ID
The name of the extension point.

Method Detail

getParse

Parse getParse(String url,
               WebPage page)

Parses the content held in the given WebPage instance.

Parameters:
url - The page's URL
page - The WebPage instance whose content is parsed
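
The getParse contract above can be illustrated with a minimal, self-contained sketch. It does not use the real Nutch classes: PageStub, ParseStub and ParserStub are hypothetical stand-ins for WebPage, Parse and Parser, and the stub only assumes that a page exposes its raw content as a ByteBuffer. TitleParser and its regex-based extraction are likewise illustrative only; a real plugin would implement org.apache.nutch.parse.Parser, including getFields, getConf and setConf.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParserSketch {

    // Hypothetical stand-in for WebPage: carries only the raw fetched bytes.
    static final class PageStub {
        private final ByteBuffer content;

        PageStub(String html) {
            this.content = ByteBuffer.wrap(html.getBytes(StandardCharsets.UTF_8));
        }

        // Assumption: content is exposed as a ByteBuffer; a read-only view
        // keeps repeated reads independent of each other.
        ByteBuffer getContent() {
            return content.asReadOnlyBuffer();
        }
    }

    // Hypothetical stand-in for Parse: just a title and the extracted text.
    static final class ParseStub {
        final String title;
        final String text;

        ParseStub(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    // Simplified mirror of the getParse(String url, WebPage page) signature.
    interface ParserStub {
        ParseStub getParse(String url, PageStub page);
    }

    // Toy implementation: pulls out <title> and strips tags with regexes.
    static final class TitleParser implements ParserStub {
        private static final Pattern TITLE = Pattern.compile(
                "<title[^>]*>(.*?)</title>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        private static final Pattern TAGS = Pattern.compile("<[^>]+>");

        @Override
        public ParseStub getParse(String url, PageStub page) {
            // Decode the page bytes; UTF-8 is assumed here for simplicity.
            String html = StandardCharsets.UTF_8.decode(page.getContent()).toString();

            Matcher m = TITLE.matcher(html);
            String title = m.find() ? m.group(1).trim() : "";

            // Replace every tag with a space, then collapse whitespace runs.
            String text = TAGS.matcher(html).replaceAll(" ")
                    .replaceAll("\\s+", " ")
                    .trim();
            return new ParseStub(title, text);
        }
    }

    public static void main(String[] args) {
        ParserStub parser = new TitleParser();
        PageStub page = new PageStub(
                "<html><head><title>Hello</title></head>"
                + "<body><p>Hi there</p></body></html>");
        ParseStub parse = parser.getParse("http://example.com/", page);
        System.out.println("title: " + parse.title);
        System.out.println("text:  " + parse.text);
    }
}
```

In this sketch the url argument is passed through unused; a real implementation would typically use it to resolve relative outlinks and to report errors against the page being parsed.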


Copyright © 2012 The Apache Software Foundation