org.apache.nutch.parse
Interface Parser
- All Superinterfaces:
- Configurable, FieldPluggable, Pluggable
- All Known Implementing Classes:
- ExtParser, FeedParser, HtmlParser, JSParseFilter, SWFParser, TikaParser, ZipParser
public interface Parser
- extends FieldPluggable, Configurable
A parser for content generated by a Protocol
implementation. This interface is implemented by extensions. Nutch's core
contains no page parsing code.
X_POINT_ID
static final String X_POINT_ID
- The name of the extension point.
getParse
Parse getParse(String url,
WebPage page)
Parses the content held in the given WebPage instance.
- Parameters:
url
- Page's URL
page
- WebPage instance whose content is parsed
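The shape of a getParse implementation can be sketched as follows. This is a minimal, self-contained illustration and not Nutch code: SimplePage, SimpleParse, SimpleParser and PlainTextParser are hypothetical stand-ins for WebPage, Parse and Parser, so that the example compiles without the Nutch libraries. A real extension would implement org.apache.nutch.parse.Parser itself, together with the methods inherited from FieldPluggable and Configurable.

```java
import java.nio.charset.StandardCharsets;

public class ParserSketch {

    // Hypothetical stand-in for Nutch's WebPage: holds only the raw content bytes.
    static final class SimplePage {
        final byte[] content;

        SimplePage(byte[] content) {
            this.content = content;
        }
    }

    // Hypothetical stand-in for Nutch's Parse result: a title and the extracted text.
    static final class SimpleParse {
        final String title;
        final String text;

        SimpleParse(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    // Mirrors the shape of Parser.getParse(String url, WebPage page).
    interface SimpleParser {
        SimpleParse getParse(String url, SimplePage page);
    }

    // Assumes the content is UTF-8 plain text and uses the first line as the title.
    static final class PlainTextParser implements SimpleParser {
        @Override
        public SimpleParse getParse(String url, SimplePage page) {
            String text = new String(page.content, StandardCharsets.UTF_8).trim();
            int newline = text.indexOf('\n');
            String title = newline < 0 ? text : text.substring(0, newline).trim();
            return new SimpleParse(title, text);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "Hello\nbody text".getBytes(StandardCharsets.UTF_8);
        SimpleParse parse = new PlainTextParser().getParse("http://example.com/", new SimplePage(bytes));
        System.out.println(parse.title);
    }
}
```

The url argument is unused in this sketch. An implementation that extracts links would typically need it to resolve relative references against the page's address.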
Copyright © 2012 The Apache Software Foundation