Packages that use Parse | |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.indexer.feed | |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.apache.nutch.parse | |
org.apache.nutch.parse.html | An HTML document parsing plugin. |
org.apache.nutch.parse.js | |
org.apache.nutch.parse.tika | |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Uses of Parse in org.apache.nutch.analysis.lang |
---|
Methods in org.apache.nutch.analysis.lang that return Parse | |
---|---|
Parse |
HTMLLanguageParser.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document for possible indications of the content language. |
Methods in org.apache.nutch.analysis.lang with parameters of type Parse | |
---|---|
Parse |
HTMLLanguageParser.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Scan the HTML document for possible indications of the content language. |
Uses of Parse in org.apache.nutch.indexer.feed |
---|
Methods in org.apache.nutch.indexer.feed with parameters of type Parse | |
---|---|
NutchDocument |
FeedIndexingFilter.filter(NutchDocument doc,
Parse parse,
Text url,
CrawlDatum datum,
Inlinks inlinks)
Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index. |
Uses of Parse in org.apache.nutch.microformats.reltag |
---|
Methods in org.apache.nutch.microformats.reltag that return Parse | |
---|---|
Parse |
RelTagParser.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Methods in org.apache.nutch.microformats.reltag with parameters of type Parse | |
---|---|
Parse |
RelTagParser.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Uses of Parse in org.apache.nutch.parse |
---|
Methods in org.apache.nutch.parse that return Parse | |
---|---|
Parse |
ParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse, given the DOM tree of a page. |
Parse |
ParseFilters.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Run all defined filters. |
static Parse |
ParseStatusUtils.getEmptyParse(Exception e,
Configuration conf)
|
static Parse |
ParseStatusUtils.getEmptyParse(int minorCode,
String message,
Configuration conf)
|
Parse |
Parser.getParse(String url,
WebPage page)
Parses the content of a WebPage instance. |
Parse |
ParseUtil.parse(String url,
WebPage page)
Performs a parse by iterating through a List of preferred Parsers
until a successful parse is performed and a Parse object is
returned. |
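
The fallback behaviour described for ParseUtil.parse (try each preferred parser in turn until one succeeds) can be illustrated with a minimal sketch. It does not use the Nutch classes: `SketchParser`, `ParseFallbackSketch`, and `parseWithFallback` are hypothetical stand-ins, a plain `String` takes the place of a Parse, and treating a `null` result or a runtime exception as "unsuccessful" is an assumption made only for this example.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a parser; NOT org.apache.nutch.parse.Parser.
interface SketchParser {
    // Returns the parsed text, or null if this parser cannot handle the content.
    String tryParse(String url, String content);
}

final class ParseFallbackSketch {
    // Tries each parser in preference order and returns the first non-null result.
    static Optional<String> parseWithFallback(List<SketchParser> parsers,
                                              String url, String content) {
        for (SketchParser parser : parsers) {
            try {
                String result = parser.tryParse(url, content);
                if (result != null) {
                    return Optional.of(result); // first successful parse wins
                }
            } catch (RuntimeException e) {
                // Assumption: a failing parser is skipped and the next one is tried.
            }
        }
        return Optional.empty(); // no parser succeeded
    }
}
```

Calling `parseWithFallback` with a list whose first parser returns `null` and whose second returns text yields the second parser's result; an empty `Optional` signals that every parser failed.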
Methods in org.apache.nutch.parse with parameters of type Parse | |
---|---|
Parse |
ParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse, given the DOM tree of a page. |
Parse |
ParseFilters.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Run all defined filters. |
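
The relationship between ParseFilter.filter (one filter adds metadata to or otherwise modifies a parse) and ParseFilters.filter (run all defined filters) can be sketched as a simple chain in which each filter receives the previous filter's output. The types below are hypothetical stand-ins rather than the Nutch API: `SketchParse`, `SketchFilter`, and `FilterChainSketch` are invented for illustration, the WebPage, HTMLMetaTags, and DocumentFragment arguments are omitted, and stopping the chain when a filter returns `null` is an assumption, not documented behaviour.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a parse result: text plus mutable metadata.
final class SketchParse {
    final String text;
    final Map<String, String> metadata = new TreeMap<>();

    SketchParse(String text) {
        this.text = text;
    }
}

// Hypothetical stand-in for a parse filter; NOT org.apache.nutch.parse.ParseFilter.
interface SketchFilter {
    // Returns the (possibly modified) parse, or null to reject it.
    SketchParse filter(String url, SketchParse parse);
}

final class FilterChainSketch {
    // Applies every filter in order, feeding each one the previous result.
    static SketchParse runAll(List<SketchFilter> filters, String url, SketchParse parse) {
        SketchParse current = parse;
        for (SketchFilter filter : filters) {
            current = filter.filter(url, current);
            if (current == null) {
                return null; // assumption: a rejected parse ends the chain
            }
        }
        return current;
    }
}
```

A filter in this sketch is typically a small lambda such as `(url, parse) -> { parse.metadata.put("lang", "en"); return parse; }`, which mirrors the "adds metadata" role in the description above.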
Uses of Parse in org.apache.nutch.parse.html |
---|
Methods in org.apache.nutch.parse.html that return Parse | |
---|---|
Parse |
HtmlParser.getParse(String url,
WebPage page)
|
Uses of Parse in org.apache.nutch.parse.js |
---|
Methods in org.apache.nutch.parse.js that return Parse | |
---|---|
Parse |
JSParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Parse |
JSParseFilter.getParse(String url,
WebPage page)
|
Methods in org.apache.nutch.parse.js with parameters of type Parse | |
---|---|
Parse |
JSParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
|
Uses of Parse in org.apache.nutch.parse.tika |
---|
Methods in org.apache.nutch.parse.tika that return Parse | |
---|---|
Parse |
TikaParser.getParse(String url,
WebPage page)
|
Uses of Parse in org.creativecommons.nutch |
---|
Methods in org.creativecommons.nutch that return Parse | |
---|---|
Parse |
CCParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page. |
Methods in org.creativecommons.nutch with parameters of type Parse | |
---|---|
Parse |
CCParseFilter.filter(String url,
WebPage page,
Parse parse,
HTMLMetaTags metaTags,
DocumentFragment doc)
Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page. |