Package org.apache.nutch.host

Class Summary
HostDb A caching wrapper for the host datastore.
HostDbReader Display entries from the hostDB.
HostDbUpdateJob Scans the web table and creates a host entry for each unique host.
HostDbUpdateJob.Mapper Maps each WebPage to a host key.
HostDbUpdateReducer Combines all WebPages with the same host key to create a Host object, with some statistics.
HostInjectorJob Creates or updates an existing host table from a text file.
The file contains one host name per line, optionally followed by tab-separated custom metadata; each metadata key is separated from its value by '='.
HostInjectorJob.UrlMapper  
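
The input format described for HostInjectorJob (one host name per line, then optional tab-separated key=value metadata) can be illustrated with a small standalone parser. This is only a sketch of the format as described above, not the code Nutch uses: the class HostLineSketch, its Entry type, and the parse method are hypothetical names, and silently skipping a field that has no '=' is an assumption rather than documented behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of org.apache.nutch.host.
// Parses one line of a host seed file: host<TAB>key=value<TAB>key=value...
final class HostLineSketch {

    /** A host name plus its metadata, kept in the order it appeared on the line. */
    static final class Entry {
        final String host;
        final Map<String, String> metadata;

        Entry(String host, Map<String, String> metadata) {
            this.host = host;
            this.metadata = metadata;
        }
    }

    static Entry parse(String line) {
        // The first tab-separated field is the host name; the rest are metadata.
        String[] fields = line.split("\t");
        String host = fields[0].trim();

        Map<String, String> metadata = new LinkedHashMap<>();
        for (int i = 1; i < fields.length; i++) {
            String field = fields[i];
            int eq = field.indexOf('=');
            if (eq <= 0) {
                // Assumption: a field with no key or no '=' is ignored.
                continue;
            }
            // Split on the first '=' only, so values may themselves contain '='.
            String key = field.substring(0, eq).trim();
            String value = field.substring(eq + 1).trim();
            metadata.put(key, value);
        }
        return new Entry(host, metadata);
    }

    public static void main(String[] args) {
        Entry entry = parse("example.org\tcategory=news\tpriority=high");
        System.out.println(entry.host + " " + entry.metadata);
    }
}
```

Splitting each field on the first '=' only is a design choice of this sketch; whether the real injector treats later '=' characters the same way is not stated on this page.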
 



Copyright © 2012 The Apache Software Foundation