| Interface Summary | |
|---|---|
| FetchSchedule | This interface defines the contract for implementations that manipulate fetch times and re-fetch intervals. |

| Class Summary | |
|---|---|
| AbstractFetchSchedule | This class provides common methods for implementations of FetchSchedule. |
| AdaptiveFetchSchedule | This class implements an adaptive re-fetch algorithm. |
| Crawler | |
| CrawlStatus | |
| DbUpdateMapper | |
| DbUpdateReducer | |
| DbUpdaterJob | |
| DefaultFetchSchedule | This class implements the default re-fetch schedule. |
| FetchScheduleFactory | Creates and caches a FetchSchedule implementation. |
| GeneratorJob | |
| GeneratorJob.SelectorEntry | |
| GeneratorJob.SelectorEntryComparator | |
| GeneratorMapper | |
| GeneratorReducer | Reduce class for the generate step. The #reduce() method writes a random integer to all generated URLs. |
| InjectorJob | This class takes a flat file of URLs and adds them to the set of pages to be crawled. |
| InjectorJob.InjectorMapper | |
| InjectorJob.UrlMapper | |
| MD5Signature | Default implementation of a page signature. |
| NutchWritable | |
| Signature | |
| SignatureComparator | |
| SignatureFactory | Factory class that instantiates a Signature implementation according to the current Configuration. |
| TextProfileSignature | An implementation of a page signature. |
| URLPartitioner | Partitions URLs by host, domain name, or IP, depending on the value of the parameter 'partition.url.mode', which can be 'byHost', 'byDomain', or 'byIP'. |
| URLPartitioner.FetchEntryPartitioner | |
| URLPartitioner.SelectorEntryPartitioner | |
| URLWebPage | |
| UrlWithScore | A writable, comparable container for a URL with a score. |
| UrlWithScore.UrlOnlyPartitioner | A partitioner by {url}. |
| UrlWithScore.UrlScoreComparator | Compares by {url,score}. |
| UrlWithScore.UrlScoreComparator.UrlOnlyComparator | Compares by {url}. |
| WebTableReader | Displays information about the entries of the webtable. |
| WebTableReader.WebTableRegexMapper | Filters the entries from the table based on a regex. |
| WebTableReader.WebTableStatCombiner | |
| WebTableReader.WebTableStatMapper | |
| WebTableReader.WebTableStatReducer | |
Crawl control code.
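
The URLPartitioner entry above describes three partitioning modes selected by 'partition.url.mode'. The following is a minimal, self-contained sketch of that idea, not the actual Nutch implementation: the class `PartitionSketch` and its methods `partitionKey` and `partition` are hypothetical names, the domain is approximated naively as the last two host labels, and 'byIP' falls back to the host name when resolution fails.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical illustration only; this is not the Nutch URLPartitioner class.
public class PartitionSketch {

    // Returns the string that decides which partition a URL lands in.
    static String partitionKey(String url, String mode) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return url; // no host component: fall back to the whole URL
        }
        host = host.toLowerCase(Locale.ROOT);
        switch (mode) {
            case "byDomain": {
                // Naive assumption: the domain is the last two labels.
                // A real implementation would consult a public-suffix list.
                String[] labels = host.split("\\.");
                int n = labels.length;
                return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
            }
            case "byIP":
                try {
                    return InetAddress.getByName(host).getHostAddress();
                } catch (UnknownHostException e) {
                    return host; // unresolved: fall back to the host name
                }
            default: // "byHost", and any unrecognized mode in this sketch
                return host;
        }
    }

    // Maps the key to a partition index in [0, numPartitions).
    static int partition(String url, String mode, int numPartitions) {
        int hash = partitionKey(url, mode).hashCode();
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/a",
            "http://news.example.com/b",
            "http://127.0.0.1/c"
        };
        for (String url : urls) {
            System.out.println(url
                + " byHost=" + partition(url, "byHost", 4)
                + " byDomain=" + partition(url, "byDomain", 4));
        }
    }
}
```

With this scheme, all URLs that share a key (the same host, domain, or IP) map to the same partition index, which is what allows per-host or per-domain handling to happen within a single reducer.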