org.apache.nutch.fetcher
Class FetcherJob.FetcherMapper

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Mapper<K1,V1,K2,V2>
      extended by org.apache.gora.mapreduce.GoraMapper<String,WebPage,IntWritable,FetchEntry>
          extended by org.apache.nutch.fetcher.FetcherJob.FetcherMapper
Enclosing class:
FetcherJob

public static class FetcherJob.FetcherMapper
extends org.apache.gora.mapreduce.GoraMapper<String,WebPage,IntWritable,FetchEntry>

Mapper class for Fetcher.

This mapper uses the random integer written by GeneratorJob as its output key and passes the original String key and WebPage value through in a FetchEntry instance.
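
The re-keying described above can be illustrated without Hadoop or Gora on the classpath. The following is a minimal sketch, not the Nutch implementation: Entry, Emitted and rekey are hypothetical stand-ins for FetchEntry, an emitted (IntWritable, FetchEntry) pair and the map call. Drawing the integer from java.util.Random, and the 65536 bound, are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class RekeySketch {
    /** Hypothetical stand-in for FetchEntry: carries the original key and value. */
    static final class Entry {
        final String key;
        final String page;
        Entry(String key, String page) { this.key = key; this.page = page; }
    }

    /** Hypothetical stand-in for one emitted (integer key, entry) pair. */
    static final class Emitted {
        final int key;
        final Entry value;
        Emitted(int key, Entry value) { this.key = key; this.value = value; }
    }

    /** Re-keys every input row by an integer and wraps the original pair. */
    static List<Emitted> rekey(Map<String, String> rows, Random random) {
        List<Emitted> out = new ArrayList<>();
        for (Map.Entry<String, String> row : rows.entrySet()) {
            // Assumption: the integer key is drawn here; the bound is illustrative.
            int intKey = random.nextInt(65536);
            out.add(new Emitted(intKey, new Entry(row.getKey(), row.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("http://a.example/1", "page-1");
        rows.put("http://a.example/2", "page-2");
        for (Emitted e : rekey(rows, new Random())) {
            System.out.println(e.key + " -> " + e.value.key);
        }
    }
}
```

Because the original key travels inside the value, nothing is lost when the output key is replaced by an integer; a consumer can recover the key from the entry.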

This approach, combined with PartitionUrlByHost, keeps Fetcher polite while randomizing the key order. If one host has a very large number of URLs in your table and other hosts do not, FetcherReducer does not get stuck on that host but processes URLs from other hosts as well.
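
The effect of partitioning by host and ordering by a random key can be sketched in plain Java. This is an illustration under stated assumptions, not the behavior of PartitionUrlByHost or FetcherReducer themselves: partitionFor and reducerOrder are hypothetical helpers, the hash-modulo partitioning rule is assumed, and a TreeMap stands in for the sort that happens before the reduce phase.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

public class HostPartitionSketch {
    /** Assumed rule: every URL of a host lands in the same partition. */
    static int partitionFor(String url, int numPartitions) {
        String host = URI.create(url).getHost();
        return (host.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Order in which one partition would see its URLs when sorted by a random key. */
    static List<String> reducerOrder(List<String> urls, int partition,
                                     int numPartitions, Random random) {
        // TreeMap iterates in ascending key order; a list per key keeps collisions.
        TreeMap<Integer, List<String>> sorted = new TreeMap<>();
        for (String url : urls) {
            if (partitionFor(url, numPartitions) != partition) {
                continue; // belongs to another partition
            }
            int key = random.nextInt(65536); // illustrative bound
            sorted.computeIfAbsent(key, k -> new ArrayList<>()).add(url);
        }
        List<String> order = new ArrayList<>();
        for (List<String> bucket : sorted.values()) {
            order.addAll(bucket);
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "http://big.example/1", "http://big.example/2",
            "http://big.example/3", "http://big.example/4",
            "http://small.example/1", "http://tiny.example/1");
        // With a single partition, URLs of all hosts arrive in random-key order.
        for (String url : reducerOrder(urls, 0, 1, new Random())) {
            System.out.println(url);
        }
    }
}
```

Keeping each host in a single partition lets one consumer enforce per-host delays, while the random key order means the URLs of a large host are spread among those of other hosts rather than arriving as one block.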


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
Mapper.Context
 
Constructor Summary
FetcherJob.FetcherMapper()
           
 
Method Summary
protected  void map(String key, WebPage page, Mapper.Context context)
           
protected  void setup(Mapper.Context context)
           
 
Methods inherited from class org.apache.gora.mapreduce.GoraMapper
initMapperJob, initMapperJob, initMapperJob, initMapperJob, initMapperJob
 
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
cleanup, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FetcherJob.FetcherMapper

public FetcherJob.FetcherMapper()

Method Detail

setup

protected void setup(Mapper.Context context)
Overrides:
setup in class Mapper<String,WebPage,IntWritable,FetchEntry>

map

protected void map(String key,
                   WebPage page,
                   Mapper.Context context)
            throws IOException,
                   InterruptedException
Overrides:
map in class Mapper<String,WebPage,IntWritable,FetchEntry>
Throws:
IOException
InterruptedException


Copyright © 2012 The Apache Software Foundation