org.apache.nutch.crawl
Class URLPartitioner
java.lang.Object
org.apache.nutch.crawl.URLPartitioner
- All Implemented Interfaces:
- Configurable
public class URLPartitioner
- extends Object
- implements Configurable
Partitions URLs by host, domain name, or IP address, depending on the value of the
configuration parameter 'partition.url.mode', which can be 'byHost', 'byDomain', or 'byIP'.
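To illustrate what the three modes mean, here is a minimal sketch of how a grouping key might be derived from a URL under each one. It is not the Nutch implementation: the class `PartitionKeySketch`, the methods `partitionKey` and `naiveDomain`, and the "last two labels" domain rule are all assumptions made for this example. Only the mode strings come from the description above.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

/** Illustrative only; hypothetical helper, not part of Nutch. */
final class PartitionKeySketch {

    /** Returns the string that URLs would be grouped by under the given mode. */
    static String partitionKey(String urlString, String mode) {
        String host = URI.create(urlString).getHost();
        switch (mode) {
            case "byHost":
                return host;
            case "byDomain":
                return naiveDomain(host);
            case "byIP":
                try {
                    // Resolves the host name; needs DNS unless host is an IP literal.
                    return InetAddress.getByName(host).getHostAddress();
                } catch (UnknownHostException e) {
                    throw new UncheckedIOException(e);
                }
            default:
                throw new IllegalArgumentException("Unknown partition mode: " + mode);
        }
    }

    /**
     * Assumption: keep only the last two labels of the host name.
     * This ignores multi-part suffixes such as "co.uk".
     */
    static String naiveDomain(String host) {
        String[] labels = host.split("\\.");
        int n = labels.length;
        return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
    }

    public static void main(String[] args) {
        String url = "http://news.example.org/a.html";
        System.out.println(partitionKey(url, "byHost"));   // news.example.org
        System.out.println(partitionKey(url, "byDomain")); // example.org
    }
}
```

In this sketch, 'byHost' keeps subdomains apart, while 'byDomain' groups them under one key; 'byIP' groups every host name that resolves to the same address.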
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PARTITION_MODE_KEY
public static final String PARTITION_MODE_KEY
- See Also:
- Constant Field Values
PARTITION_MODE_HOST
public static final String PARTITION_MODE_HOST
- See Also:
- Constant Field Values
PARTITION_MODE_DOMAIN
public static final String PARTITION_MODE_DOMAIN
- See Also:
- Constant Field Values
PARTITION_MODE_IP
public static final String PARTITION_MODE_IP
- See Also:
- Constant Field Values
PARTITION_URL_SEED
public static final String PARTITION_URL_SEED
- See Also:
- Constant Field Values
URLPartitioner
public URLPartitioner()
getConf
public Configuration getConf()
- Specified by:
getConf
in interface Configurable
setConf
public void setConf(Configuration conf)
- Specified by:
setConf
in interface Configurable
getPartition
public int getPartition(String urlString,
int numReduceTasks)
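The page gives no description for getPartition, so the following is only a sketch of one common way to map a URL to a partition index in the range 0 to numReduceTasks - 1, using host-based grouping. The class `HostPartitionSketch` and the `seed` parameter are hypothetical; the seed is a guess at the purpose of PARTITION_URL_SEED and is not confirmed by this page.

```java
import java.net.URI;

/** Illustrative only; hypothetical helper, not the Nutch implementation. */
final class HostPartitionSketch {

    /**
     * Maps a URL to a partition index in [0, numReduceTasks).
     * URLs that share a host always land in the same partition.
     */
    static int getPartition(String urlString, int numReduceTasks, int seed) {
        int hashCode;
        try {
            String host = URI.create(urlString).getHost();
            // Fall back to the whole string when no host can be extracted.
            hashCode = (host != null ? host : urlString).hashCode();
        } catch (IllegalArgumentException e) {
            // Malformed URL: hash the raw string instead of failing.
            hashCode = urlString.hashCode();
        }
        // Assumption: a seed lets hosts land in different partitions on different runs.
        hashCode ^= seed;
        // Clear the sign bit so the modulo result is never negative.
        return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int a = getPartition("http://example.org/page1", 4, 0);
        int b = getPartition("http://example.org/page2", 4, 0);
        System.out.println(a == b); // true: same host, same partition
    }
}
```

Masking with `Integer.MAX_VALUE` rather than calling `Math.abs` avoids the edge case where the hash equals `Integer.MIN_VALUE`, whose absolute value is still negative.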
Copyright © 2012 The Apache Software Foundation