org.apache.nutch.urlfilter.prefix
Class PrefixURLFilter

java.lang.Object
  extended by org.apache.nutch.urlfilter.prefix.PrefixURLFilter
All Implemented Interfaces:
Configurable, URLFilter, Pluggable

public class PrefixURLFilter
extends Object
implements URLFilter

Filters URLs based on a file of URL prefixes. The file is named by (1) the property "urlfilter.prefix.file" in ./conf/nutch-default.xml, and (2) the attribute "file" in the plugin.xml of this plugin. The "file" attribute takes precedence if defined.

The format of this file is one URL prefix per line.
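
To make the one-prefix-per-line format concrete, here is a minimal sketch of how such rules could be read into a list. It is not the Nutch implementation: the class PrefixRulesSketch and its parse method are hypothetical, and skipping blank lines and lines starting with '#' is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch: reads rules written in the
// one-prefix-per-line format described above.
class PrefixRulesSketch {

    // Splits the rules text into lines and keeps each non-empty line as a prefix.
    static List<String> parse(String rules) {
        List<String> prefixes = new ArrayList<>();
        for (String line : rules.split("\r?\n")) {
            String trimmed = line.trim();
            // Assumption: blank lines and '#' comment lines carry no prefix.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            prefixes.add(trimmed);
        }
        return prefixes;
    }

    public static void main(String[] args) {
        String rules = "# allowed sites\n"
                + "http://example.org/\n"
                + "\n"
                + "https://example.com/docs/\n";
        // Prints the two prefixes that survive parsing.
        System.out.println(parse(rules));
    }
}
```

The same text format is presumably what the PrefixURLFilter(String stringRules) constructor accepts, although this page does not state that explicitly.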


Field Summary
 
Fields inherited from interface org.apache.nutch.net.URLFilter
X_POINT_ID
 
Constructor Summary
PrefixURLFilter()
           
PrefixURLFilter(String stringRules)
           
 
Method Summary
 String filter(String url)
           
 Configuration getConf()
           
static void main(String[] args)
           
 void setConf(Configuration conf)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PrefixURLFilter

public PrefixURLFilter()
                throws IOException
Throws:
IOException

PrefixURLFilter

public PrefixURLFilter(String stringRules)
                throws IOException
Throws:
IOException
Method Detail

filter

public String filter(String url)
Specified by:
filter in interface URLFilter
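
This page gives no description for filter, so the sketch below only illustrates the behavior one would expect from a prefix filter. It assumes the usual URLFilter convention that an accepted URL is returned unchanged and a rejected URL yields null. PrefixFilterSketch is a hypothetical stand-in using a plain linear scan; the real class may match prefixes with a different data structure.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not the Nutch class: illustrates an
// accept-or-null contract with a linear prefix scan.
class PrefixFilterSketch {
    private final List<String> prefixes;

    PrefixFilterSketch(List<String> prefixes) {
        this.prefixes = prefixes;
    }

    // Returns the URL unchanged when it starts with a known prefix,
    // or null when no prefix matches (the assumed rejection signal).
    String filter(String url) {
        for (String prefix : prefixes) {
            if (url.startsWith(prefix)) {
                return url;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        PrefixFilterSketch f =
                new PrefixFilterSketch(Arrays.asList("http://example.org/"));
        System.out.println(f.filter("http://example.org/index.html")); // accepted
        System.out.println(f.filter("http://other.example/"));         // rejected
    }
}
```

Under this assumed contract, callers treat a null result as "drop this URL" and any non-null result as the URL to keep.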

main

public static void main(String[] args)
                 throws IOException
Throws:
IOException

setConf

public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Specified by:
getConf in interface Configurable


Copyright © 2012 The Apache Software Foundation