org.apache.nutch.protocol.ftp
Class Ftp

java.lang.Object
  extended by org.apache.nutch.protocol.ftp.Ftp
All Implemented Interfaces:
Configurable, FieldPluggable, Pluggable, Protocol

public class Ftp
extends Object
implements Protocol

Ftp.java handles the ftp: scheme. Its configurable parameters are defined in the "FTP properties" section of ./conf/nutch-default.xml or a similar configuration file.

Author:
John Xing

Field Summary
static org.slf4j.Logger LOG
           
 
Fields inherited from interface org.apache.nutch.protocol.Protocol
CHECK_BLOCKING, CHECK_ROBOTS, X_POINT_ID
 
Constructor Summary
Ftp()
           
 
Method Summary
protected  void finalize()
           
 Configuration getConf()
           
 Collection<WebPage.Field> getFields()
           
 ProtocolOutput getProtocolOutput(String url, WebPage page)
          Returns the Content for a fetchlist entry.
 RobotRules getRobotRules(String url, WebPage page)
          Retrieve robot rules applicable for this url.
static void main(String[] args)
          For debugging.
 void setConf(Configuration conf)
           
 void setFollowTalk(boolean followTalk)
          Set followTalk: whether to log the dialogue between the client and the remote FTP server (useful for debugging).
 void setKeepConnection(boolean keepConnection)
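          Set keepConnection: whether to keep the FTP connection open between fetches (useful when crawling the same host repeatedly).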
 void setMaxContentLength(int length)
          Set the point at which content is truncated.
 void setTimeout(int to)
          Set the timeout, in milliseconds, for the FTP client socket.
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.slf4j.Logger LOG

Constructor Detail

Ftp

public Ftp()

Method Detail

setTimeout

public void setTimeout(int to)
Set the timeout, in milliseconds, for the FTP client socket.


setMaxContentLength

public void setMaxContentLength(int length)
Set the point at which content is truncated.
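
The truncation behaviour can be illustrated with a minimal, self-contained sketch. It is not the Nutch implementation: the class TruncationSketch and the method readTruncated are hypothetical names, the limit is assumed to be a byte count, and treating a negative limit as "no truncation" is an assumption of this sketch that should be checked against your nutch-default.xml.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not part of org.apache.nutch.protocol.ftp.
public class TruncationSketch {

    /**
     * Reads at most maxContentLength bytes from the stream.
     * Assumption of this sketch: a negative limit means "no truncation".
     */
    public static byte[] readTruncated(InputStream in, int maxContentLength)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        boolean unlimited = maxContentLength < 0;
        int remaining = maxContentLength;

        while (unlimited || remaining > 0) {
            // Never ask for more bytes than the limit still allows.
            int want = unlimited ? buffer.length : Math.min(buffer.length, remaining);
            int n = in.read(buffer, 0, want);
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        byte[] truncated = readTruncated(new ByteArrayInputStream(data), 4096);
        System.out.println("kept " + truncated.length + " of " + data.length + " bytes");
    }
}
```

Capping each read request at the remaining budget means the result never exceeds the limit, so no trimming step is needed afterwards.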


setFollowTalk

public void setFollowTalk(boolean followTalk)
Set followTalk: whether to log the dialogue between the client and the remote FTP server (useful for debugging).


setKeepConnection

public void setKeepConnection(boolean keepConnection)
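Set keepConnection: whether to keep the FTP connection open between fetches (useful when crawling the same host repeatedly).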


getProtocolOutput

public ProtocolOutput getProtocolOutput(String url,
                                        WebPage page)
Description copied from interface: Protocol
Returns the Content for a fetchlist entry.

Specified by:
getProtocolOutput in interface Protocol

finalize

protected void finalize()
Overrides:
finalize in class Object

setConf

public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable

getConf

public Configuration getConf()
Specified by:
getConf in interface Configurable

getRobotRules

public RobotRules getRobotRules(String url,
                                WebPage page)
Description copied from interface: Protocol
Retrieve robot rules applicable for this url.

Specified by:
getRobotRules in interface Protocol
Parameters:
url - url to check
Returns:
robot rules (specific for this url or default), never null

main

public static void main(String[] args)
                 throws Exception
For debugging.

Throws:
Exception

getFields

public Collection<WebPage.Field> getFields()
Specified by:
getFields in interface FieldPluggable


Copyright © 2012 The Apache Software Foundation