org.apache.nutch.protocol.ftp
Class Client

java.lang.Object
  extended by org.apache.commons.net.SocketClient
      extended by org.apache.commons.net.telnet.TelnetClient
          extended by org.apache.commons.net.ftp.FTP
              extended by org.apache.nutch.protocol.ftp.Client

public class Client
extends org.apache.commons.net.ftp.FTP

Client.java encapsulates the functionality Nutch needs to list directories on, and retrieve files from, an FTP server. This class handles the low-level details of interacting with an FTP server and provides a convenient higher-level interface. Modified from FTPClient.java in Apache commons-net.

Notes by John Xing: FTP server implementations are hardly uniform, and none seems to follow the RFCs wholeheartedly. We have no choice but to assume the following common denominator:

(1) Use stream mode for data transfer. Block mode would be better for downloading multiple files and for partial downloads, but not every ftpd supports block mode.
(2) Use passive mode for the data connection, so that Nutch works when run behind a firewall.
(3) The data connection is opened and closed per FTP command, for the reasons listed in (1). Some FTP servers, when a partial download is enforced by closing the data channel socket on the client side, immediately close the control channel (socket) on the server side. Our code deals with this bad behavior.
(4) LIST is used to obtain remote file attributes where possible. MDTM and SIZE would be nice, but they are not as ubiquitously implemented as LIST.
(5) Avoid using ABOR in a single thread? Do not use it at all.

About exceptions: some specific exceptions are rethrown as one of the FtpException*.java classes. In fact, each method either throws an FtpException*.java exception or passes an IOException through.

Author:
John Xing

Field Summary
protected static int TERMINAL_TYPE
           
protected static int TERMINAL_TYPE_IS
           
protected static int TERMINAL_TYPE_SEND
           
 
Fields inherited from class org.apache.commons.net.ftp.FTP
_commandSupport_, ASCII_FILE_TYPE, BINARY_FILE_TYPE, BLOCK_TRANSFER_MODE, CARRIAGE_CONTROL_TEXT_FORMAT, COMPRESSED_TRANSFER_MODE, DEFAULT_CONTROL_ENCODING, DEFAULT_DATA_PORT, DEFAULT_PORT, EBCDIC_FILE_TYPE, FILE_STRUCTURE, IMAGE_FILE_TYPE, LOCAL_FILE_TYPE, NON_PRINT_TEXT_FORMAT, PAGE_STRUCTURE, RECORD_STRUCTURE, STREAM_TRANSFER_MODE, TELNET_TEXT_FORMAT
 
Fields inherited from class org.apache.commons.net.telnet.TelnetClient
readerThread
 
Fields inherited from class org.apache.commons.net.SocketClient
_defaultPort_, _input_, _isConnected_, _output_, _socket_, _socketFactory_, _timeout_, NETASCII_EOL
 
Constructor Summary
Client()
           
 
Method Summary
protected  Socket __openPassiveDataConnection(int command, String arg)
           
 void disconnect()
          Closes the connection to the FTP server and restores connection parameters to the default values.
 String getSystemName()
          Fetches the system type name from the server and returns the string.
 boolean isRemoteVerificationEnabled()
          Return whether or not verification of the remote host participating in data connections is enabled.
 boolean login(String username, String password)
          Logs in to the FTP server using the provided username and password.
 boolean logout()
          Logs out of the FTP server by sending the QUIT command.
 void retrieveFile(String path, OutputStream os, int limit)
           
 void retrieveList(String path, List entries, int limit, org.apache.commons.net.ftp.FTPFileEntryParser parser)
           
 boolean sendNoOp()
          Sends a NOOP command to the FTP server.
 void setDataTimeout(int timeout)
          Sets the timeout in milliseconds to use for data connection.
 boolean setFileType(int fileType)
          Sets the file type to be transferred.
 void setRemoteVerificationEnabled(boolean enable)
          Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached.
 
Methods inherited from class org.apache.commons.net.ftp.FTP
_connectAction_, abor, acct, addProtocolCommandListener, allo, allo, appe, cdup, cwd, dele, getControlEncoding, getReply, getReplyCode, getReplyString, getReplyStrings, help, help, list, list, mkd, mode, nlst, nlst, noop, pass, pasv, port, pwd, quit, rein, removeProtocolCommandListener, rest, retr, rmd, rnfr, rnto, sendCommand, sendCommand, sendCommand, sendCommand, setControlEncoding, site, smnt, stat, stat, stor, stou, stou, stru, syst, type, type, user
 
Methods inherited from class org.apache.commons.net.telnet.TelnetClient
addOptionHandler, deleteOptionHandler, getInputStream, getLocalOptionState, getOutputStream, getReaderThread, getRemoteOptionState, registerNotifHandler, registerSpyStream, sendAYT, setReaderThread, stopSpyStream, unregisterNotifHandler
 
Methods inherited from class org.apache.commons.net.SocketClient
connect, connect, connect, connect, connect, connect, getDefaultPort, getDefaultTimeout, getLocalAddress, getLocalPort, getRemoteAddress, getRemotePort, getSoLinger, getSoTimeout, getTcpNoDelay, isConnected, setDefaultPort, setDefaultTimeout, setSocketFactory, setSoLinger, setSoTimeout, setTcpNoDelay, verifyRemote
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

TERMINAL_TYPE

protected static final int TERMINAL_TYPE
See Also:
Constant Field Values

TERMINAL_TYPE_SEND

protected static final int TERMINAL_TYPE_SEND
See Also:
Constant Field Values

TERMINAL_TYPE_IS

protected static final int TERMINAL_TYPE_IS
See Also:
Constant Field Values
Constructor Detail

Client

public Client()
Method Detail

__openPassiveDataConnection

protected Socket __openPassiveDataConnection(int command,
                                             String arg)
                                      throws IOException,
                                             FtpExceptionCanNotHaveDataConnection
Throws:
IOException
FtpExceptionCanNotHaveDataConnection
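
This page does not document the method body. As background for passive mode, the sketch below shows how an RFC 959 reply of the form `227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)` can be decoded into the host and port a client would then connect to. The class `PasvReply` and its members are hypothetical and are not part of Nutch or commons-net.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes the host and port from a 227 (PASV) reply. */
class PasvReply {
    // Six comma-separated decimal fields: h1,h2,h3,h4,p1,p2
    private static final Pattern FIELDS = Pattern.compile(
        "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    final String host;
    final int port;

    private PasvReply(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static PasvReply parse(String reply) {
        Matcher m = FIELDS.matcher(reply);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a PASV reply: " + reply);
        }
        int[] f = new int[6];
        for (int i = 0; i < 6; i++) {
            f[i] = Integer.parseInt(m.group(i + 1));
            if (f[i] > 255) {
                throw new IllegalArgumentException("Field out of range: " + f[i]);
            }
        }
        // The first four fields are the IPv4 address octets.
        String host = f[0] + "." + f[1] + "." + f[2] + "." + f[3];
        // The last two fields are the high and low bytes of the port.
        int port = (f[4] << 8) | f[5];
        return new PasvReply(host, port);
    }
}
```

For example, `PasvReply.parse("227 Entering Passive Mode (192,168,1,10,19,137)")` yields host `192.168.1.10` and port 19 * 256 + 137 = 5001.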

setDataTimeout

public void setDataTimeout(int timeout)
Sets the timeout, in milliseconds, to use for the data connection. The timeout is applied immediately after the data connection is opened.


disconnect

public void disconnect()
                throws IOException
Closes the connection to the FTP server and restores connection parameters to the default values.

Overrides:
disconnect in class org.apache.commons.net.ftp.FTP
Throws:
IOException - If an error occurs while disconnecting.

setRemoteVerificationEnabled

public void setRemoteVerificationEnabled(boolean enable)
Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached. The default is for verification to be enabled. You may set this value at any time, whether or not the client is currently connected.

Parameters:
enable - True to enable verification, false to disable verification.

isRemoteVerificationEnabled

public boolean isRemoteVerificationEnabled()
Return whether or not verification of the remote host participating in data connections is enabled. The default behavior is for verification to be enabled.

Returns:
True if verification is enabled, false if not.
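
The check described above can be pictured as a comparison of two peer addresses. The following is a minimal sketch, assuming the comparison is a plain address equality test; `RemoteVerification`, `acceptDataPeer`, and `ipv4` are hypothetical names used only for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical illustration of remote-host verification for data connections. */
class RemoteVerification {

    /**
     * Accepts the data connection if verification is disabled, or if the data
     * peer has the same address as the control-connection peer.
     */
    static boolean acceptDataPeer(boolean verificationEnabled,
                                  InetAddress controlPeer,
                                  InetAddress dataPeer) {
        return !verificationEnabled || controlPeer.equals(dataPeer);
    }

    /** Builds an IPv4 address from literal octets without any DNS lookup. */
    static InetAddress ipv4(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

With verification enabled, a data connection from a different address than the control peer is rejected; with verification disabled it is accepted.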

login

public boolean login(String username,
                     String password)
              throws IOException
Logs in to the FTP server using the provided username and password.

Parameters:
username - The username to log in under.
password - The password to use.
Returns:
True if successfully completed, false if not.
Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
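
For orientation, an FTP login per RFC 959 is a USER command followed, if the server asks for it, by a PASS command. The sketch below models only the reply-code decisions of that exchange; it is not taken from the Nutch source, and `LoginFlow` and its `sendPass` parameter (standing in for sending PASS and reading the reply code) are hypothetical.

```java
import java.util.function.IntSupplier;

/** Hypothetical model of the USER/PASS reply-code logic from RFC 959. */
class LoginFlow {

    static boolean isPositiveCompletion(int code) {
        return code >= 200 && code < 300;
    }

    static boolean isPositiveIntermediate(int code) {
        return code >= 300 && code < 400;
    }

    /**
     * @param userReply reply code received for the USER command
     * @param sendPass  sends PASS and returns its reply code; only invoked
     *                  when the server asks for a password
     */
    static boolean login(int userReply, IntSupplier sendPass) {
        if (isPositiveCompletion(userReply)) {
            return true;  // e.g. 230: logged in without a password
        }
        if (!isPositiveIntermediate(userReply)) {
            return false; // e.g. 530: user rejected outright
        }
        // e.g. 331: password required, so the outcome depends on PASS
        return isPositiveCompletion(sendPass.getAsInt());
    }
}
```

A `false` result here corresponds to the "false if not" case in the Returns clause above; I/O failures are a separate path and surface as exceptions.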

logout

public boolean logout()
               throws IOException
Logs out of the FTP server by sending the QUIT command.

Returns:
True if successfully completed, false if not.
Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.

retrieveList

public void retrieveList(String path,
                         List entries,
                         int limit,
                         org.apache.commons.net.ftp.FTPFileEntryParser parser)
                  throws IOException,
                         FtpExceptionCanNotHaveDataConnection,
                         FtpExceptionUnknownForcedDataClose,
                         FtpExceptionControlClosedByForcedDataClose
Throws:
IOException
FtpExceptionCanNotHaveDataConnection
FtpExceptionUnknownForcedDataClose
FtpExceptionControlClosedByForcedDataClose

retrieveFile

public void retrieveFile(String path,
                         OutputStream os,
                         int limit)
                  throws IOException,
                         FtpExceptionCanNotHaveDataConnection,
                         FtpExceptionUnknownForcedDataClose,
                         FtpExceptionControlClosedByForcedDataClose
Throws:
IOException
FtpExceptionCanNotHaveDataConnection
FtpExceptionUnknownForcedDataClose
FtpExceptionControlClosedByForcedDataClose
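
The `limit` parameter is not documented on this page. The sketch below assumes it is a cap on the number of bytes written to `os`, with a negative value meaning "no cap"; both points are assumptions. It shows one way a stream copy can be cut off at such a limit, which is the situation the class description calls a partial download. `LimitedCopy` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration of a byte-limited stream copy. */
class LimitedCopy {

    /**
     * Copies at most limit bytes from in to out. A negative limit is treated
     * as "no cap" (an assumption). Returns true if the copy stopped early.
     */
    static boolean copy(InputStream in, OutputStream out, int limit) throws IOException {
        byte[] buf = new byte[4096];
        long copied = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            if (limit >= 0 && copied + n > limit) {
                // Write only the part that still fits, then stop reading.
                out.write(buf, 0, (int) (limit - copied));
                return true;
            }
            out.write(buf, 0, n);
            copied += n;
        }
        return false;
    }

    /** In-memory convenience wrapper so callers need not handle IOException. */
    static byte[] truncate(byte[] data, int limit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, limit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

When `copy` returns true, a real client would have to close the data socket before the server finishes sending, which is when the forced-close exceptions listed above become relevant.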

setFileType

public boolean setFileType(int fileType)
                    throws IOException
Sets the file type to be transferred. This should be one of FTP.ASCII_FILE_TYPE, FTP.IMAGE_FILE_TYPE, etc. The file type only needs to be set when you want to change the type. After changing it, the new type stays in effect until you change it again. The default file type is FTP.ASCII_FILE_TYPE if this method is never called.

Parameters:
fileType - The _FILE_TYPE constant indicating the type of file.
Returns:
True if successfully completed, false if not.
Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.

getSystemName

public String getSystemName()
                     throws IOException,
                            FtpExceptionBadSystResponse
Fetches the system type name from the server and returns the string. The value is cached for the duration of the connection after the first call to this method. In other words, only the first invocation issues a SYST command to the FTP server; the client remembers the value and returns the cached value until disconnect is called.

Returns:
The system type name obtained from the server. null if the information could not be obtained.
Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
FtpExceptionBadSystResponse
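
The caching behavior described above can be sketched as a lazily filled field that is cleared on disconnect. This is an illustration under that reading of the description, not the actual implementation; `CachedSystemName` is a hypothetical class, and its `systCommand` supplier stands in for sending SYST and parsing the reply.

```java
import java.util.function.Supplier;

/** Hypothetical illustration of per-connection caching of the SYST result. */
class CachedSystemName {
    private final Supplier<String> systCommand; // stands in for issuing SYST
    private String cached;                      // null until the first successful call

    CachedSystemName(Supplier<String> systCommand) {
        this.systCommand = systCommand;
    }

    /** Issues SYST only when no value is cached; a null result is not cached. */
    String get() {
        if (cached == null) {
            cached = systCommand.get();
        }
        return cached;
    }

    /** Clears the cache, as a disconnect would. */
    void disconnect() {
        cached = null;
    }
}
```

Calling `get()` twice on the same connection invokes the supplier once; after `disconnect()`, the next `get()` invokes it again.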

sendNoOp

public boolean sendNoOp()
                 throws IOException
Sends a NOOP command to the FTP server. This is useful for preventing server timeouts.

Returns:
True if successfully completed, false if not.
Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
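
To make the Returns and Throws clauses concrete: in RFC 959 every reply line starts with a three-digit code, 2yz codes mean the command completed, and 421 means the server is closing the control connection. The sketch below classifies a NOOP reply on that basis. `ReplyLine` is a hypothetical class, and `IllegalStateException` is used only as a stand-in for the FTPConnectionClosedException named above.

```java
/** Hypothetical illustration of interpreting the reply to a NOOP command. */
class ReplyLine {

    /** Extracts the three-digit reply code from the start of a reply line. */
    static int code(String line) {
        if (line == null || line.length() < 3) {
            throw new IllegalArgumentException("Malformed reply: " + line);
        }
        return Integer.parseInt(line.substring(0, 3));
    }

    /**
     * Returns true for a positive completion reply (2yz), false for other
     * replies, and throws if the server announced that it is closing the
     * control connection (421).
     */
    static boolean noOpSucceeded(String line) {
        int c = code(line);
        if (c == 421) {
            // Stand-in for FTPConnectionClosedException.
            throw new IllegalStateException("Server closed the control connection");
        }
        return c >= 200 && c < 300;
    }
}
```

A reply such as `200 NOOP command successful` counts as success, while `500 Syntax error` does not.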


Copyright © 2012 The Apache Software Foundation