org.apache.nutch.crawl
Class MD5Signature

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.crawl.Signature
          extended by org.apache.nutch.crawl.MD5Signature
All Implemented Interfaces:
Configurable

public class MD5Signature
extends Signature

Default implementation of a page signature. It calculates an MD5 hash of the page's raw binary content. If the page has no content, it calculates the hash from the page's URL instead.
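
The behavior described above can be illustrated with a minimal, standalone sketch. This is not the Nutch implementation: the class name Md5SignatureSketch and its methods are hypothetical, the WebPage argument is replaced by a plain byte array and URL string, and two details are assumptions, namely that "no content" covers both null and empty content, and that the URL is encoded as UTF-8 before hashing.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not part of org.apache.nutch.crawl.
final class Md5SignatureSketch {

    private Md5SignatureSketch() {}

    // Returns the 16-byte MD5 digest of the content, or of the URL
    // when the content is missing (assumed here: null or zero-length).
    static byte[] calculate(byte[] content, String url) {
        byte[] data = (content != null && content.length > 0)
                ? content
                : url.getBytes(StandardCharsets.UTF_8); // assumed encoding
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Renders a digest as lowercase hexadecimal for logging or comparison.
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

In this sketch, two pages with byte-identical content yield the same signature regardless of their URLs, while pages without content are distinguished only by URL.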

Author:
Andrzej Bialecki <ab@getopt.org>

Constructor Summary
MD5Signature()
           
 
Method Summary
 byte[] calculate(WebPage page)
           
 Collection<WebPage.Field> getFields()
           
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MD5Signature

public MD5Signature()

Method Detail

calculate

public byte[] calculate(WebPage page)
Specified by:
calculate in class Signature

getFields

public Collection<WebPage.Field> getFields()
Specified by:
getFields in class Signature


Copyright © 2012 The Apache Software Foundation