org.apache.nutch.parse
Class HTMLMetaTags

java.lang.Object
  extended by org.apache.nutch.parse.HTMLMetaTags

public class HTMLMetaTags
extends Object

This class holds the information about HTML "meta" tags extracted from a page. Some special tags have convenience methods for easy checking.


Constructor Summary
HTMLMetaTags()
           
 
Method Summary
 URL getBaseHref()
          Returns the baseHref, if set, or null otherwise.
 Properties getGeneralTags()
          Returns all collected values of the general meta tags.
 Properties getHttpEquivTags()
          Returns all collected values of the "http-equiv" meta tags.
 boolean getNoCache()
          Returns the current value of noCache.
 boolean getNoFollow()
          Returns the current value of noFollow.
 boolean getNoIndex()
          Returns the current value of noIndex.
 boolean getRefresh()
          Returns the current value of refresh.
 URL getRefreshHref()
          Returns the refreshHref, if set, or null otherwise.
 int getRefreshTime()
          Returns the current value of refreshTime.
 void reset()
          Sets all boolean values to false.
 void setBaseHref(URL baseHref)
          Sets the baseHref.
 void setNoCache()
          Sets noCache to true.
 void setNoFollow()
          Sets noFollow to true.
 void setNoIndex()
          Sets noIndex to true.
 void setRefresh(boolean refresh)
          Sets refresh to the supplied value.
 void setRefreshHref(URL refreshHref)
          Sets the refreshHref.
 void setRefreshTime(int refreshTime)
          Sets the refreshTime.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

HTMLMetaTags

public HTMLMetaTags()


Method Detail

reset

public void reset()
Sets all boolean values to false. Clears all other tags.


setNoFollow

public void setNoFollow()
Sets noFollow to true.


setNoIndex

public void setNoIndex()
Sets noIndex to true.


setNoCache

public void setNoCache()
Sets noCache to true.


setRefresh

public void setRefresh(boolean refresh)
Sets refresh to the supplied value.


setBaseHref

public void setBaseHref(URL baseHref)
Sets the baseHref.


setRefreshHref

public void setRefreshHref(URL refreshHref)
Sets the refreshHref.


setRefreshTime

public void setRefreshTime(int refreshTime)
Sets the refreshTime.


getNoIndex

public boolean getNoIndex()
A convenience method. Returns the current value of noIndex.


getNoFollow

public boolean getNoFollow()
A convenience method. Returns the current value of noFollow.


getNoCache

public boolean getNoCache()
A convenience method. Returns the current value of noCache.
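
The three flag getters above (getNoIndex, getNoFollow, getNoCache) are typically read together after a page has been parsed. The following is a minimal sketch of that pattern, not the Nutch implementation: MetaTagsStandIn is a hypothetical stand-in that mirrors only the flag methods documented on this page, so that the example compiles without Nutch on the classpath, and summarize is an illustrative helper name.

```java
public class RobotsFlagsSketch {

    /**
     * Hypothetical stand-in that mirrors only the flag methods documented
     * on this page. It is NOT the Nutch class; swap in
     * org.apache.nutch.parse.HTMLMetaTags when Nutch is available.
     */
    static class MetaTagsStandIn {
        private boolean noIndex;
        private boolean noFollow;
        private boolean noCache;

        void setNoIndex() { noIndex = true; }
        void setNoFollow() { noFollow = true; }
        void setNoCache() { noCache = true; }

        boolean getNoIndex() { return noIndex; }
        boolean getNoFollow() { return noFollow; }
        boolean getNoCache() { return noCache; }

        // Mirrors the documented reset(): all boolean flags return to false.
        void reset() {
            noIndex = false;
            noFollow = false;
            noCache = false;
        }
    }

    /** Summarizes what a caller may do with the page, based on the flags. */
    static String summarize(MetaTagsStandIn tags) {
        return "index=" + !tags.getNoIndex()
             + " follow=" + !tags.getNoFollow()
             + " cache=" + !tags.getNoCache();
    }

    public static void main(String[] args) {
        MetaTagsStandIn tags = new MetaTagsStandIn();
        tags.setNoIndex();
        tags.setNoCache();
        System.out.println(summarize(tags));

        // Reuse the same instance for the next page.
        tags.reset();
        System.out.println(summarize(tags));
    }
}
```

Note that the setters for these flags take no argument and can only switch a flag on; reset() is the only documented way to switch them off again.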


getRefresh

public boolean getRefresh()
A convenience method. Returns the current value of refresh.


getBaseHref

public URL getBaseHref()
A convenience method. Returns the baseHref, if set, or null otherwise.


getRefreshHref

public URL getRefreshHref()
A convenience method. Returns the refreshHref, if set, or null otherwise. The value may be invalid if getRefresh() returns false.


getRefreshTime

public int getRefreshTime()
A convenience method. Returns the current value of refreshTime. The value may be invalid if getRefresh() returns false.
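
Because getRefreshHref() and getRefreshTime() may hold invalid values when getRefresh() returns false, callers would normally check getRefresh() first. The sketch below illustrates that guard. It uses a hypothetical MetaTagsStandIn class that mirrors only the refresh-related methods documented here (it is not the Nutch class), and describeRefresh and url are illustrative helper names.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class RefreshGuardSketch {

    /**
     * Hypothetical stand-in mirroring only the refresh-related methods
     * documented on this page. It is NOT the Nutch class.
     */
    static class MetaTagsStandIn {
        private boolean refresh;
        private int refreshTime;
        private URL refreshHref;

        void setRefresh(boolean refresh) { this.refresh = refresh; }
        void setRefreshTime(int refreshTime) { this.refreshTime = refreshTime; }
        void setRefreshHref(URL refreshHref) { this.refreshHref = refreshHref; }

        boolean getRefresh() { return refresh; }
        int getRefreshTime() { return refreshTime; }
        URL getRefreshHref() { return refreshHref; }
    }

    /** Reads the refresh time and target only after getRefresh() is true. */
    static String describeRefresh(MetaTagsStandIn tags) {
        if (!tags.getRefresh()) {
            // The other two values may be stale or unset, so ignore them.
            return "no refresh";
        }
        URL target = tags.getRefreshHref();
        String where = (target == null) ? "(no target set)" : target.toString();
        return "refresh after " + tags.getRefreshTime() + "s to " + where;
    }

    /** Illustrative helper: builds a URL without a checked exception. */
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        MetaTagsStandIn tags = new MetaTagsStandIn();
        tags.setRefreshTime(5);
        tags.setRefreshHref(url("http://example.com/next"));

        // refresh is still false, so the values set above are not trusted.
        System.out.println(describeRefresh(tags));

        tags.setRefresh(true);
        System.out.println(describeRefresh(tags));
    }
}
```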


getGeneralTags

public Properties getGeneralTags()
Returns all collected values of the general meta tags. Property names are tag names; property values are the "content" values.


getHttpEquivTags

public Properties getHttpEquivTags()
Returns all collected values of the "http-equiv" meta tags. Property names are tag names; property values are the "content" values.
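
Both getGeneralTags() and getHttpEquivTags() return a java.util.Properties object, so the standard Properties methods apply to the result. The sketch below shows one way to list and look up entries. The Properties object is built by hand as a stand-in for what either getter might return; the tag names and values are made up for illustration, and format is a hypothetical helper name.

```java
import java.util.Properties;
import java.util.TreeSet;

public class MetaPropertiesSketch {

    /** Formats every name/content pair, sorted by name for stable output. */
    static String format(Properties tags) {
        StringBuilder sb = new StringBuilder();
        for (String name : new TreeSet<>(tags.stringPropertyNames())) {
            sb.append(name)
              .append(" = ")
              .append(tags.getProperty(name))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the object returned by getGeneralTags();
        // these entries are illustrative assumptions, not real crawl data.
        Properties general = new Properties();
        general.setProperty("keywords", "nutch, crawler");
        general.setProperty("description", "A sample page");

        System.out.print(format(general));

        // Look up a single tag, with a fallback when it was not collected.
        System.out.println(general.getProperty("author", "(none)"));
    }
}
```

Sorting the names is a choice made here only to keep the output deterministic; Properties itself does not guarantee an iteration order.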


toString

public String toString()
Overrides:
toString in class Object


Copyright © 2012 The Apache Software Foundation