org.apache.poi.xwpf.extractor
Class XWPFWordExtractor

java.lang.Object
  extended by org.apache.poi.POITextExtractor
      extended by org.apache.poi.POIXMLTextExtractor
          extended by org.apache.poi.xwpf.extractor.XWPFWordExtractor

public class XWPFWordExtractor
extends POIXMLTextExtractor

Helper class for extracting text from an OOXML Word (.docx) file.


Field Summary
static XWPFRelation[] SUPPORTED_TYPES
           
 
Constructor Summary
XWPFWordExtractor(OPCPackage container)
           
XWPFWordExtractor(XWPFDocument document)
           
 
Method Summary
 java.lang.String getText()
          Retrieves all the text from the document.
static void main(java.lang.String[] args)
           
 void setFetchHyperlinks(boolean fetch)
          Sets whether hyperlink targets are fetched along with the text content. By default only the hyperlink label is output, not the link it points to.
 
Methods inherited from class org.apache.poi.POIXMLTextExtractor
getCoreProperties, getCustomProperties, getDocument, getExtendedProperties, getMetadataTextExtractor, getPackage
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SUPPORTED_TYPES

public static final XWPFRelation[] SUPPORTED_TYPES
Constructor Detail

XWPFWordExtractor

public XWPFWordExtractor(OPCPackage container)
                  throws org.apache.xmlbeans.XmlException,
                         OpenXML4JException,
                         java.io.IOException
Throws:
org.apache.xmlbeans.XmlException
OpenXML4JException
java.io.IOException

XWPFWordExtractor

public XWPFWordExtractor(XWPFDocument document)
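Example: the two constructors can be used roughly as sketched below. This is a minimal illustration, not taken from the POI sources. The class name ExtractorConstruction and its helper methods are hypothetical. The package is opened read-only with OPCPackage.open(String, PackageAccess) and released with revert(), which is an assumption about how a caller would manage the container, since this page does not say who owns it.

```java
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Hypothetical helper class, used only for illustration.
public class ExtractorConstruction {

    // Uses XWPFWordExtractor(OPCPackage): opens a .docx file read-only
    // and extracts its text. The declared exceptions of that constructor
    // (XmlException, OpenXML4JException, IOException) are covered by
    // "throws Exception" to keep the sketch short.
    static String textFromPackage(String path) throws Exception {
        OPCPackage pkg = OPCPackage.open(path, PackageAccess.READ);
        try {
            return new XWPFWordExtractor(pkg).getText();
        } finally {
            // Release the package without writing anything back.
            pkg.revert();
        }
    }

    // Uses XWPFWordExtractor(XWPFDocument): wraps a document that is
    // already loaded (or was built in memory). This constructor declares
    // no checked exceptions.
    static String textFromDocument(XWPFDocument document) {
        return new XWPFWordExtractor(document).getText();
    }
}
```

The XWPFDocument constructor is the more convenient choice when the caller already holds a parsed document. The OPCPackage constructor suits code that starts from a file or stream.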
Method Detail

setFetchHyperlinks

public void setFetchHyperlinks(boolean fetch)
Sets whether hyperlink targets are fetched along with the text content. By default only the hyperlink label is output, not the link it points to.
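
Example: a small sketch of toggling this option before extraction. The class HyperlinkExtraction and its extract method are hypothetical names. How the fetched hyperlink is rendered in the output is not specified on this page, so the sketch makes no assumption about the resulting format.

```java
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Hypothetical helper class, used only for illustration.
public class HyperlinkExtraction {

    // Extracts the document text, optionally asking the extractor to
    // fetch hyperlinks as well as their labels.
    static String extract(XWPFDocument document, boolean fetchHyperlinks) {
        XWPFWordExtractor extractor = new XWPFWordExtractor(document);
        // Must be set before getText() is called to affect the result.
        extractor.setFetchHyperlinks(fetchHyperlinks);
        return extractor.getText();
    }
}
```

Calling extract(document, false) corresponds to the documented default, in which only the hyperlink label appears in the text.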


main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

getText

public java.lang.String getText()
Description copied from class: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadocs for the specific project for details.

Specified by:
getText in class POITextExtractor
Returns:
All the text from the document
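
Example: because the separators in the returned text are implementation-specific, callers often post-process the string themselves. The sketch below assumes that this extractor puts each paragraph on its own line, which this page does not state. The class ExtractedLines and the method nonEmptyLines are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Hypothetical helper class, used only for illustration.
public class ExtractedLines {

    // Returns the extracted text split into trimmed, non-empty lines.
    // Assumption: paragraphs are separated by line breaks in getText().
    static List<String> nonEmptyLines(XWPFDocument document) {
        String text = new XWPFWordExtractor(document).getText();
        List<String> lines = new ArrayList<String>();
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }
}
```

If the line-per-paragraph assumption does not hold for a given document, walking the paragraphs of the XWPFDocument directly is the more robust alternative.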


Copyright 2012 The Apache Software Foundation or its licensors, as applicable.