org.apache.poi.hwpf.extractor
Class Word6Extractor

java.lang.Object
  extended by org.apache.poi.POITextExtractor
      extended by org.apache.poi.POIOLE2TextExtractor
          extended by org.apache.poi.hwpf.extractor.Word6Extractor

public final class Word6Extractor
extends POIOLE2TextExtractor

Class to extract the text from old (Word 6 / Word 95) Word documents. Use this class only for those older files; for most uses, call WordExtractor, which deals properly with HWPF.
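The guidance above can be sketched as a fallback: try WordExtractor first and switch to Word6Extractor only when the file turns out to be in the old format. This is a minimal sketch, not the library's prescribed pattern. It assumes that POI is on the classpath and that, in the POI version in use, WordExtractor signals a Word 6 / Word 95 file by throwing org.apache.poi.hwpf.OldWordFileFormatException. The class name OldWordTextDemo and the method extractText are hypothetical names chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.OldWordFileFormatException;
import org.apache.poi.hwpf.extractor.Word6Extractor;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical demo class; not part of POI.
class OldWordTextDemo {

    // Reads the OLE2 container once, then picks the extractor that fits.
    public static String extractText(InputStream in) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(in);
        try {
            // Normal path: Word 97 and later documents.
            return new WordExtractor(fs).getText();
        } catch (OldWordFileFormatException e) {
            // Assumption: this exception marks a Word 6 / Word 95 file,
            // so retry with the extractor documented on this page.
            return new Word6Extractor(fs).getText();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a .doc file supplied by the caller.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(extractText(in));
        }
    }
}
```

Reusing the same POIFSFileSystem for both attempts avoids reading the input stream twice, which matters because an InputStream generally cannot be rewound.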

Author:
Nick Burch

Field Summary
 
Fields inherited from class org.apache.poi.POITextExtractor
document
 
Constructor Summary
Word6Extractor(DirectoryNode dir)
           
Word6Extractor(DirectoryNode dir, POIFSFileSystem fs)
          Deprecated. Use Word6Extractor(DirectoryNode) instead
Word6Extractor(HWPFOldDocument doc)
          Create a new Word Extractor
Word6Extractor(java.io.InputStream is)
          Create a new Word Extractor
Word6Extractor(POIFSFileSystem fs)
          Create a new Word Extractor
 
Method Summary
 java.lang.String[] getParagraphText()
          Deprecated. 
 java.lang.String getText()
          Retrieves all the text from the document.
 
Methods inherited from class org.apache.poi.POIOLE2TextExtractor
getDocSummaryInformation, getFileSystem, getMetadataTextExtractor, getRoot, getSummaryInformation
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Word6Extractor

public Word6Extractor(java.io.InputStream is)
               throws java.io.IOException
Create a new Word Extractor

Parameters:
is - InputStream containing the Word file
Throws:
java.io.IOException

Word6Extractor

public Word6Extractor(POIFSFileSystem fs)
               throws java.io.IOException
Create a new Word Extractor

Parameters:
fs - POIFSFileSystem containing the Word file
Throws:
java.io.IOException

Word6Extractor

@Deprecated
public Word6Extractor(DirectoryNode dir,
                                 POIFSFileSystem fs)
               throws java.io.IOException
Deprecated. Use Word6Extractor(DirectoryNode) instead

Throws:
java.io.IOException

Word6Extractor

public Word6Extractor(DirectoryNode dir)
               throws java.io.IOException
Throws:
java.io.IOException

Word6Extractor

public Word6Extractor(HWPFOldDocument doc)
Create a new Word Extractor

Parameters:
doc - The HWPFOldDocument to extract from

Method Detail

getParagraphText

@Deprecated
public java.lang.String[] getParagraphText()
Deprecated. 

Gets the text from the Word file as an array, with one String per paragraph.


getText

public java.lang.String getText()
Description copied from class: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadocs for a specific project for details.

Specified by:
getText in class POITextExtractor
Returns:
All the text from the document
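Since getParagraphText() is deprecated, callers that still want one String per paragraph may need to split the result of getText() themselves. The following is a minimal sketch under a stated assumption: that paragraphs in the returned text are separated by line breaks, which this page does not guarantee because the separator is implementation-specific. The class ParagraphSplitter and its method splitParagraphs are hypothetical helpers, not part of POI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of POI.
class ParagraphSplitter {

    // Splits text on CRLF, CR, or LF and drops blank entries.
    // Assumption: the extractor separates paragraphs with line breaks.
    public static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<String>();
        if (text == null) {
            return paragraphs;
        }
        for (String part : text.split("\\r\\n|\\r|\\n")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        // Stand-in for the String returned by getText().
        String sample = "First paragraph\r\nSecond paragraph\n\nThird paragraph\r";
        for (String paragraph : splitParagraphs(sample)) {
            System.out.println(paragraph);
        }
    }
}
```

If the text from a particular document uses a different separator, adjust the regular expression passed to split accordingly.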


Copyright 2012 The Apache Software Foundation or its licensors, as applicable.