org.apache.poi.hwpf
Class HWPFDocumentCore

java.lang.Object
  extended by org.apache.poi.POIDocument
      extended by org.apache.poi.hwpf.HWPFDocumentCore
Direct Known Subclasses:
HWPFDocument, HWPFOldDocument

public abstract class HWPFDocumentCore
extends POIDocument

This class holds much of the core of a Word document, but without some of the table structure information. You generally want to work with one of its subclasses, HWPFDocument or HWPFOldDocument.


Field Summary
protected  CHPBinTable _cbt
          Contains formatting properties for text
protected  FileInformationBlock _fib
          The File Information Block (FIB)
protected  FontTable _ft
          Holds fonts for this document.
protected  ListTables _lt
          Holds list tables
protected  byte[] _mainStream
          main document stream buffer
protected  ObjectPoolImpl _objectPool
          Holds OLE2 objects
protected  PAPBinTable _pbt
          Contains formatting properties for paragraphs
protected  StyleSheet _ss
          Holds styles for this document.
protected  SectionTable _st
          Contains formatting properties for sections.
protected static java.lang.String STREAM_OBJECT_POOL
           
protected static java.lang.String STREAM_WORD_DOCUMENT
           
 
Fields inherited from class org.apache.poi.POIDocument
directory
 
Constructor Summary
protected HWPFDocumentCore()
           
  HWPFDocumentCore(DirectoryNode directory)
          This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default.
  HWPFDocumentCore(java.io.InputStream istream)
          This constructor loads a Word document from an InputStream.
  HWPFDocumentCore(POIFSFileSystem pfilesystem)
          This constructor loads a Word document from a POIFSFileSystem
 
Method Summary
 CHPBinTable getCharacterTable()
           
 java.lang.String getDocumentText()
          Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
 FileInformationBlock getFileInformationBlock()
           
 FontTable getFontTable()
           
 ListTables getListTables()
           
 ObjectsPool getObjectsPool()
           
abstract  Range getOverallRange()
          Returns the range that covers all text in the file, including main text, footnotes, headers and comments
 PAPBinTable getParagraphTable()
           
abstract  Range getRange()
          Returns the range which covers the whole of the document, but excludes any headers and footers.
 SectionTable getSectionTable()
           
 StyleSheet getStyleSheet()
           
abstract  java.lang.StringBuilder getText()
          Internal method to access document text
abstract  TextPieceTable getTextTable()
           
static POIFSFileSystem verifyAndBuildPOIFS(java.io.InputStream istream)
          Takes an InputStream, verifies that it's not RTF, builds a POIFSFileSystem from it, and returns that.
 
Methods inherited from class org.apache.poi.POIDocument
copyNodeRecursively, copyNodes, copyNodes, createInformationProperties, getDocumentSummaryInformation, getPropertySet, getSummaryInformation, readProperties, write, writeProperties, writeProperties, writePropertySet
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STREAM_OBJECT_POOL

protected static final java.lang.String STREAM_OBJECT_POOL
See Also:
Constant Field Values

STREAM_WORD_DOCUMENT

protected static final java.lang.String STREAM_WORD_DOCUMENT
See Also:
Constant Field Values

_objectPool

protected ObjectPoolImpl _objectPool
Holds OLE2 objects


_fib

protected FileInformationBlock _fib
The File Information Block (FIB)


_ss

protected StyleSheet _ss
Holds styles for this document.


_cbt

protected CHPBinTable _cbt
Contains formatting properties for text


_pbt

protected PAPBinTable _pbt
Contains formatting properties for paragraphs


_st

protected SectionTable _st
Contains formatting properties for sections.


_ft

protected FontTable _ft
Holds fonts for this document.


_lt

protected ListTables _lt
Holds list tables


_mainStream

protected byte[] _mainStream
main document stream buffer

Constructor Detail

HWPFDocumentCore

protected HWPFDocumentCore()

HWPFDocumentCore

public HWPFDocumentCore(java.io.InputStream istream)
                 throws java.io.IOException
This constructor loads a Word document from an InputStream.

Parameters:
istream - The InputStream that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in InputStream.

HWPFDocumentCore

public HWPFDocumentCore(POIFSFileSystem pfilesystem)
                 throws java.io.IOException
This constructor loads a Word document from a POIFSFileSystem

Parameters:
pfilesystem - The POIFSFileSystem that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.

HWPFDocumentCore

public HWPFDocumentCore(DirectoryNode directory)
                 throws java.io.IOException
This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default. Typically used to open embedded documents.

Parameters:
directory - The DirectoryNode that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
Method Detail

verifyAndBuildPOIFS

public static POIFSFileSystem verifyAndBuildPOIFS(java.io.InputStream istream)
                                           throws java.io.IOException
Takes an InputStream, verifies that it's not RTF, builds a POIFSFileSystem from it, and returns that.

Throws:
java.io.IOException
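
The kind of check described above can be sketched without POI. This is a minimal illustration, not the library's actual implementation: the class WordStreamSniffer and its methods are hypothetical names. It assumes only two well-known signatures: RTF files begin with the ASCII text "{\rtf", and OLE2 compound files begin with the bytes D0 CF 11 E0 A1 B1 1A E1. The stream is wrapped in a PushbackInputStream so that the peeked bytes can be handed back before real parsing starts.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: peeks at the start of a stream
// to tell RTF apart from an OLE2 container before any parsing happens.
class WordStreamSniffer {
    // RTF files begin with the ASCII text "{\rtf".
    private static final byte[] RTF_MAGIC =
            "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    // OLE2 compound files begin with this 8-byte signature.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True if the first 'length' valid bytes of 'data' start with 'magic'.
    private static boolean startsWith(byte[] data, int length, byte[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    static boolean looksLikeRtf(byte[] header) {
        return startsWith(header, header.length, RTF_MAGIC);
    }

    static boolean looksLikeOle2(byte[] header) {
        return startsWith(header, header.length, OLE2_MAGIC);
    }

    // Rejects RTF input; otherwise returns a stream that still starts
    // at the first byte of the original data.
    static InputStream rejectRtf(InputStream in) throws IOException {
        PushbackInputStream pis = new PushbackInputStream(in, OLE2_MAGIC.length);
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (total < header.length) {
            int n = pis.read(header, total, header.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        if (total > 0) {
            pis.unread(header, 0, total); // hand the peeked bytes back
        }
        if (startsWith(header, total, RTF_MAGIC)) {
            throw new IllegalArgumentException("The stream contains RTF, not a Word binary document");
        }
        return pis;
    }
}
```

A caller would pass the stream returned by rejectRtf on to whatever builds the file system, so no bytes are lost to the check.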

getRange

public abstract Range getRange()
Returns the range which covers the whole of the document, but excludes any headers and footers.


getOverallRange

public abstract Range getOverallRange()
Returns the range that covers all text in the file, including main text, footnotes, headers and comments


getDocumentText

public java.lang.String getDocumentText()
Returns document text, i.e. text information from all text pieces, including OLE descriptions and field codes
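
Because the returned text includes field codes, callers that want only the visible text usually need to filter them out. The sketch below is one way to do that on a plain String, not part of this class: FieldCodeStripper is a hypothetical name, and the code assumes the Word binary convention that a field is delimited by the control characters 0x13 (field begin), 0x14 (separator between field code and field result) and 0x15 (field end). It drops the field code and keeps the field result, tracking nesting with a stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of POI: removes field codes from raw
// document text while keeping the visible field results.
class FieldCodeStripper {
    static final char FIELD_BEGIN = '\u0013';
    static final char FIELD_SEPARATOR = '\u0014';
    static final char FIELD_END = '\u0015';

    static String strip(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: true while inside its code part,
        // false once its separator has been seen (result part).
        Deque<Boolean> openFields = new ArrayDeque<>();
        int codeDepth = 0; // number of open fields still in their code part

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == FIELD_BEGIN) {
                openFields.push(Boolean.TRUE);
                codeDepth++;
            } else if (c == FIELD_SEPARATOR) {
                if (!openFields.isEmpty() && openFields.peek()) {
                    openFields.pop();
                    openFields.push(Boolean.FALSE);
                    codeDepth--;
                }
            } else if (c == FIELD_END) {
                if (!openFields.isEmpty() && openFields.pop()) {
                    codeDepth--; // field ended without a separator
                }
            } else if (codeDepth == 0) {
                // Visible text: outside all fields or inside results only.
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Text nested inside another field's code part is dropped along with that code, which is why the depth counter is used rather than a single flag.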


getText

@Internal
public abstract java.lang.StringBuilder getText()
Internal method to access document text


getCharacterTable

public CHPBinTable getCharacterTable()

getParagraphTable

public PAPBinTable getParagraphTable()

getSectionTable

public SectionTable getSectionTable()

getStyleSheet

public StyleSheet getStyleSheet()

getListTables

public ListTables getListTables()

getFontTable

public FontTable getFontTable()

getFileInformationBlock

public FileInformationBlock getFileInformationBlock()

getObjectsPool

public ObjectsPool getObjectsPool()

getTextTable

public abstract TextPieceTable getTextTable()


Copyright 2012 The Apache Software Foundation or its licensors, as applicable.