Package org.apache.poi.hpsf


Class Summary
ClassID Represents a class ID (16 bytes).
Constants Defines constants of general use.
CustomProperties Maintains the instances of CustomProperty that belong to a DocumentSummaryInformation.
CustomProperty This class represents custom properties in the document summary information stream.
DocumentSummaryInformation Convenience class representing a DocumentSummary Information stream in a Microsoft Office document.
MutableProperty Adds writing capability to the Property class.
MutablePropertySet Adds writing support to the PropertySet class.
MutableSection Adds writing capability to the Section class.
Property A property in a Section of a PropertySet.
PropertySet Represents a property set in the Horrible Property Set Format (HPSF).
PropertySetFactory Factory class to create instances of SummaryInformation, DocumentSummaryInformation and PropertySet.
Section Represents a section in a PropertySet.
SpecialPropertySet Abstract superclass for the convenience classes SummaryInformation and DocumentSummaryInformation.
SummaryInformation Convenience class representing a Summary Information stream in a Microsoft Office document.
Thumbnail Class to manipulate data in the Clipboard Variant (VT_CF) format.
TypeWriter Class for writing little-endian data and more.
Util Provides various static utility methods.
Variant The Variant types as defined by Microsoft's COM.
VariantSupport Supports reading and writing of variant data.

Exception Summary
HPSFException This exception is the superclass of all other checked exceptions thrown in this package.
HPSFRuntimeException This exception is the superclass of all other unchecked exceptions thrown in this package.
IllegalPropertySetDataException This exception is thrown when there is an illegal value set in a PropertySet.
IllegalVariantTypeException This exception is thrown if HPSF encounters a variant type that is illegal in the current context.
MarkUnsupportedException This exception is thrown if an InputStream does not support the InputStream.mark(int) operation.
MissingSectionException This exception is thrown if one of the PropertySet's convenience methods does not find a required Section.
NoFormatIDException This exception is thrown if a MutablePropertySet is to be written but does not have a formatID set (see MutableSection.setFormatID(ClassID) or MutableSection.setFormatID(byte[])).
NoPropertySetStreamException This exception is thrown if a format error in a property set stream is detected or when the input data do not constitute a property set stream.
NoSingleSectionException This exception is thrown if one of the PropertySet's convenience methods that require a single Section is called and the PropertySet does not contain exactly one Section.
ReadingNotSupportedException This exception is thrown when HPSF tries to read a (yet) unsupported variant type.
UnexpectedPropertySetTypeException This exception is thrown if a certain type of property set is expected (e.g.
UnsupportedVariantTypeException This exception is thrown if HPSF encounters a variant type that isn't supported yet.
VariantTypeException This exception is thrown if HPSF encounters a problem with a variant type.
WritingNotSupportedException This exception is thrown when trying to write a (yet) unsupported variant type.

Package org.apache.poi.hpsf Description

Processes streams in the Horrible Property Set Format (HPSF) in POI filesystems. Microsoft Office documents, i.e. POI filesystems, usually contain metadata such as the author, the title, and the time of the last save. These items are called properties and are stored in property set streams alongside the document itself. These streams are commonly named \005SummaryInformation and \005DocumentSummaryInformation. However, a POI filesystem may contain further property sets with other names or types.

In order to extract the properties from a POI filesystem, a property set stream's contents must be parsed into a PropertySet instance. Its subclasses SummaryInformation and DocumentSummaryInformation deal with the well-known property set streams \005SummaryInformation and \005DocumentSummaryInformation. (However, the streams' names are irrelevant. What counts is the property set's first section's format ID - see below.)

The factory method PropertySetFactory.create returns a PropertySet instance. It always returns the most specific property set: if it identifies the stream data as a Summary Information or a Document Summary Information stream, it returns an instance of the corresponding class; otherwise it returns a general PropertySet.
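The idea of picking the most specific type from the first section's format ID can be illustrated with a small standalone sketch. It does not use or reproduce POI's code: the class FormatIdSketch and its methods classify and header are hypothetical names, and the byte offsets assume the usual property set stream header layout (byte order mark at offset 0, section count at offset 24, first format ID at offset 28).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Illustrative sketch only; hypothetical names, not POI's implementation. */
final class FormatIdSketch {

    // {F29F85E0-4FF9-1068-AB91-08002B27B3D9} in on-disk byte order.
    static final byte[] SUMMARY_INFORMATION_ID = bytes(
        0xE0, 0x85, 0x9F, 0xF2, 0xF9, 0x4F, 0x68, 0x10,
        0xAB, 0x91, 0x08, 0x00, 0x2B, 0x27, 0xB3, 0xD9);

    // {D5CDD502-2E9C-101B-9397-08002B2CF9AE} in on-disk byte order.
    static final byte[] DOCUMENT_SUMMARY_INFORMATION_ID = bytes(
        0x02, 0xD5, 0xCD, 0xD5, 0x9C, 0x2E, 0x1B, 0x10,
        0x93, 0x97, 0x08, 0x00, 0x2B, 0x2C, 0xF9, 0xAE);

    /** Names the most specific type for the stream, based on its first format ID. */
    static String classify(byte[] stream) {
        if (stream.length < 48) {
            throw new IllegalArgumentException("too short for a property set stream");
        }
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int byteOrder = buf.getShort(0) & 0xFFFF;
        int sectionCount = buf.getInt(24);
        if (byteOrder != 0xFFFE || sectionCount < 1) {
            throw new IllegalArgumentException("not a property set stream");
        }
        byte[] formatId = Arrays.copyOfRange(stream, 28, 44);
        if (Arrays.equals(formatId, SUMMARY_INFORMATION_ID)) {
            return "SummaryInformation";
        }
        if (Arrays.equals(formatId, DOCUMENT_SUMMARY_INFORMATION_ID)) {
            return "DocumentSummaryInformation";
        }
        return "PropertySet";
    }

    /** Builds a minimal 48-byte header with one section, for demonstration. */
    static byte[] header(byte[] formatId) {
        byte[] data = new byte[48];
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort(0, (short) 0xFFFE); // byte order mark
        buf.putInt(24, 1);               // number of sections
        System.arraycopy(formatId, 0, data, 28, 16);
        buf.putInt(44, 48);              // offset of the first section
        return data;
    }

    private static byte[] bytes(int... values) {
        byte[] result = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = (byte) values[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify(header(SUMMARY_INFORMATION_ID)));
    }
}
```

Note that the stream's name plays no role in the decision; only the format ID bytes are compared.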

A PropertySet contains a list of Sections which can be retrieved with PropertySet.getSections(). Each Section contains a Property array which can be retrieved with Section.getProperties(). Since the vast majority of PropertySets contain only a single Section, the convenience method PropertySet.getProperties() returns the properties of a PropertySet's Section (throwing a NoSingleSectionException if the PropertySet does not contain exactly one Section).

Each Property has an ID, a type, and a value which can be retrieved with Property.getID(), Property.getType(), and Property.getValue(), respectively. The value's class depends on the property's type. The current implementation does not yet support all property types and restricts the values' classes to String, Integer and Date. A value of a type that is not yet supported is returned as a byte array containing the value's original bytes from the property set stream.
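As an illustration of how a raw value can become one of these Java classes, the following sketch converts a Windows FILETIME (a count of 100-nanosecond intervals since 1601-01-01 UTC, the representation used for time stamps in property sets) to a java.util.Date. The class FiletimeSketch and its method names are hypothetical and are not taken from POI.

```java
import java.util.Date;

/** Illustrative sketch only; hypothetical names, not POI's implementation. */
final class FiletimeSketch {

    // Milliseconds between 1601-01-01T00:00:00Z and 1970-01-01T00:00:00Z.
    static final long EPOCH_DIFF_MILLIS = 11_644_473_600_000L;

    /** Converts a FILETIME value to a Date, truncating to whole milliseconds. */
    static Date filetimeToDate(long filetime) {
        long millisSince1601 = filetime / 10_000L; // 100 ns units to ms
        return new Date(millisSince1601 - EPOCH_DIFF_MILLIS);
    }

    /** Converts a Date back to a FILETIME value. */
    static long dateToFiletime(Date date) {
        return (date.getTime() + EPOCH_DIFF_MILLIS) * 10_000L;
    }

    public static void main(String[] args) {
        // The FILETIME that corresponds to the Unix epoch.
        System.out.println(filetimeToDate(116_444_736_000_000_000L).getTime());
    }
}
```

The sketch ignores values before 1601 and sub-millisecond precision, which a Date cannot represent anyway.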

To retrieve the value of a specific Property, use Section.getProperty(long) or Section.getPropertyIntValue(long).
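To make the lookup by ID more concrete, here is a standalone sketch of what finding a property in a section's raw bytes involves. It assumes the usual section layout (a 4-byte size, a 4-byte property count, then pairs of property ID and offset, with each value preceded by its type). The class SectionLookupSketch, its methods, and the sample data are hypothetical; this is not POI's code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.OptionalInt;

/** Illustrative sketch only; hypothetical names, not POI's implementation. */
final class SectionLookupSketch {

    static final int VT_I4 = 3; // variant type: 32-bit signed integer

    /** Returns the offset of the property with the given ID, or -1 if absent. */
    static int findPropertyOffset(ByteBuffer section, long id) {
        int count = section.getInt(4);
        for (int i = 0; i < count; i++) {
            int entry = 8 + 8 * i; // entries start after size and count
            long entryId = section.getInt(entry) & 0xFFFFFFFFL;
            if (entryId == id) {
                return section.getInt(entry + 4);
            }
        }
        return -1;
    }

    /** Reads a VT_I4 property; empty if the ID is missing or has another type. */
    static OptionalInt readInt(ByteBuffer section, long id) {
        int offset = findPropertyOffset(section, id);
        if (offset < 0) {
            return OptionalInt.empty();
        }
        int type = section.getShort(offset) & 0xFFFF;
        if (type != VT_I4) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(section.getInt(offset + 4));
    }

    /** Builds a made-up section holding two integer properties (IDs 14 and 15). */
    static ByteBuffer sampleSection() {
        ByteBuffer b = ByteBuffer.allocate(40).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(40);              // section size in bytes
        b.putInt(2);               // number of properties
        b.putInt(14).putInt(24);   // property ID 14 at offset 24
        b.putInt(15).putInt(32);   // property ID 15 at offset 32
        b.putInt(VT_I4).putInt(7);
        b.putInt(VT_I4).putInt(1200);
        return b;
    }

    public static void main(String[] args) {
        System.out.println(readInt(sampleSection(), 14));
    }
}
```

The sketch returns an OptionalInt to keep "missing" and "zero" apart; that is a design choice of this example and says nothing about how Section.getPropertyIntValue(long) reports a missing property.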

The SummaryInformation and DocumentSummaryInformation classes provide convenience methods for retrieving well-known properties. For example, an application that wants to retrieve a document's title string just calls SummaryInformation.getTitle() instead of going through the hassle of first finding out what the title's property ID is and then using this ID to get the property's value.

Writing properties can be done with the classes MutablePropertySet, MutableSection, and MutableProperty.

Public documentation from Microsoft can be found in the appropriate section of the MSDN Library.



PropertySetFactory.create(InputStream) no longer throws an UnexpectedPropertySetTypeException.

To Do

The following is still left to be implemented. Sponsoring could speed up work on these issues considerably.

  • Convenience methods for setting summary information and document summary information properties

  • Better codepage support

  • Support for more property (variant) types

Rainer Klute

Copyright 2012 The Apache Software Foundation or its licensors, as applicable.