org.apache.lucene.benchmark.byTask.feeds
Class ContentSource
java.lang.Object
org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
org.apache.lucene.benchmark.byTask.feeds.ContentSource
- All Implemented Interfaces:
- Closeable
- Direct Known Subclasses:
- DirContentSource, EnwikiContentSource, LineDocSource, LongToEnglishContentSource, ReutersContentSource, SingleDocSource, TrecContentSource
public abstract class ContentSource
- extends ContentItemsSource
Represents content from a specified source, such as TREC, Reuters, etc. A ContentSource is responsible for creating DocData objects for its documents, to be consumed by DocMaker. It also keeps track of various statistics, such as how many documents were generated, size in bytes, etc.
For supported configuration parameters see ContentItemsSource.
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
addBytes, addItem, close, collectFiles, getBytesCount, getConfig, getItemsCount, getTotalBytesCount, getTotalItemsCount, printStatistics, resetInputs, setConfig, shouldLog
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ContentSource
public ContentSource()
getNextDocData
public abstract DocData getNextDocData(DocData docData)
throws NoMoreDataException,
IOException
- Returns the next DocData from the content source. Implementations must account for multi-threading, as multiple threads can call this method simultaneously.
- Throws:
NoMoreDataException
IOException
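The multi-threading requirement on getNextDocData can be illustrated with a minimal sketch. To keep the example runnable without the Lucene benchmark jar, DocData, NoMoreDataException and ContentSource below are simplified stand-ins declared locally, not the real classes. InMemoryContentSource and ContentSourceDemo are hypothetical names introduced for illustration. The sketch also assumes that the DocData argument is a per-thread reusable instance that the source fills in and returns. Thread safety here comes from claiming each document index with an AtomicInteger; a synchronized block would be an equally valid choice.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for Lucene's DocData (assumption: only name and body are modeled). */
class DocData {
    private String name;
    private String body;
    void clear() { name = null; body = null; }
    void setName(String name) { this.name = name; }
    void setBody(String body) { this.body = body; }
    String getName() { return name; }
    String getBody() { return body; }
}

/** Stand-in for Lucene's NoMoreDataException. */
class NoMoreDataException extends Exception {}

/** Stand-in that mirrors only the abstract method documented above. */
abstract class ContentSource {
    public abstract DocData getNextDocData(DocData docData)
            throws NoMoreDataException, IOException;
}

/** Hypothetical source that serves documents from an in-memory list. */
class InMemoryContentSource extends ContentSource {
    private final List<String> bodies;
    // Shared cursor: getAndIncrement hands each caller a distinct index.
    private final AtomicInteger next = new AtomicInteger();

    InMemoryContentSource(List<String> bodies) {
        this.bodies = new ArrayList<>(bodies); // private copy, never mutated afterwards
    }

    @Override
    public DocData getNextDocData(DocData docData) throws NoMoreDataException {
        int id = next.getAndIncrement();
        if (id >= bodies.size()) {
            throw new NoMoreDataException(); // source is exhausted
        }
        docData.clear();
        docData.setName("doc" + id);
        docData.setBody(bodies.get(id));
        return docData;
    }
}

class ContentSourceDemo {
    /** Drains a source of docCount documents using threadCount threads and returns how many were served. */
    static int drain(int docCount, int threadCount) {
        List<String> bodies = new ArrayList<>();
        for (int i = 0; i < docCount; i++) {
            bodies.add("body " + i);
        }
        ContentSource source = new InMemoryContentSource(bodies);
        AtomicInteger served = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                DocData reusable = new DocData(); // one instance per thread, never shared
                try {
                    while (true) {
                        source.getNextDocData(reusable);
                        served.incrementAndGet();
                    }
                } catch (NoMoreDataException e) {
                    // Normal termination: no documents left.
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            threads.add(worker);
            worker.start();
        }
        for (Thread worker : threads) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return served.get();
    }

    public static void main(String[] args) {
        System.out.println(drain(100, 4));
    }
}
```

Because every index is handed out exactly once, the number of documents served equals the number of documents in the source regardless of how many threads call the method. A real subclass would extend the Lucene ContentSource instead and would typically also update the inherited statistics (for example through addBytes or addItem), which this sketch leaves out.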