org.apache.lucene.benchmark.byTask.feeds
Class TrecLATimesParser

java.lang.Object
  extended by org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
      extended by org.apache.lucene.benchmark.byTask.feeds.TrecLATimesParser

public class TrecLATimesParser
extends TrecDocParser

Parser for the LA Times docs in the TREC disks 4+5 collection format.


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
TrecDocParser.ParsePathType
 
Field Summary
 
Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
DEFAULT_PATH_TYPE
 
Constructor Summary
TrecLATimesParser()
           
 
Method Summary
 DocData parse(DocData docData, String name, TrecContentSource trecSrc, StringBuilder docBuf, TrecDocParser.ParsePathType pathType)
          Parses the text prepared in docBuf into a result DocData. No synchronization is required.
 
Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.TrecDocParser
extract, pathType, stripTags, stripTags
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TrecLATimesParser

public TrecLATimesParser()
Method Detail

parse

public DocData parse(DocData docData,
                     String name,
                     TrecContentSource trecSrc,
                     StringBuilder docBuf,
                     TrecDocParser.ParsePathType pathType)
              throws IOException,
                     InterruptedException
Description copied from class: TrecDocParser
Parses the text prepared in docBuf into a result DocData. No synchronization is required.

Specified by:
parse in class TrecDocParser
Parameters:
docData - reusable result
name - name to set on the result
trecSrc - calling trec content source
docBuf - text to parse
pathType - type of the parsed file, or null if unknown; parsers may use it to adapt their behavior to the file path type.
Throws:
IOException
InterruptedException
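
Example:
The following is a minimal, self-contained sketch of the contract described above: fill a reusable result object from the text held in a StringBuilder and return it. It does not use the Lucene classes and is not the library's implementation. ParsedDoc, extract, stripTags and the class name are hypothetical stand-ins, and the <DATE>, <HEADLINE> and <TEXT> tag names are an assumption about the LA Times document layout.

```java
public class LaTimesParseSketch {

    /** Hypothetical stand-in for a reusable result object; not Lucene's DocData. */
    static final class ParsedDoc {
        String name;
        String date;
        String title;
        String body;

        /** Resets all fields so the same instance can be reused for the next document. */
        void clear() {
            name = null;
            date = null;
            title = null;
            body = null;
        }
    }

    /** Removes anything that looks like a tag and collapses runs of whitespace. */
    static String stripTags(String s) {
        return s.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    /** Returns the tag-free text between the first open/close pair, or null if the pair is absent. */
    static String extract(StringBuilder buf, String open, String close) {
        int start = buf.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();
        int end = buf.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return stripTags(buf.substring(start, end));
    }

    /** Fills the reusable result from docBuf and returns that same instance. */
    static ParsedDoc parse(ParsedDoc result, String name, StringBuilder docBuf) {
        // Assumed tag names; the real collection may need extra clean-up of these fields.
        String date = extract(docBuf, "<DATE>", "</DATE>");
        String title = extract(docBuf, "<HEADLINE>", "</HEADLINE>");
        String body = extract(docBuf, "<TEXT>", "</TEXT>");

        result.clear();
        result.name = name;
        result.date = date;
        result.title = title;
        result.body = body;
        return result;
    }

    public static void main(String[] args) {
        StringBuilder docBuf = new StringBuilder(
              "<DATE><P>January 1, 1989, Sunday</P></DATE>"
            + "<HEADLINE><P>Sample headline</P></HEADLINE>"
            + "<TEXT><P>Body text.</P></TEXT>");
        ParsedDoc doc = parse(new ParsedDoc(), "LA010189-0001", docBuf);
        System.out.println(doc.name + " | " + doc.date + " | " + doc.title + " | " + doc.body);
    }
}
```

The sketch keeps all state in local variables and in the caller-supplied result object, which is one way a parse method can work without synchronization: each calling thread passes its own result and its own buffer.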