API Documentation

Search/Lucene/Index/SegmentInfo.php

category
Zend
copyright
Copyright (c) 2005-2010 Zend Technologies USA Inc. (http://www.zend.com)
license
http://framework.zend.com/license/new-bsd New BSD License
package
Zend_Search_Lucene
subpackage
Index
version
$Id: SegmentInfo.php 22988 2010-09-21 10:53:41Z alexander $
Classes
Zend_Search_Lucene_Index_SegmentInfo

Description

Zend Framework

LICENSE

This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.

Zend_Search_Lucene_Index_SegmentInfo

Implements
Zend_Search_Lucene_Index_TermsStream_Interface
Constants
FULL_SCAN_VS_FETCH_BOUNDARY
SM_TERMS_ONLY
SM_FULL_INFO
SM_MERGE_INFO
Properties
$_docCount
$_name
$_termDictionary
$_termDictionaryInfos
$_fields
$_fieldsDicPositions
$_segFiles
$_segFileSizes
$_delGen
$_hasSingleNormFile
$_isCompound
$_directory
$_norms
$_deleted
$_deletedDirty
$_usesSharedDocStore
$_sharedDocStoreOptions
$_termInfoCache
$_tisFile
$_tisFileOffset
$_frqFile
$_frqFileOffset
$_prxFile
$_prxFileOffset
$_termCount
$_termNum
$_indexInterval
$_skipInterval
$_lastTermInfo
$_lastTerm
$_docMap
$_lastTermPositions
$_termsScanMode
Methods
__construct
_loadDelFile
_loadPre21DelFile
_load21DelFile
openCompoundFile
compoundFileLength
getFieldNum
getField
getFields
getFieldInfos
getDelGen
count
_deletedCount
numDocs
_getFieldPosition
getName
_cleanUpTermInfoCache
_loadDictionaryIndex
getTermInfo
termDocs
termFreqs
termPositions
_loadNorm
norm
normVector
hasDeletions
hasSingleNormFile
isCompound
delete
isDeleted
_detectLatestDelGen
writeChanges
resetTermsStream
skipTo
nextTerm
closeTermsStream
currentTerm
currentTermPositions

Description

Constants

FULL_SCAN_VS_FETCH_BOUNDARY

 FULL_SCAN_VS_FETCH_BOUNDARY = '5'

"Full scan vs fetch" boundary.

If filter selectivity is less than this value, a full scan is performed, since fetching individual term entries has additional overhead.
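The rule can be sketched as follows. The selectivity measure used here (segment document count divided by the number of documents the filter lets through) and the function name are assumptions made for illustration; the class source may define selectivity differently.

```php
<?php
// Hypothetical helper: choose between scanning all postings of a term and
// fetching only the entries allowed by a filter.
const FULL_SCAN_VS_FETCH_BOUNDARY = 5;

function chooseReadStrategy(int $segmentDocCount, int $filteredDocCount): string
{
    if ($filteredDocCount === 0) {
        return 'empty'; // the filter lets nothing through
    }

    // Assumed selectivity measure: segment documents per filtered document.
    $selectivity = $segmentDocCount / $filteredDocCount;

    // A low value means the filter keeps most documents, so reading everything
    // is cheaper than paying the per-entry fetch overhead.
    return ($selectivity < FULL_SCAN_VS_FETCH_BOUNDARY) ? 'full-scan' : 'fetch';
}
```

With these assumptions, a filter that keeps half of a 1000-document segment leads to a full scan, while a filter that keeps 10 documents leads to fetching.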

Details

value
5

SM_TERMS_ONLY

 SM_TERMS_ONLY = '0'

Scan modes

Details

value
0

SM_FULL_INFO

 SM_FULL_INFO = '1'

Details

value
1

SM_MERGE_INFO

 SM_MERGE_INFO = '2'

Details

value
2

Properties

$_delGen

integer $_delGen = ''

Delete file generation number

-2 means autodetect the latest delete generation
-1 means there is no delete file
0 means a pre-2.1 format delete file
X specifies the delete file in use
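A minimal sketch of how these values could map to a delete file name. The function name is hypothetical, and the "<segment>_<generation in base 36>.del" naming follows the Lucene 2.1+ file format convention, which is assumed here rather than taken from this page.

```php
<?php
// Hypothetical helper: translate a delete generation into a file name.
function deleteFileName(string $segmentName, int $delGen): ?string
{
    if ($delGen < -1) {
        // -2 (autodetect) must be resolved to a real generation first.
        throw new InvalidArgumentException('Delete generation has not been detected yet');
    }
    if ($delGen === -1) {
        return null; // there is no delete file
    }
    if ($delGen === 0) {
        return $segmentName . '.del'; // pre-2.1 format
    }

    // Assumed 2.1+ naming: the generation is appended in base 36.
    return $segmentName . '_' . base_convert((string) $delGen, 10, 36) . '.del';
}
```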

Details

$_delGen
integer
visibility
private
default
final
false
static
false

$_deleted

mixed $_deleted = 'null'

List of deleted documents.

A bitset if the bitset extension is loaded, an array otherwise.
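The array fallback can be pictured as below. The layout (document id as key, a dummy value) and the helper names are assumptions for illustration only.

```php
<?php
// Array fallback for the deleted-documents list: keys are internal document
// ids, so a membership test is a single isset() call. (Assumed layout.)
$deleted = array();

function markDeleted(array &$deleted, int $id): void
{
    $deleted[$id] = 1;
}

function isDocDeleted(array $deleted, int $id): bool
{
    return isset($deleted[$id]);
}

markDeleted($deleted, 4);
markDeleted($deleted, 7);
markDeleted($deleted, 4); // deleting the same document twice is harmless
```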

Details

$_deleted
mixed
visibility
private
default
null
final
false
static
false

$_deletedDirty

boolean $_deletedDirty = 'false'

$this->_deleted update flag

Details

$_deletedDirty
boolean
visibility
private
default
false
final
false
static
false

$_directory

Zend_Search_Lucene_Storage_Directory_Filesystem $_directory = ''

File system adapter.

Details

$_directory
Zend_Search_Lucene_Storage_Directory_Filesystem
visibility
private
default
final
false
static
false

$_docCount

integer $_docCount = ''

Number of docs in a segment

Details

$_docCount
integer
visibility
private
default
final
false
static
false

$_docMap

array|null $_docMap = 'null'

Map of document IDs. Used to get the new docID after removing deleted documents.

It is not very memory-efficient, but it is much faster than other methods.
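A sketch of how such a map could be built, assuming the deleted list is an array keyed by document id. The function name and the `$startId` parameter are hypothetical.

```php
<?php
// Build an old-id => new-id map for the surviving documents of a segment.
function buildDocMap(int $docCount, array $deleted, int $startId = 0): array
{
    $docMap = array();
    $newId  = $startId;

    for ($oldId = 0; $oldId < $docCount; $oldId++) {
        if (isset($deleted[$oldId])) {
            continue; // deleted documents get no new id
        }
        $docMap[$oldId] = $newId++;
    }

    return $docMap;
}

// Five documents, ids 1 and 3 deleted.
$map = buildDocMap(5, array(1 => 1, 3 => 1));
```

The map costs one array entry per live document, which is the memory trade-off the description mentions; in exchange, each lookup is a single array access.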

Details

$_docMap
array|null
visibility
private
default
null
final
false
static
false

$_fields

array $_fields = ''

Segment fields. Array of Zend_Search_Lucene_Index_FieldInfo objects for this segment

Details

$_fields
array
visibility
private
default
final
false
static
false

$_fieldsDicPositions

array $_fieldsDicPositions = ''

Field positions in a dictionary.

(The term dictionary contains fields ordered by name.)

Details

$_fieldsDicPositions
array
visibility
private
default
final
false
static
false

$_frqFile

Zend_Search_Lucene_Storage_File $_frqFile = 'null'

Frequencies file object for stream-like reading of terms

Details

$_frqFile
Zend_Search_Lucene_Storage_File
visibility
private
default
null
final
false
static
false

$_frqFileOffset

integer $_frqFileOffset = ''

Actual offset of the .frq file data

Details

$_frqFileOffset
integer
visibility
private
default
final
false
static
false

$_hasSingleNormFile

boolean $_hasSingleNormFile = ''

Segment has single norms file

If true, one .nrm file is used for all fields. Otherwise, .fN files are used.
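As an illustration, the extension of the file holding a field's norms could be chosen like this. The function name is hypothetical, and N is assumed to be the field number.

```php
<?php
// Pick the extension of the file that stores norms for one field.
function normFileExtension(bool $hasSingleNormFile, int $fieldNum): string
{
    // Single file for all fields, or one ".fN" file per field (N assumed to
    // be the field number).
    return $hasSingleNormFile ? '.nrm' : ('.f' . $fieldNum);
}
```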

Details

$_hasSingleNormFile
boolean
visibility
private
default
final
false
static
false

$_indexInterval

integer $_indexInterval = ''

Segment index interval

Details

$_indexInterval
integer
visibility
private
default
final
false
static
false

$_isCompound

boolean $_isCompound = ''

Use compound segment file (*.cfs) to collect all other segment files (excluding .del files)

Details

$_isCompound
boolean
visibility
private
default
final
false
static
false

$_lastTerm

Zend_Search_Lucene_Index_Term $_lastTerm = 'null'

Last Term in a terms stream

Details

$_lastTerm
Zend_Search_Lucene_Index_Term
visibility
private
default
null
final
false
static
false

$_lastTermInfo

Zend_Search_Lucene_Index_TermInfo $_lastTermInfo = 'null'

Last TermInfo in a terms stream

Details

$_lastTermInfo
Zend_Search_Lucene_Index_TermInfo
visibility
private
default
null
final
false
static
false

$_lastTermPositions

array|null $_lastTermPositions = ''

An array of all term positions in the documents.

Array structure: array( docId => array( pos1, pos2, ...), ...)

Is set to null if term positions loading has to be skipped

Details

$_lastTermPositions
array|null
visibility
private
default
final
false
static
false

$_name

string $_name = ''

Segment name

Details

$_name
string
visibility
private
default
final
false
static
false

$_norms

array $_norms = 'array'

Normalization factors.

An array fieldName => normVector.

normVector is a binary string. Each byte corresponds to an indexed document in the segment and encodes a normalization factor (a float value, encoded by Zend_Search_Lucene_Search_Similarity::encodeNorm()).
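Reading one encoded byte out of such a vector might look like the sketch below. The helper name and the sample bytes are made up; turning the byte back into a float is left to the similarity class and is not shown.

```php
<?php
// A norm vector holds one byte per document; the byte at offset $docId is the
// encoded normalization factor for that document.
function encodedNorm(string $normVector, int $docId): int
{
    if ($docId < 0 || $docId >= strlen($normVector)) {
        throw new OutOfRangeException("No norm stored for document $docId");
    }

    return ord($normVector[$docId]);
}

// fieldName => normVector, with made-up bytes for three documents.
$norms = array('title' => "\x7C\x78\x74");
```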

Details

$_norms
array
visibility
private
default
array
final
false
static
false

$_prxFile

Zend_Search_Lucene_Storage_File $_prxFile = 'null'

Positions file object for stream-like reading of terms

Details

$_prxFile
Zend_Search_Lucene_Storage_File
visibility
private
default
null
final
false
static
false

$_prxFileOffset

integer $_prxFileOffset = ''

Actual offset of the .prx file in the compound file

Details

$_prxFileOffset
integer
visibility
private
default
final
false
static
false

$_segFileSizes

array $_segFileSizes = ''

Associative array where the key is the file name and the value is the file size within the compound segment file (.cfs).

Details

$_segFileSizes
array
visibility
private
default
final
false
static
false

$_segFiles

array $_segFiles = ''

Associative array where the key is the file name and the value is the data offset in the compound segment file (.cfs).
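Together with $_segFileSizes, this array is enough to locate a sub-file inside the compound file. A minimal sketch, with a hypothetical helper and made-up offsets and sizes:

```php
<?php
// Made-up directory of a compound file: file name => data offset / size.
$segFiles     = array('_1.fnm' => 64, '_1.tis' => 120, '_1.frq' => 980);
$segFileSizes = array('_1.fnm' => 56, '_1.tis' => 860, '_1.frq' => 412);

// Return array(offset, length) of a sub-file stored inside the compound file.
function locateSubFile(array $segFiles, array $segFileSizes, string $segmentName, string $extension): array
{
    $fileName = $segmentName . $extension;

    if (!isset($segFiles[$fileName])) {
        throw new RuntimeException("Compound file does not contain $fileName");
    }

    return array($segFiles[$fileName], $segFileSizes[$fileName]);
}
```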

Details

$_segFiles
array
visibility
private
default
final
false
static
false

$_sharedDocStoreOptions

 $_sharedDocStoreOptions = ''

Details

visibility
private
default
final
false
static
false

$_skipInterval

integer $_skipInterval = ''

Segment skip interval

Details

$_skipInterval
integer
visibility
private
default
final
false
static
false

$_termCount

integer $_termCount = '0'

Actual number of terms in term stream

Details

$_termCount
integer
visibility
private
default
0
final
false
static
false

$_termDictionary

array $_termDictionary = ''

Term Dictionary Index

Array of arrays (Zend_Search_Lucene_Index_Term objects are represented as arrays for performance reasons):
[0] -> $termValue
[1] -> $termFieldNum

The corresponding Zend_Search_Lucene_Index_TermInfo data is stored in $_termDictionaryInfos.

Details

$_termDictionary
array
visibility
private
default
final
false
static
false

$_termDictionaryInfos

array $_termDictionaryInfos = ''

Term Dictionary Index TermInfos

Array of arrays (Zend_Search_Lucene_Index_TermInfo objects are represented as arrays for performance reasons):
[0] -> $docFreq
[1] -> $freqPointer
[2] -> $proxPointer
[3] -> $skipOffset
[4] -> $indexPointer
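The two parallel arrays can be searched together. The sketch below uses made-up data and a hypothetical function, and assumes entries are ordered by field number first and term text second; the actual ordering used by the class may differ.

```php
<?php
// Parallel arrays in the documented layout (values are made up).
$termDictionary = array(           // [0] term value, [1] field number
    array('apple', 0), array('mango', 0), array('zebra', 0), array('lucene', 1),
);
$termDictionaryInfos = array(      // docFreq, freqPointer, proxPointer, skipOffset, indexPointer
    array(3, 0, 0, 0, 20), array(1, 14, 30, 0, 2100),
    array(2, 19, 41, 0, 4300), array(5, 27, 60, 0, 6500),
);

// Binary search for the last index entry that is <= the wanted term.
// Returns -1 if the wanted term sorts before every entry.
function findIndexEntry(array $dictionary, string $text, int $fieldNum): int
{
    $low   = 0;
    $high  = count($dictionary) - 1;
    $found = -1;

    while ($low <= $high) {
        $mid = ($low + $high) >> 1;
        list($midText, $midField) = $dictionary[$mid];

        // Compare by field number, then by term text (assumed ordering).
        $cmp = ($midField <=> $fieldNum) ?: strcmp($midText, $text);

        if ($cmp <= 0) {
            $found = $mid;
            $low   = $mid + 1;
        } else {
            $high = $mid - 1;
        }
    }

    return $found;
}
```

The returned position indexes both arrays, so `$termDictionaryInfos[$pos][4]` would give the index pointer from which a sequential scan could start.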

Details

$_termDictionaryInfos
array
visibility
private
default
final
false
static
false

$_termInfoCache

array $_termInfoCache = 'array'

TermInfo cache

Size is 1024. Numbers are used instead of class constants because of performance considerations
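A bounded cache of this kind could be sketched as follows. The class name is hypothetical, and dropping the oldest quarter of the entries when the limit is reached is an assumed eviction policy, not necessarily what _cleanUpTermInfoCache() does.

```php
<?php
// Bounded cache sketch. The default limit matches the documented size.
class BoundedTermInfoCache
{
    private $entries = array();
    private $limit;

    public function __construct(int $limit = 1024)
    {
        $this->limit = $limit;
    }

    public function get(string $termKey)
    {
        return $this->entries[$termKey] ?? null;
    }

    public function put(string $termKey, array $termInfo): void
    {
        if (count($this->entries) >= $this->limit) {
            // PHP arrays keep insertion order, so the first keys are the
            // oldest ones; drop a quarter of them (assumed policy).
            $this->entries = array_slice($this->entries, intdiv($this->limit, 4), null, true);
        }
        $this->entries[$termKey] = $termInfo;
    }

    public function size(): int
    {
        return count($this->entries);
    }
}
```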

Details

$_termInfoCache
array
visibility
private
default
array
final
false
static
false

$_termNum

integer $_termNum = '0'

Overall number of terms in term stream

Details

$_termNum
integer
visibility
private
default
0
final
false
static
false

$_termsScanMode

integer $_termsScanMode = ''

Terms scan mode

Values:

self::SM_TERMS_ONLY - terms are scanned, no additional info is retrieved
self::SM_FULL_INFO - terms are scanned, frequency and position info is retrieved
self::SM_MERGE_INFO - terms are scanned, frequency and position info is retrieved, document numbers are compacted (shifted if the segment has deleted documents)

Details

$_termsScanMode
integer
visibility
private
default
final
false
static
false

$_tisFile

Zend_Search_Lucene_Storage_File $_tisFile = 'null'

Term dictionary file object for stream-like reading of terms

Details

$_tisFile
Zend_Search_Lucene_Storage_File
visibility
private
default
null
final
false
static
false

$_tisFileOffset

integer $_tisFileOffset = ''

Actual offset of the .tis file data

Details

$_tisFileOffset
integer
visibility
private
default
final
false
static
false

$_usesSharedDocStore

boolean $_usesSharedDocStore = ''

True if segment uses shared doc store

Details

$_usesSharedDocStore
boolean
visibility
private
default
final
false
static
false

Methods

__construct

__construct( Zend_Search_Lucene_Storage_Directory $directory, string $name, integer $docCount, integer $delGen = 0, array|null $docStoreOptions = null, boolean $hasSingleNormFile = false, boolean $isCompound = null ) :

Zend_Search_Lucene_Index_SegmentInfo constructor

Arguments
$directory
Zend_Search_Lucene_Storage_Directory
$name
string
$docCount
integer
$delGen
integer
$docStoreOptions
array|null
$hasSingleNormFile
boolean
$isCompound
boolean
Details
visibility
public
final
false
static
false

_cleanUpTermInfoCache

_cleanUpTermInfoCache( ) :
Details
visibility
private
final
false
static
false

_deletedCount

_deletedCount( ) : integer

Returns number of deleted documents.

Output
integer
Details
visibility
private
final
false
static
false

_detectLatestDelGen

_detectLatestDelGen( ) : integer

Detect latest delete generation

This is actually used from the writeChanges() method, or from the constructor if it is invoked from the index writer. In both cases the index write lock has already been obtained, so there is no need to acquire it here.
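One way such a detection could work is to scan a directory listing for the newest delete file of the segment. The function below is a hypothetical sketch that takes the file list as an argument; the "<segment>_<generation in base 36>.del" naming is assumed from the Lucene 2.1+ file format.

```php
<?php
// Returns -1 if no delete file exists, 0 for a pre-2.1 "<name>.del" file, or
// the highest generation found among "<name>_<gen>.del" files.
function detectLatestDelGen(array $fileNames, string $segmentName): int
{
    $latest  = -1;
    $pattern = '/^' . preg_quote($segmentName, '/') . '(?:_([a-z0-9]+))?\.del$/i';

    foreach ($fileNames as $fileName) {
        if (!preg_match($pattern, $fileName, $m)) {
            continue; // not a delete file of this segment
        }
        // No generation suffix means the pre-2.1 format (generation 0).
        $gen = (isset($m[1]) && $m[1] !== '')
            ? (int) base_convert(strtolower($m[1]), 36, 10)
            : 0;
        $latest = max($latest, $gen);
    }

    return $latest;
}
```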

Output
integer
Details
visibility
private
final
false
static
false

_getFieldPosition

_getFieldPosition( integer $fieldNum ) : integer

Get field position in a fields dictionary

Arguments
$fieldNum
integer
Output
integer
Details
visibility
private
final
false
static
false

_load21DelFile

_load21DelFile( ) : mixed

Load 2.1+ format deletions file

Returns bitset or an array depending on bitset extension availability

Output
mixed
Details
visibility
private
final
false
static
false

_loadDelFile

_loadDelFile( ) : mixed

Load deletions file

Returns bitset or an array depending on bitset extension availability

Output
mixed
Details
visibility
private
final
false
static
false
throws

_loadDictionaryIndex

_loadDictionaryIndex( ) :

Load terms dictionary index

Details
visibility
private
final
false
static
false
throws

_loadNorm

_loadNorm( integer $fieldNum ) :

Load normalization factors from an index file

Arguments
$fieldNum
integer
Details
visibility
private
final
false
static
false
throws

_loadPre21DelFile

_loadPre21DelFile( ) : mixed

Load pre-2.1 deletions file

Returns bitset or an array depending on bitset extension availability

Output
mixed
Details
visibility
private
final
false
static
false
throws

closeTermsStream

closeTermsStream( ) :

Close terms stream

Should be used for resource cleanup if the stream is not read to the end.
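The early-exit case can be illustrated with a small in-memory stand-in. ArrayTermsStream is hypothetical and only mirrors the stream method names listed on this page; the real class reads terms from the segment files, and its return types differ.

```php
<?php
// Minimal in-memory stand-in exposing the same stream method names.
class ArrayTermsStream
{
    private $terms;
    private $pos = null;

    public function __construct(array $sortedTerms)
    {
        $this->terms = $sortedTerms;
    }

    public function resetTermsStream(): void
    {
        $this->pos = 0;
    }

    public function currentTerm()
    {
        if ($this->pos === null || !isset($this->terms[$this->pos])) {
            return null;
        }
        return $this->terms[$this->pos];
    }

    public function nextTerm()
    {
        if ($this->pos !== null) {
            $this->pos++;
        }
        return $this->currentTerm();
    }

    public function closeTermsStream(): void
    {
        $this->pos = null;
    }
}

// Read only the terms starting with "a", then stop before the end.
$stream = new ArrayTermsStream(array('alpha', 'apple', 'beta', 'gamma'));
$stream->resetTermsStream();
$seen = array();
while (($term = $stream->currentTerm()) !== null && $term[0] === 'a') {
    $seen[] = $term;
    $stream->nextTerm();
}
$stream->closeTermsStream(); // needed because the loop stopped early
```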

Details
visibility
public
final
false
static
false

compoundFileLength

compoundFileLength( string $extension ) : integer

Get compound file length

Arguments
$extension
string
Output
integer
Details
visibility
public
final
false
static
false

count

count( ) : integer

Returns the total number of documents in this segment (including deleted documents).

Output
integer
Details
visibility
public
final
false
static
false

currentTerm

currentTerm( ) : Zend_Search_Lucene_Index_Term|null

Returns term in current position

Details
visibility
public
final
false
static
false

currentTermPositions

currentTermPositions( ) : array

Returns an array of all term positions in the documents.

Return array structure: array( docId => array( pos1, pos2, ...), ...)

Output
array
Details
visibility
public
final
false
static
false

delete

delete(  $id ) :

Deletes a document from the index segment.

$id is an internal document id

Arguments
$id
integer
Details
visibility
public
final
false
static
false

getDelGen

getDelGen( ) : integer

Returns actual deletions file generation number.

Output
integer
Details
visibility
public
final
false
static
false

getField

getField( integer $fieldNum ) : Zend_Search_Lucene_Index_FieldInfo

Returns field info for specified field

Arguments
$fieldNum
integer
Details
visibility
public
final
false
static
false

getFieldInfos

getFieldInfos( ) : array

Returns array of FieldInfo objects.

Output
array
Details
visibility
public
final
false
static
false

getFieldNum

getFieldNum( string $fieldName ) : integer

Returns field index or -1 if field is not found
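Because 0 is a valid field index, callers have to compare the result with -1 explicitly instead of treating it as a boolean. A sketch with a hypothetical lookup over made-up field data:

```php
<?php
// Field list keyed by field number (made-up data).
$fields = array(0 => 'title', 1 => 'body', 2 => 'url');

function findFieldNum(array $fields, string $fieldName): int
{
    foreach ($fields as $fieldNum => $name) {
        if ($name === $fieldName) {
            return $fieldNum;
        }
    }

    return -1; // field is not found
}

$num = findFieldNum($fields, 'title');
// Correct check: 0 is a valid field index, so test against -1.
$titleExists = ($num !== -1);
```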

Arguments
$fieldName
string
Output
integer
Details
visibility
public
final
false
static
false

getFields

getFields( boolean $indexed = false ) : array

Returns array of fields.

If the $indexed parameter is true, only indexed fields are returned.

Arguments
$indexed
boolean
Output
array
Details
visibility
public
final
false
static
false

getName

getName( ) : string

Return segment name

Output
string
Details
visibility
public
final
false
static
false

getTermInfo

getTermInfo( Zend_Search_Lucene_Index_Term $term ) : Zend_Search_Lucene_Index_TermInfo

Scans terms dictionary and returns term info

Arguments
$term
Zend_Search_Lucene_Index_Term
Details
visibility
public
final
false
static
false

hasDeletions

hasDeletions( ) : boolean

Returns true if any documents have been deleted from this index segment.

Output
boolean
Details
visibility
public
final
false
static
false

hasSingleNormFile

hasSingleNormFile( ) : boolean

Returns true if segment has single norms file.

Output
boolean
Details
visibility
public
final
false
static
false

isCompound

isCompound( ) : boolean

Returns true if segment is stored using compound segment file.

Output
boolean
Details
visibility
public
final
false
static
false

isDeleted

isDeleted(  $id ) : boolean

Checks whether the document is deleted

Arguments
$id
integer
Output
boolean
Details
visibility
public
final
false
static
false

nextTerm

nextTerm( ) : Zend_Search_Lucene_Index_Term|null

Scans terms dictionary and returns next term

Details
visibility
public
final
false
static
false

norm

norm( integer $id, string $fieldName ) : float

Returns the normalization factor for the specified document and field

Arguments
$id
integer
$fieldName
string
Output
float
Details
visibility
public
final
false
static
false

normVector

normVector( string $fieldName ) : string

Returns norm vector, encoded in a byte string

Arguments
$fieldName
string
Output
string
Details
visibility
public
final
false
static
false

numDocs

numDocs( ) : integer

Returns the total number of non-deleted documents in this segment.
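The relationship between count() and numDocs() can be written out as a short sketch, assuming the deleted list is an array keyed by document id. The function name is hypothetical.

```php
<?php
// Live documents = all documents in the segment minus the deleted ones.
function liveDocCount(int $docCount, array $deleted): int
{
    return $docCount - count($deleted);
}
```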

Output
integer
Details
visibility
public
final
false
static
false

openCompoundFile

openCompoundFile( string $extension, boolean $shareHandler = true ) : Zend_Search_Lucene_Storage_File

Opens an index file stored within the compound index file

Arguments
$extension
string
$shareHandler
boolean
Details
visibility
public
final
false
static
false
throws

resetTermsStream

resetTermsStream( ) : integer

Reset terms stream

$startId - id for the first document
$compact - remove deleted documents

Returns start document id for the next segment

Output
integer
Details
visibility
public
final
false
static
false
throws

skipTo

skipTo( Zend_Search_Lucene_Index_Term $prefix ) :

Skip terms stream up to the specified term prefix.

Prefix contains fully specified field info and portion of searched term
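The skipping behaviour can be sketched over a sorted in-memory list. The function and the data are hypothetical, and terms are assumed to be ordered by field name and then by text.

```php
<?php
// Terms as array(fieldName, text), sorted by field and then text (made up).
$terms = array(
    array('body', 'index'), array('body', 'search'),
    array('title', 'lucene'), array('title', 'zend'),
);

// Return the position of the first term that is >= the prefix term, or null
// if the stream is exhausted.
function skipToPrefix(array $terms, string $field, string $prefix): ?int
{
    foreach ($terms as $pos => $term) {
        list($termField, $termText) = $term;

        // Compare by field first, then by term text.
        $cmp = strcmp($termField, $field) ?: strcmp($termText, $prefix);
        if ($cmp >= 0) {
            return $pos;
        }
    }

    return null;
}
```

The term found this way is only the first candidate; the caller still has to check that it actually starts with the prefix and belongs to the requested field.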

Arguments
$prefix
Zend_Search_Lucene_Index_Term
Details
visibility
public
final
false
static
false
throws

termDocs

termDocs( Zend_Search_Lucene_Index_Term $term, integer $shift = 0, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns IDs of all the documents containing term.

Arguments
$term
Zend_Search_Lucene_Index_Term
$shift
integer
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Output
array
Details
visibility
public
final
false
static
false

termFreqs

termFreqs( Zend_Search_Lucene_Index_Term $term, integer $shift = 0, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns term freqs array.

Result array structure: array(docId => freq, ...)

Arguments
$term
Zend_Search_Lucene_Index_Term
$shift
integer
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Details
visibility
public
final
false
static
false

termPositions

termPositions( Zend_Search_Lucene_Index_Term $term, integer $shift = 0, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns term positions array.

Result array structure: array(docId => array(pos1, pos2, ...), ...)
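The three result shapes (document ids, frequencies, positions) are closely related. The sketch below uses a made-up positions array and derives the other two from it, on the assumption that a term's frequency in a document equals its number of positions there.

```php
<?php
// Made-up result in the documented shape: docId => list of positions.
$positions = array(
    3 => array(0, 17, 42),
    8 => array(5),
);

// Documents containing the term.
$docIds = array_keys($positions);

// docId => frequency; array_map() keeps the keys when given a single array.
$freqs = array_map('count', $positions);
```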

Arguments
$term
Zend_Search_Lucene_Index_Term
$shift
integer
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Details
visibility
public
final
false
static
false

writeChanges

writeChanges( ) :

Write changes if it's necessary.

This method must be invoked only from the Writer _updateSegments() method, so index Write lock has to be already obtained.

Details
visibility
public
final
false
static
false
internal
throws
Documentation was generated by DocBlox.