API Documentation

Search/Lucene.php

category
Zend
copyright
Copyright (c) 2005-2010 Zend Technologies USA Inc. (http://www.zend.com)
license
http://framework.zend.com/license/new-bsd New BSD License
package
Zend_Search_Lucene
version
$Id: Lucene.php 22988 2010-09-21 10:53:41Z alexander $
Classes
Zend_Search_Lucene

Description

Zend Framework

LICENSE

This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available on the web at http://framework.zend.com/license/new-bsd. If you did not receive a copy of the license and are unable to obtain it from that URL, please send an email to license@zend.com so we can send you a copy immediately.

Zend_Search_Lucene

Implements
Zend_Search_Lucene_Interface
category
Zend
copyright
Copyright (c) 2005-2010 Zend Technologies USA Inc. (http://www.zend.com)
license
http://framework.zend.com/license/new-bsd New BSD License
package
Zend_Search_Lucene
Constants
FORMAT_PRE_2_1
FORMAT_2_1
FORMAT_2_3
GENERATION_RETRIEVE_COUNT
GENERATION_RETRIEVE_PAUSE
Properties
$_defaultSearchField
$_resultSetLimit
$_termsPerQueryLimit
$_directory
$_closeDirOnExit
$_writer
$_segmentInfos
$_docCount
$_hasChanges
$_closed
$_refCount
$_generation
$_formatVersion
$_termsStream
Methods
create
open
getActualGeneration
getGeneration
getSegmentFileName
getFormatVersion
setFormatVersion
_readPre21SegmentsFile
_readSegmentsFile
__construct
_close
addReference
removeReference
__destruct
_getIndexWriter
getDirectory
count
maxDoc
numDocs
isDeleted
setDefaultSearchField
getDefaultSearchField
setResultSetLimit
getResultSetLimit
setTermsPerQueryLimit
getTermsPerQueryLimit
getMaxBufferedDocs
setMaxBufferedDocs
getMaxMergeDocs
setMaxMergeDocs
getMergeFactor
setMergeFactor
find
getFieldNames
getDocument
hasTerm
termDocs
termDocsFilter
termFreqs
termPositions
docFreq
getSimilarity
norm
hasDeletions
delete
addDocument
_updateDocCount
commit
optimize
terms
resetTermsStream
skipTo
nextTerm
currentTerm
closeTermsStream
undeleteAll

Constants

FORMAT_PRE_2_1

 FORMAT_PRE_2_1 = '0'

Details

value
0

FORMAT_2_1

 FORMAT_2_1 = '1'

Details

value
1

FORMAT_2_3

 FORMAT_2_3 = '2'

Details

value
2

GENERATION_RETRIEVE_COUNT

 GENERATION_RETRIEVE_COUNT = '10'

Generation retrieving counter

Details

value
10

GENERATION_RETRIEVE_PAUSE

 GENERATION_RETRIEVE_PAUSE = '50'

Pause between generation retrieving attempts in milliseconds

Details

value
50
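
The two constants read naturally as the parameters of a bounded retry loop: try up to GENERATION_RETRIEVE_COUNT times, pausing GENERATION_RETRIEVE_PAUSE milliseconds between attempts. The sketch below illustrates that general pattern only. The function name retryWithPause and its callable argument are hypothetical, and this is not the library's own code.

```php
<?php
/**
 * Hypothetical helper: call $attempt up to $maxAttempts times, pausing
 * $pauseMs milliseconds between tries. Returns the first non-null result,
 * or null if every attempt returned null.
 */
function retryWithPause(callable $attempt, $maxAttempts = 10, $pauseMs = 50)
{
    for ($i = 0; $i < $maxAttempts; $i++) {
        $result = $attempt();
        if ($result !== null) {
            return $result;
        }
        if ($i < $maxAttempts - 1) {
            usleep($pauseMs * 1000); // usleep() takes microseconds
        }
    }
    return null;
}
```

The defaults mirror the constant values documented above (10 attempts, 50 ms).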

Properties

$_closeDirOnExit

boolean $_closeDirOnExit = 'true'

File system adapter closing option

Details

$_closeDirOnExit
boolean
visibility
private
default
true
final
false
static
false

$_closed

boolean $_closed = 'false'

Signals that the index is already closed: changes have been committed and resources have been cleaned up

Details

$_closed
boolean
visibility
private
default
false
final
false
static
false

$_defaultSearchField

string $_defaultSearchField = 'null'

Default field name for search

Null means search through all fields

Details

$_defaultSearchField
string
visibility
private
default
null
final
false
static
true

$_directory

Zend_Search_Lucene_Storage_Directory $_directory = 'null'

File system adapter.

Details

$_directory
Zend_Search_Lucene_Storage_Directory
visibility
private
default
null
final
false
static
false

$_docCount

integer $_docCount = '0'

Number of documents in this index.

Details

$_docCount
integer
visibility
private
default
0
final
false
static
false

$_formatVersion

integer $_formatVersion = ''

Index format version

Details

$_formatVersion
integer
visibility
private
default
final
false
static
false

$_generation

integer $_generation = 'FORMAT_PRE_2_1'

Current segment generation

Details

$_generation
integer
visibility
private
default
FORMAT_PRE_2_1
final
false
static
false

$_hasChanges

boolean $_hasChanges = 'false'

Flag for index changes

Details

$_hasChanges
boolean
visibility
private
default
false
final
false
static
false

$_refCount

integer $_refCount = '0'

Number of references to the index object

Details

$_refCount
integer
visibility
private
default
0
final
false
static
false

$_resultSetLimit

integer $_resultSetLimit = '0'

Result set limit

0 means no limit

Details

$_resultSetLimit
integer
visibility
private
default
0
final
false
static
true

$_segmentInfos

array $_segmentInfos = 'array'

Array of Zend_Search_Lucene_Index_SegmentInfo objects for current version of index.

Details

$_segmentInfos
array
Zend_Search_Lucene_Index_SegmentInfo
visibility
private
default
array
final
false
static
false

$_termsPerQueryLimit

integer $_termsPerQueryLimit = '1024'

Terms per query limit

0 means no limit

Details

$_termsPerQueryLimit
integer
visibility
private
default
1024
final
false
static
true

$_termsStream

Zend_Search_Lucene_TermStreamsPriorityQueue $_termsStream = 'null'

Terms stream priority queue object

Details

$_termsStream
Zend_Search_Lucene_TermStreamsPriorityQueue
visibility
private
default
null
final
false
static
false

$_writer

Zend_Search_Lucene_Index_Writer $_writer = 'null'

Writer for this index, not instantiated unless required.

Details

$_writer
Zend_Search_Lucene_Index_Writer
visibility
private
default
null
final
false
static
false

Methods

__construct

__construct( Zend_Search_Lucene_Storage_Directory_Filesystem|string $directory = null,  $create = false ) :

Opens the index.

The constructor needs the index directory as a parameter. It should be either a string with a path to the index folder or a Directory object.

Arguments
$directory
Zend_Search_Lucene_Storage_Directory_Filesystem|string
$create
Details
visibility
public
final
false
static
false
throws

__destruct

__destruct( ) :

Object destructor

Details
visibility
public
final
false
static
false

_close

_close( ) :

Close current index and free resources

Details
visibility
private
final
false
static
false

_getIndexWriter

_getIndexWriter( ) : Zend_Search_Lucene_Index_Writer

Returns an instance of Zend_Search_Lucene_Index_Writer for the index

Details
visibility
private
final
false
static
false

_readPre21SegmentsFile

_readPre21SegmentsFile( ) :

Read segments file for pre-2.1 Lucene index format

Details
visibility
private
final
false
static
false
throws

_readSegmentsFile

_readSegmentsFile( ) :

Read segments file

Details
visibility
private
final
false
static
false
throws

_updateDocCount

_updateDocCount( ) :

Update document counter

Details
visibility
private
final
false
static
false

addDocument

addDocument( Zend_Search_Lucene_Document $document ) :

Adds a document to this index.

Arguments
$document
Zend_Search_Lucene_Document
Details
visibility
public
final
false
static
false

addReference

addReference( ) :

Add reference to the index object

Details
visibility
public
final
false
static
false
internal

closeTermsStream

closeTermsStream( ) :

Close terms stream

Should be used for resource cleanup if the stream is not read to the end

Details
visibility
public
final
false
static
false

commit

commit( ) :

Commit changes resulting from delete() or undeleteAll() operations.

Details
visibility
public
final
false
static
false
todo
undeleteAll processing.

count

count( ) : integer

Returns the total number of documents in this index (including deleted documents).

Output
integer
Details
visibility
public
final
false
static
false

create

create( mixed $directory ) : Zend_Search_Lucene_Interface

Create index

Arguments
$directory
mixed
Details
visibility
public
final
false
static
true

currentTerm

currentTerm( ) : Zend_Search_Lucene_Index_Term|null

Returns term in current position

Details
visibility
public
final
false
static
false

delete

delete( integer|Zend_Search_Lucene_Search_QueryHit $id ) :

Deletes a document from the index.

$id is an internal document id

Arguments
$id
integer|Zend_Search_Lucene_Search_QueryHit
Details
visibility
public
final
false
static
false
throws
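
A common use of delete() is to remove every document that matches a query and then commit. The following is a minimal sketch of that flow, assuming $index behaves like a Zend_Search_Lucene instance and that each hit exposes its internal document number as the public id property. The function name deleteMatching is hypothetical.

```php
<?php
/**
 * Delete every document matching $query and commit the change.
 * Returns the number of documents that were marked as deleted.
 */
function deleteMatching($index, $query)
{
    $deleted = 0;
    foreach ($index->find($query) as $hit) {
        // delete() accepts either the internal id or the hit object itself
        $index->delete($hit->id);
        $deleted++;
    }
    if ($deleted > 0) {
        $index->commit();
    }
    return $deleted;
}
```

With a real index, $index would come from Zend_Search_Lucene::open() and $query would be a query string or query object.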

docFreq

docFreq( Zend_Search_Lucene_Index_Term $term ) : integer

Returns the number of documents in this index containing the $term.

Arguments
$term
Zend_Search_Lucene_Index_Term
Output
integer
Details
visibility
public
final
false
static
false

find

find( Zend_Search_Lucene_Search_Query|string $query ) : array

Performs a query against the index and returns an array of Zend_Search_Lucene_Search_QueryHit objects.

Input is a query string or a Zend_Search_Lucene_Search_Query object.

Arguments
$query
Zend_Search_Lucene_Search_Query|string
Output
array
Zend_Search_Lucene_Search_QueryHit
Details
visibility
public
final
false
static
false
throws
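
Because find() returns a plain array of hit objects, trimming and reshaping results is ordinary array work. The sketch below assumes each hit exposes public id and score properties, as Zend_Search_Lucene_Search_QueryHit does. The function name hitSummary and the $max parameter are hypothetical.

```php
<?php
/**
 * Run $query against $index and return at most $max hits as
 * array('id' => ..., 'score' => ...) entries, in the order find() gave them.
 */
function hitSummary($index, $query, $max = 10)
{
    $summary = array();
    foreach (array_slice($index->find($query), 0, $max) as $hit) {
        $summary[] = array('id' => $hit->id, 'score' => $hit->score);
    }
    return $summary;
}
```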

getActualGeneration

getActualGeneration( Zend_Search_Lucene_Storage_Directory $directory ) : integer

Get current generation number

Returns the generation number. 0 means pre-2.1 index format; -1 means there are no segments files.

Arguments
$directory
Zend_Search_Lucene_Storage_Directory
Output
integer
Details
visibility
public
final
false
static
true
throws
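
The two special return values are easy to mix up, so callers may want to translate them before acting on the result. A minimal sketch follows; the function name describeGeneration and the message strings are hypothetical.

```php
<?php
/**
 * Translate a generation number, as returned by getActualGeneration(),
 * into a human-readable description.
 */
function describeGeneration($generation)
{
    if ($generation === -1) {
        return 'no segments file found';
    }
    if ($generation === 0) {
        return 'pre-2.1 index format';
    }
    return 'generation ' . $generation;
}
```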

getDefaultSearchField

getDefaultSearchField( ) : string

Get default search field.

Null means that the search is performed across all fields by default

Output
string
Details
visibility
public
final
false
static
true

getDirectory

getDirectory( ) : Zend_Search_Lucene_Storage_Directory

Returns the Zend_Search_Lucene_Storage_Directory instance for this index.

Details
visibility
public
final
false
static
false

getDocument

getDocument( integer|Zend_Search_Lucene_Search_QueryHit $id ) : Zend_Search_Lucene_Document

Returns a Zend_Search_Lucene_Document object for the document number $id in this index.

Arguments
$id
integer|Zend_Search_Lucene_Search_QueryHit
Details
visibility
public
final
false
static
false
throws
Exception is thrown if $id is out of the range

getFieldNames

getFieldNames( boolean $indexed = false ) : array

Returns a list of all unique field names that exist in this index.

Arguments
$indexed
boolean
Output
array
Details
visibility
public
final
false
static
false

getFormatVersion

getFormatVersion( ) : integer

Get index format version

Output
integer
Details
visibility
public
final
false
static
false

getGeneration

getGeneration( ) : integer

Get generation number associated with this index instance

The same generation number, paired with a document number or query string, is guaranteed to give the same result when reading from the index, so it may be used for search result caching.

Output
integer
Details
visibility
public
final
false
static
false
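
One way to apply the caching hint is to fold the generation number into the cache key, so that entries written against an older generation are never looked up again. This is a sketch under that assumption; the function name searchCacheKey and the key layout are hypothetical.

```php
<?php
/**
 * Build a cache key for a query string that changes whenever the
 * index generation changes.
 */
function searchCacheKey($index, $query)
{
    return 'lucene_' . $index->getGeneration() . '_' . md5($query);
}
```

The key only works for string queries as written; a query object would first need a stable string form.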

getMaxBufferedDocs

getMaxBufferedDocs( ) : integer

Retrieve index maxBufferedDocs option

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new segment

Default value is 10

Output
integer
Details
visibility
public
final
false
static
false

getMaxMergeDocs

getMaxMergeDocs( ) : integer

Retrieve index maxMergeDocs option

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Output
integer
Details
visibility
public
final
false
static
false

getMergeFactor

getMergeFactor( ) : integer

Retrieve index mergeFactor option

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Output
integer
Details
visibility
public
final
false
static
false

getResultSetLimit

getResultSetLimit( ) : integer

Get result set limit.

0 means no limit

Output
integer
Details
visibility
public
final
false
static
true

getSegmentFileName

getSegmentFileName( integer $generation ) : string

Get segments file name

Arguments
$generation
integer
Output
string
Details
visibility
public
final
false
static
true

getSimilarity

getSimilarity( ) : Zend_Search_Lucene_Search_Similarity

Retrieve the similarity used by the index reader

Details
visibility
public
final
false
static
false

getTermsPerQueryLimit

getTermsPerQueryLimit( ) : integer

Get terms per query limit.

0 means no limit

Output
integer
Details
visibility
public
final
false
static
true

hasDeletions

hasDeletions( ) : boolean

Returns true if any documents have been deleted from this index.

Output
boolean
Details
visibility
public
final
false
static
false

hasTerm

hasTerm( Zend_Search_Lucene_Index_Term $term ) : boolean

Returns true if the index contains documents with the specified term.

Is used for query optimization.

Arguments
$term
Zend_Search_Lucene_Index_Term
Output
boolean
Details
visibility
public
final
false
static
false

isDeleted

isDeleted( integer $id ) : boolean

Checks whether the document is deleted

Arguments
$id
integer
Output
boolean
Details
visibility
public
final
false
static
false
throws
Exception is thrown if $id is out of the range
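
Together with maxDoc(), isDeleted() allows a walk over every live document number. The sketch below assumes document numbers run from 0 to maxDoc() - 1, which follows from maxDoc() being one greater than the largest possible document number. The function name liveDocumentIds is hypothetical.

```php
<?php
/**
 * Return the document numbers in $index that are not marked as deleted.
 */
function liveDocumentIds($index)
{
    $ids = array();
    for ($id = 0; $id < $index->maxDoc(); $id++) {
        if (!$index->isDeleted($id)) {
            $ids[] = $id;
        }
    }
    return $ids;
}
```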

maxDoc

maxDoc( ) : integer

Returns one greater than the largest possible document number.

This may be used to, e.g., determine how big to allocate a structure which will have an element for every document number in an index.

Output
integer
Details
visibility
public
final
false
static
false

nextTerm

nextTerm( ) : Zend_Search_Lucene_Index_Term|null

Scans terms dictionary and returns next term

Details
visibility
public
final
false
static
false

norm

norm( integer $id, string $fieldName ) : float

Returns a normalization factor for "field, document" pair.

Arguments
$id
integer
$fieldName
string
Output
float
Details
visibility
public
final
false
static
false

numDocs

numDocs( ) : integer

Returns the total number of non-deleted documents in this index.

Output
integer
Details
visibility
public
final
false
static
false
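
Since count() includes deleted documents and numDocs() excludes them, their difference is the number of documents currently marked as deleted. A minimal sketch; the function name deletedDocumentCount is hypothetical.

```php
<?php
/**
 * Number of documents in $index that are marked as deleted.
 */
function deletedDocumentCount($index)
{
    return $index->count() - $index->numDocs();
}
```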

open

open( mixed $directory ) : Zend_Search_Lucene_Interface

Open index

Arguments
$directory
mixed
Details
visibility
public
final
false
static
true

optimize

optimize( ) :

Optimize index.

Merges all segments into one

Details
visibility
public
final
false
static
false

removeReference

removeReference( ) :

Remove reference from the index object

When reference count becomes zero, index is closed and resources are cleaned up

Details
visibility
public
final
false
static
false
internal
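
The addReference() and removeReference() pair describes plain reference counting: the resource stays open while at least one holder remains and is closed when the count returns to zero. The class below is an illustrative model of that idea only. RefCountedResource and isClosed() are hypothetical names and this is not the library's implementation.

```php
<?php
/**
 * Illustrative model of a reference-counted resource.
 */
class RefCountedResource
{
    private $refCount = 0;
    private $closed = false;

    public function addReference()
    {
        $this->refCount++;
    }

    public function removeReference()
    {
        $this->refCount--;
        if ($this->refCount === 0) {
            $this->close();
        }
    }

    public function isClosed()
    {
        return $this->closed;
    }

    private function close()
    {
        // A real index would commit pending changes and release files here.
        $this->closed = true;
    }
}
```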

resetTermsStream

resetTermsStream( ) :

Reset terms stream.

Details
visibility
public
final
false
static
false

setDefaultSearchField

setDefaultSearchField( string $fieldName ) :

Set default search field.

Null means that the search is performed across all fields by default

Default value is null

Arguments
$fieldName
string
Details
visibility
public
final
false
static
true

setFormatVersion

setFormatVersion( int $formatVersion ) :

Set index format version.

The index is converted to this format at the next update

Arguments
$formatVersion
int
Details
visibility
public
final
false
static
false
throws

setMaxBufferedDocs

setMaxBufferedDocs( integer $maxBufferedDocs ) :

Set index maxBufferedDocs option

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new segment

Default value is 10

Arguments
$maxBufferedDocs
integer
Details
visibility
public
final
false
static
false

setMaxMergeDocs

setMaxMergeDocs( integer $maxMergeDocs ) :

Set index maxMergeDocs option

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Arguments
$maxMergeDocs
integer
Details
visibility
public
final
false
static
false

setMergeFactor

setMergeFactor( integer $mergeFactor ) :

Set index mergeFactor option

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Arguments
$mergeFactor
integer
Details
visibility
public
final
false
static
false
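
The guidance for maxBufferedDocs, maxMergeDocs and mergeFactor can be bundled into two tuning profiles, one for batch index creation and one for interactively maintained indices. The sketch below does that. The function name applyIndexingProfile, the profile names, and the concrete numbers are all assumptions, chosen only to fall on the documented side of each threshold (mergeFactor above or below 10, maxMergeDocs below 10,000 for interactive use).

```php
<?php
/**
 * Apply an assumed tuning profile to $index.
 * $profile is 'batch' or 'interactive'.
 */
function applyIndexingProfile($index, $profile)
{
    if ($profile === 'batch') {
        // Larger values: faster indexing, more RAM, slower unoptimized search
        $index->setMaxBufferedDocs(100);
        $index->setMaxMergeDocs(PHP_INT_MAX);
        $index->setMergeFactor(50);
    } elseif ($profile === 'interactive') {
        // Smaller values: shorter pauses while indexing, less RAM
        $index->setMaxBufferedDocs(10);
        $index->setMaxMergeDocs(5000);
        $index->setMergeFactor(5);
    } else {
        throw new InvalidArgumentException('Unknown profile: ' . $profile);
    }
}
```

Suitable values depend on document size and available memory, so these numbers are starting points to measure against, not recommendations.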

setResultSetLimit

setResultSetLimit( integer $limit ) :

Set result set limit.

0 (default) means no limit

Arguments
$limit
integer
Details
visibility
public
final
false
static
true

setTermsPerQueryLimit

setTermsPerQueryLimit( integer $limit ) :

Set terms per query limit.

0 means no limit

Arguments
$limit
integer
Details
visibility
public
final
false
static
true

skipTo

skipTo( Zend_Search_Lucene_Index_Term $prefix ) :

Skip terms stream up to the specified term prefix.

Prefix contains fully specified field info and portion of searched term

Arguments
$prefix
Zend_Search_Lucene_Index_Term
Details
visibility
public
final
false
static
false
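
The terms stream methods (resetTermsStream, skipTo, currentTerm, nextTerm, closeTermsStream) combine into a prefix scan over the term dictionary. The sketch below assumes terms are streamed in sorted order and that term objects expose public field and text properties, as Zend_Search_Lucene_Index_Term does. The function name termTextsWithPrefix is hypothetical.

```php
<?php
/**
 * Collect the text of every term in $index that is in the same field as
 * $prefixTerm and whose text starts with $prefixTerm's text.
 */
function termTextsWithPrefix($index, $prefixTerm)
{
    $texts = array();
    $prefixLength = strlen($prefixTerm->text);

    $index->resetTermsStream();
    $index->skipTo($prefixTerm);
    while (($term = $index->currentTerm()) !== null
        && $term->field === $prefixTerm->field
        && strncmp($term->text, $prefixTerm->text, $prefixLength) === 0
    ) {
        $texts[] = $term->text;
        $index->nextTerm();
    }
    // The stream was not read to the end, so release it explicitly
    $index->closeTermsStream();

    return $texts;
}
```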

termDocs

termDocs( Zend_Search_Lucene_Index_Term $term, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns IDs of all documents containing term.

Arguments
$term
Zend_Search_Lucene_Index_Term
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Output
array
Details
visibility
public
final
false
static
false

termDocsFilter

termDocsFilter( Zend_Search_Lucene_Index_Term $term, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : Zend_Search_Lucene_Index_DocsFilter

Returns documents filter for all documents containing term.

It performs the same operation as termDocs, but returns the result as a Zend_Search_Lucene_Index_DocsFilter object

Arguments
$term
Zend_Search_Lucene_Index_Term
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Details
visibility
public
final
false
static
false

termFreqs

termFreqs( Zend_Search_Lucene_Index_Term $term, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns an array of all term frequencies.

Result array structure: array(docId => freq, ...)

Arguments
$term
Zend_Search_Lucene_Index_Term
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Output
array
Details
visibility
public
final
false
static
false
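
Given the documented result structure array(docId => freq, ...), the total number of occurrences of a term across the index is the sum of the array values. A minimal sketch; the function name totalTermOccurrences is hypothetical.

```php
<?php
/**
 * Total number of times $term occurs across all documents in $index.
 */
function totalTermOccurrences($index, $term)
{
    // termFreqs() maps document id => frequency of $term in that document
    return array_sum($index->termFreqs($term));
}
```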

termPositions

termPositions( Zend_Search_Lucene_Index_Term $term, Zend_Search_Lucene_Index_DocsFilter|null $docsFilter = null ) : array

Returns an array of all term positions in the documents.

Result array structure: array(docId => array(pos1, pos2, ...), ...)

Arguments
$term
Zend_Search_Lucene_Index_Term
$docsFilter
Zend_Search_Lucene_Index_DocsFilter|null
Output
array
Details
visibility
public
final
false
static
false

terms

terms( ) : array

Returns an array of all terms in this index.

Output
array
Details
visibility
public
final
false
static
false

undeleteAll

undeleteAll( ) :

Undeletes all documents currently marked as deleted in this index.

Details
visibility
public
final
false
static
false
todo
Implementation
Documentation was generated by DocBlox.