Zend Framework
LICENSE
This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt. It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.
An Analyzer is used to analyze text.
It thus represents a policy for extracting index terms from text.
Note: The Java Lucene implementation is stream-oriented, which lets it work efficiently with very large documents (more than 20 MB). The engine itself, however, is not designed for such documents, so the Zend_Search_Lucene analysis API works with data strings and sets (arrays) instead.
Zend_Search_Lucene_Analysis_Analyzer $_defaultImpl = ''
The Analyzer implementation used by default.
string $_encoding = ''
Input string encoding
string $_input = 'null'
Input string
getDefault() : Zend_Search_Lucene_Analysis_Analyzer
Return the default Analyzer implementation used by indexing code.
nextToken() : Zend_Search_Lucene_Analysis_Token|null
Tokenization stream API. Gets the next token. Returns null at the end of the stream.
Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)
reset()
Resets the token stream.
setDefault($analyzer)
Set the default Analyzer implementation used by indexing code.
setInput(string $data, $encoding)
Tokenization stream API. Sets the input.
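The three stream methods (setInput, reset, nextToken) are meant to be used together: set the input, then read tokens until null is returned. The following is a minimal, self-contained sketch of that calling pattern. SketchToken and SketchWordAnalyzer are hypothetical stand-ins written for illustration and are not part of Zend Framework; the ASCII-letter matching, the lowercasing, and the rewind inside setInput are assumptions of the sketch, not documented behavior of the real classes.

```php
<?php
// Hypothetical stand-in for Zend_Search_Lucene_Analysis_Token (term text only).
class SketchToken
{
    private $text;

    public function __construct($text)
    {
        $this->text = $text;
    }

    public function getTermText()
    {
        return $this->text;
    }
}

// Hypothetical analyzer exposing the same three stream methods.
class SketchWordAnalyzer
{
    private $input = null;   // input string, as in $_input
    private $encoding = '';  // input encoding, as in $_encoding (unused by this sketch)
    private $position = 0;   // byte offset of the next unread character

    // Store the input. Assumption: the sketch rewinds right away.
    public function setInput($data, $encoding = '')
    {
        $this->input = $data;
        $this->encoding = $encoding;
        $this->reset();
    }

    // Rewind to the start of the current input.
    public function reset()
    {
        $this->position = 0;
    }

    // Return the next run of ASCII letters, lowercased, or null at the end.
    public function nextToken()
    {
        if ($this->input === null) {
            return null;
        }
        $found = preg_match('/[a-zA-Z]+/', $this->input, $match,
                            PREG_OFFSET_CAPTURE, $this->position);
        if (!$found) {
            return null;
        }
        $this->position = $match[0][1] + strlen($match[0][0]);
        return new SketchToken(strtolower($match[0][0]));
    }
}

$analyzer = new SketchWordAnalyzer();
$analyzer->setInput('Index terms, from TEXT', 'UTF-8');

// Read tokens until the stream is exhausted.
$terms = array();
while (($token = $analyzer->nextToken()) !== null) {
    $terms[] = $token->getTermText();
}

// After reset() the same input can be read again from the beginning.
$analyzer->reset();
$first = $analyzer->nextToken()->getTermText();
```

Comparing against null with !== matters in the loop, because a token object is always truthy but the end-of-stream marker is specifically null.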
tokenize(string $data, $encoding) : array
Tokenizes text into terms. Returns an array of Zend_Search_Lucene_Analysis_Token objects.
Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)
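As a usage sketch, the default analyzer can be registered once and then used to tokenize a string in a single call. This example assumes Zend Framework 1 is available on the include_path. The concrete class Zend_Search_Lucene_Analysis_Analyzer_Common_Text_CaseInsensitive and the token accessor getTermText() are not described on this page; they are assumed from the Zend Framework 1 distribution, so check them against the installed version.

```php
<?php
// Assumption: Zend Framework 1 is on the include_path.
require_once 'Zend/Search/Lucene/Analysis/Analyzer/Common/Text/CaseInsensitive.php';

// Assumed concrete analyzer: keeps runs of ASCII letters and lowercases them.
$analyzer = new Zend_Search_Lucene_Analysis_Analyzer_Common_Text_CaseInsensitive();

// Register it as the default used by indexing code.
Zend_Search_Lucene_Analysis_Analyzer::setDefault($analyzer);

// Fetch the default back and tokenize a whole string at once.
$tokens = Zend_Search_Lucene_Analysis_Analyzer::getDefault()
    ->tokenize('Analyzers extract Index Terms', 'UTF-8');

// Collect the term text of each returned token object.
$terms = array();
foreach ($tokens as $token) {
    $terms[] = $token->getTermText();
}

print_r($terms);
```

Because the default analyzer is shared static state, it is usually safest to set it once during application start-up, before any indexing or searching takes place.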