public interface TermToBytesRefAttribute
This attribute is requested by TermsHashPerField to index the contents. This attribute can be used to customize the final byte[] encoding of terms.
Consumers of this attribute call getBytesRef() up-front, and then
invoke fillBytesRef() for each term. Example:
final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
final BytesRef bytes = termAtt.getBytesRef();
while (tokenStream.incrementToken()) {
  // you must call termAtt.fillBytesRef() before doing something with the bytes.
  // this encodes the term value (internally it might be a char[], etc) into the bytes.
  int hashCode = termAtt.fillBytesRef();
  if (isInteresting(bytes)) {
    // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
    // you should make a copy if you need persistent access to the bytes, otherwise they will
    // be rewritten across calls to incrementToken()
    doSomethingWith(new BytesRef(bytes));
  }
}
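The copy in the example above matters because the attribute reuses one buffer for every term. The following is a minimal, Lucene-free sketch of that pitfall; `ReusedBufferSketch`, `shared`, `fill` and `collect` are hypothetical names invented for illustration and are not part of the Lucene API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReusedBufferSketch {
    // Hypothetical stand-in for the attribute's reused byte buffer.
    static final byte[] shared = new byte[32];
    static int sharedLength = 0;

    // Plays the role of fillBytesRef(): overwrites the shared buffer with the current term.
    // Assumes the term encodes to at most 32 bytes.
    static void fill(String term) {
        byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(utf8, 0, shared, 0, utf8.length);
        sharedLength = utf8.length;
    }

    // Collects every term, either copying the bytes or keeping a reference to the shared buffer.
    static List<String> collect(String[] terms, boolean copy) {
        List<byte[]> kept = new ArrayList<>();
        for (String term : terms) {
            fill(term);
            kept.add(copy ? Arrays.copyOf(shared, sharedLength) : shared);
        }
        List<String> out = new ArrayList<>();
        for (byte[] b : kept) {
            // Without a copy, every entry aliases the buffer and sees only the last term.
            int len = copy ? b.length : sharedLength;
            out.add(new String(b, 0, len, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] terms = {"foo", "bar"};
        System.out.println(collect(terms, true));  // prints [foo, bar]
        System.out.println(collect(terms, false)); // prints [bar, bar]
    }
}
```

Keeping the reference is cheaper, so it is reasonable when the bytes are consumed before the next token is read; the copy is only needed for values that must outlive the current iteration.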
...
CharTermAttributeImpl and its implementation of this method
for UTF-8 terms.

| Method Summary | |
|---|---|
| int | fillBytesRef(): Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode. |
| BytesRef | getBytesRef(): Retrieve this attribute's BytesRef. |

| Method Detail |
|---|

int fillBytesRef()

Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.

The hashcode is as defined by BytesRef.hashCode():

int hash = 0;
for (int i = termBytes.offset; i < termBytes.offset+termBytes.length; i++) {
  hash = 31*hash + termBytes.bytes[i];
}

Implement this for performance reasons, if your code can calculate the hash on-the-fly. If this is not the case, just return termBytes.hashCode().

BytesRef getBytesRef()

Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls fillBytesRef().
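
As an illustration of computing the hash on-the-fly, here is a minimal sketch that encodes a value and accumulates the hash in the same pass. It does not use Lucene: `Slice` is a hypothetical stand-in for `BytesRef`, and `fillFromLong` and the big-endian long encoding are assumptions chosen only to keep the example self-contained.

```java
public class FillBytesRefSketch {
    /** Hypothetical stand-in for BytesRef: a slice of a reusable byte[]. */
    static final class Slice {
        byte[] bytes = new byte[8];
        int offset = 0;
        int length = 0;
    }

    /** Reference hash, following the loop given in the method description. */
    static int referenceHash(Slice s) {
        int hash = 0;
        for (int i = s.offset; i < s.offset + s.length; i++) {
            hash = 31 * hash + s.bytes[i];
        }
        return hash;
    }

    /** Writes value as 8 big-endian bytes into target and hashes each byte as it is written. */
    static int fillFromLong(long value, Slice target) {
        target.offset = 0;
        target.length = 8;
        int hash = 0;
        for (int i = 0; i < 8; i++) {
            byte b = (byte) (value >>> (56 - 8 * i));
            target.bytes[i] = b;
            hash = 31 * hash + b; // same recurrence as the reference loop
        }
        return hash;
    }

    public static void main(String[] args) {
        Slice s = new Slice();
        int h = fillFromLong(42L, s);
        System.out.println(h == referenceHash(s)); // prints true
    }
}
```

The single-pass version avoids a second walk over the bytes. When the encoding step is opaque (for example, delegated to a library call), computing the hash afterwards with the reference loop gives the same result.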