java.lang.Object
  com.ibm.icu.text.BreakIterator
    org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator

public class LaoBreakIterator
extends com.ibm.icu.text.BreakIterator
Syllable iterator for Lao text.
This breaks Lao text into syllables according to "Syllabification of Lao Script for Line Breaking" by Phonpasit Phissamay, Valaxay Dalolay, Chitaphone Chanhsililath, Oulaiphone Silimasak, Sarmad Hussain, and Nadir Durrani (Science Technology and Environment Agency; CRULP).
Most of the work is accomplished with RBBI (rule-based break iterator) rules; however, some additional special logic cannot be coded in a grammar, and it is implemented here.
For example, what appears to be a final consonant might instead be part of the next syllable. Because rules match greedily, the first rule can consume that consonant and leave behind an illegal sequence that matches no rule.
Take, for instance, the text ກວ່າດອກ. The first rule greedily matches ກວ່າດ, but then ອກ is encountered, which is illegal. LaoBreakIterator resolves such cases by backtracking and re-verifying the affected syllables, following the procedure described in the paper.
Finally, LaoBreakIterator also takes care of the second concern mentioned in the paper: combining marks that appear in the wrong order (typos).
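The backtracking idea can be illustrated with a small, self-contained sketch. This is not the Lucene implementation and does not use the real RBBI rules for Lao: it assumes a hypothetical toy grammar over Latin letters (consonant, vowel, optional final consonant), and the class and method names (`BacktrackingSyllableSketch`, `greedyMatch`, `isLegal`, `boundaries`) are invented for illustration. When no rule matches at the current position, the sketch tries moving the last character of the previous syllable onto the current one and accepts the change only if both pieces are then legal; otherwise it skips one character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only: a hypothetical Latin-letter "syllable" grammar,
// NOT the Lao rules and NOT the actual LaoBreakIterator code.
final class BacktrackingSyllableSketch {
    // Assumed grammar: consonant, vowel, optional final consonant (greedy).
    private static final Pattern SYLLABLE =
        Pattern.compile("[a-z&&[^aeiou]][aeiou][a-z&&[^aeiou]]?");

    // Length of the greedy rule match starting at pos, or 0 if no rule matches.
    static int greedyMatch(String text, int pos) {
        Matcher m = SYLLABLE.matcher(text);
        m.region(pos, text.length());
        return m.lookingAt() ? m.end() - pos : 0;
    }

    // True if s is exactly one legal syllable under the toy grammar.
    static boolean isLegal(String s) {
        return SYLLABLE.matcher(s).matches();
    }

    // Returns the boundary offsets, starting with 0.
    static List<Integer> boundaries(String text) {
        List<Integer> out = new ArrayList<>();
        out.add(0);
        int prevStart = -1; // start of the previous legal syllable, -1 if none
        int pos = 0;
        while (pos < text.length()) {
            int len = greedyMatch(text, pos);
            if (len > 0) {
                prevStart = pos;
                pos += len;
                out.add(pos);
                continue;
            }
            // No rule matches here: try handing the last character of the
            // previous syllable to the current one, then verify both pieces.
            int moved = pos - 1;
            int curLen = (prevStart >= 0 && moved > prevStart)
                ? greedyMatch(text, moved) : 0;
            if (curLen > 1 && isLegal(text.substring(prevStart, moved))) {
                out.set(out.size() - 1, moved); // shrink the previous syllable
                prevStart = moved;
                pos = moved + curLen;
                out.add(pos);
            } else {
                pos++;          // verification failed: skip one character
                out.add(pos);
                prevStart = -1; // the skipped character is not a syllable
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Greedy "kad" leaves the illegal "ok"; backtracking yields "ka" + "dok".
        System.out.println(boundaries("kadok")); // prints [0, 2, 5]
    }
}
```

In the toy input `kadok`, the greedy match plays the role of ກວ່າດ in the example above, and the leftover `ok` plays the role of the illegal ອກ.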
| Field Summary |
|---|
| Fields inherited from class com.ibm.icu.text.BreakIterator: DONE, KIND_CHARACTER, KIND_LINE, KIND_SENTENCE, KIND_TITLE, KIND_WORD |
| Constructor Summary | |
|---|---|
| LaoBreakIterator(com.ibm.icu.text.RuleBasedBreakIterator rules) | Creates a new iterator, performing the backtracking verification across the provided rules. |
| Method Summary | |
|---|---|
| LaoBreakIterator | clone(): Clone method. |
| int | current() |
| int | first() |
| int | following(int offset) |
| CharacterIterator | getText() |
| int | last() |
| int | next() |
| int | next(int n) |
| int | previous() |
| void | setText(CharacterIterator text) |
| void | setText(String newText) |
| Methods inherited from class com.ibm.icu.text.BreakIterator |
|---|
| getAvailableLocales, getAvailableULocales, getBreakInstance, getCharacterInstance, getCharacterInstance, getCharacterInstance, getLineInstance, getLineInstance, getLineInstance, getLocale, getSentenceInstance, getSentenceInstance, getSentenceInstance, getTitleInstance, getTitleInstance, getTitleInstance, getWordInstance, getWordInstance, getWordInstance, isBoundary, preceding, registerInstance, registerInstance, unregister |

| Methods inherited from class java.lang.Object |
|---|
| equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|

public LaoBreakIterator(com.ibm.icu.text.RuleBasedBreakIterator rules)
Creates a new iterator, performing the backtracking verification across the provided rules.
| Method Detail |
|---|

public int current()
current in class com.ibm.icu.text.BreakIterator

public int first()
first in class com.ibm.icu.text.BreakIterator

public int following(int offset)
following in class com.ibm.icu.text.BreakIterator

public CharacterIterator getText()
getText in class com.ibm.icu.text.BreakIterator

public int last()
last in class com.ibm.icu.text.BreakIterator

public int next()
next in class com.ibm.icu.text.BreakIterator

public int next(int n)
next in class com.ibm.icu.text.BreakIterator

public int previous()
previous in class com.ibm.icu.text.BreakIterator

public void setText(CharacterIterator text)
setText in class com.ibm.icu.text.BreakIterator

public void setText(String newText)
setText in class com.ibm.icu.text.BreakIterator

public LaoBreakIterator clone()
clone in class com.ibm.icu.text.BreakIterator
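
The methods listed above follow the usual break-iterator traversal pattern: call setText, start at first(), and advance with next() until DONE is returned. The sketch below shows that loop using the JDK's `java.text.BreakIterator` as a stand-in so that it runs without ICU4J on the classpath; this substitution is an assumption made for illustration, and the helper name `segments` is hypothetical. With this class, the iterator would instead be a LaoBreakIterator constructed from a RuleBasedBreakIterator that carries the Lao syllable rules (not shown here), and each segment would be one syllable.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Stand-in sketch: uses the JDK BreakIterator, not com.ibm.icu.text.BreakIterator.
final class BreakIteratorUsageSketch {
    // Collects the text between consecutive boundaries reported by the iterator.
    static List<String> segments(BreakIterator it, String text) {
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // next() returns DONE once the last boundary has been passed.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        // A character iterator is used here only to keep the demo self-contained.
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.ROOT);
        System.out.println(segments(it, "abc")); // prints [a, b, c]
    }
}
```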