org.apache.lucene.analysis.pt
Class RSLPStemmerBase

java.lang.Object
  extended by org.apache.lucene.analysis.pt.RSLPStemmerBase
Direct Known Subclasses:
GalicianMinimalStemmer, GalicianStemmer, PortugueseMinimalStemmer, PortugueseStemmer

public abstract class RSLPStemmerBase
extends Object

Base class for stemmers that use a set of RSLP-like stemming steps.

RSLP (Removedor de Sufixos da Língua Portuguesa) is an algorithm originally designed for stemming Portuguese, described in the paper "A Stemming Algorithm for the Portuguese Language" by Orengo et al.

Since then, a plural-only modification (RSLP-S) and a modification for the Galician language have been implemented. This class parses a configuration file that describes RSLPStemmerBase.Steps, where each Step contains an ordered list of RSLPStemmerBase.Rules.

The general rule format is:

{ "suffix", N, "replacement", { "exception1", "exception2", ...}}
where:

A step is an ordered list of rules, with a structure in this format:

{ "name", N, B, { "cond1", "cond2", ... } ... rules ... };
where:

See Also:
RSLP description
NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
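
The rule and step formats above can be read as a small data model. The following is a minimal sketch, not the Lucene implementation: the names RslpRuleSketch, Rule, Step and pluralStep(), and the sample rules, are all hypothetical. It assumes that a rule's N is the minimum stem length left after the suffix is removed, that a step's N is a minimum word length, that B = 1 matches exceptions as whole words while B = 0 matches them as word endings, and that the conditions are endings a word must have to enter the step at all.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; not the Lucene implementation. */
final class RslpRuleSketch {

    /** One rule: { "suffix", N, "replacement", { "exception1", ... } }. */
    static final class Rule {
        final String suffix;
        final int minStemLength;      // assumed meaning of the rule's N
        final String replacement;
        final List<String> exceptions;

        Rule(String suffix, int minStemLength, String replacement, String... exceptions) {
            this.suffix = suffix;
            this.minStemLength = minStemLength;
            this.replacement = replacement;
            this.exceptions = Arrays.asList(exceptions);
        }

        /** Returns the rewritten word, or null when the rule does not apply. */
        String apply(String word, boolean wholeWordExceptions) {
            int stemLength = word.length() - suffix.length();
            if (!word.endsWith(suffix) || stemLength < minStemLength) {
                return null;
            }
            for (String exception : exceptions) {
                // Assumed: B = 1 compares whole words, B = 0 compares endings.
                if (wholeWordExceptions ? word.equals(exception) : word.endsWith(exception)) {
                    return null;
                }
            }
            return word.substring(0, stemLength) + replacement;
        }
    }

    /** One step: { "name", N, B, { "cond1", ... } ... rules ... }. */
    static final class Step {
        final String name;
        final int minWordLength;            // assumed meaning of the step's N
        final boolean wholeWordExceptions;  // assumed meaning of B
        final List<String> conditions;
        final List<Rule> rules;

        Step(String name, int minWordLength, boolean wholeWordExceptions,
             List<String> conditions, List<Rule> rules) {
            this.name = name;
            this.minWordLength = minWordLength;
            this.wholeWordExceptions = wholeWordExceptions;
            this.conditions = conditions;
            this.rules = rules;
        }

        /** Applies the first rule that matches; otherwise returns the word unchanged. */
        String apply(String word) {
            if (word.length() < minWordLength) {
                return word;
            }
            if (!conditions.isEmpty() && !endsWithAny(word, conditions)) {
                return word;
            }
            for (Rule rule : rules) {
                String result = rule.apply(word, wholeWordExceptions);
                if (result != null) {
                    return result;
                }
            }
            return word;
        }
    }

    static boolean endsWithAny(String word, List<String> endings) {
        for (String ending : endings) {
            if (word.endsWith(ending)) {
                return true;
            }
        }
        return false;
    }

    /** A hand-written sample step; the rules are illustrative, not Lucene's data. */
    static Step pluralStep() {
        return new Step("Plural", 3, true,
            Collections.singletonList("s"),
            Arrays.asList(
                new Rule("ns", 1, "m"),
                new Rule("s", 2, "", "mais", "cais")));
    }

    public static void main(String[] args) {
        Step plural = pluralStep();
        for (String word : new String[] {"bons", "casas", "mais", "os"}) {
            System.out.println(word + " -> " + plural.apply(word));
        }
    }
}
```

Under these assumptions the sample step rewrites "bons" to "bom" and "casas" to "casa", leaves "mais" alone because it is listed as a whole-word exception, and skips "os" because it is shorter than the step's minimum word length.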

Nested Class Summary
protected static class RSLPStemmerBase.Rule
          A basic rule, with no exceptions.
protected static class RSLPStemmerBase.RuleWithSetExceptions
          A rule with a set of whole-word exceptions.
protected static class RSLPStemmerBase.RuleWithSuffixExceptions
          A rule with a set of exceptional suffixes.
protected static class RSLPStemmerBase.Step
          A step containing a list of rules.
 
Constructor Summary
RSLPStemmerBase()
           
 
Method Summary
protected static Map<String,RSLPStemmerBase.Step> parse(Class<? extends RSLPStemmerBase> clazz, String resource)
          Parse a resource file into an RSLP stemmer description.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RSLPStemmerBase

public RSLPStemmerBase()
Method Detail

parse

protected static Map<String,RSLPStemmerBase.Step> parse(Class<? extends RSLPStemmerBase> clazz,
                                                        String resource)
Parse a resource file into an RSLP stemmer description.

Returns:
a Map containing the named Steps in this description.
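
To illustrate what turning a description into named steps might involve, here is a hypothetical, simplified parser. It is a sketch under stated assumptions rather than Lucene's parser, which may differ: it reads from a String instead of a class-path resource, does not handle comments, treats commas as optional separators, and infers its grammar from the two format lines in the class description. RslpParseSketch, ParsedStep and ParsedRule are illustrative names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the Lucene parser. */
final class RslpParseSketch {

    static final class ParsedRule {
        final String suffix;
        final int min;
        final String replacement;
        final List<String> exceptions;

        ParsedRule(String suffix, int min, String replacement, List<String> exceptions) {
            this.suffix = suffix;
            this.min = min;
            this.replacement = replacement;
            this.exceptions = exceptions;
        }
    }

    static final class ParsedStep {
        final String name;
        final int min;
        final boolean wholeWord;
        final List<String> conditions;
        final List<ParsedRule> rules;

        ParsedStep(String name, int min, boolean wholeWord,
                   List<String> conditions, List<ParsedRule> rules) {
            this.name = name;
            this.min = min;
            this.wholeWord = wholeWord;
            this.conditions = conditions;
            this.rules = rules;
        }
    }

    // Quoted strings, integers, braces and semicolons; everything else is skipped.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|([0-9]+)|([{};])");

    private final List<String> tokens = new ArrayList<>();
    private int pos;

    private RslpParseSketch(String text) {
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            // String tokens keep a leading quote so "{" cannot be confused with {.
            tokens.add(m.group(1) != null ? "\"" + m.group(1) : m.group());
        }
    }

    /** Parses every step in the text, keyed by step name in order of appearance. */
    static Map<String, ParsedStep> parse(String text) {
        RslpParseSketch parser = new RslpParseSketch(text);
        Map<String, ParsedStep> steps = new LinkedHashMap<>();
        while (parser.pos < parser.tokens.size()) {
            ParsedStep step = parser.step();
            steps.put(step.name, step);
        }
        return steps;
    }

    private ParsedStep step() {
        expect("{");
        String name = string();
        int min = integer();
        boolean wholeWord = integer() == 1;
        List<String> conditions = stringList();
        List<ParsedRule> rules = new ArrayList<>();
        while (peek("{")) {
            rules.add(rule());
        }
        expect("}");
        expect(";");
        return new ParsedStep(name, min, wholeWord, conditions, rules);
    }

    private ParsedRule rule() {
        expect("{");
        String suffix = string();
        int min = integer();
        String replacement = isString() ? string() : "";
        List<String> exceptions = peek("{") ? stringList() : new ArrayList<String>();
        expect("}");
        return new ParsedRule(suffix, min, replacement, exceptions);
    }

    private List<String> stringList() {
        expect("{");
        List<String> out = new ArrayList<>();
        while (isString()) {
            out.add(string());
        }
        expect("}");
        return out;
    }

    private boolean peek(String t) { return pos < tokens.size() && tokens.get(pos).equals(t); }

    private boolean isString() { return pos < tokens.size() && tokens.get(pos).startsWith("\""); }

    private void expect(String t) {
        if (!peek(t)) throw new IllegalArgumentException("expected " + t + " at token " + pos);
        pos++;
    }

    private String string() {
        if (!isString()) throw new IllegalArgumentException("expected string at token " + pos);
        return tokens.get(pos++).substring(1);
    }

    private int integer() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("unexpected end of input");
        return Integer.parseInt(tokens.get(pos++));
    }
}
```

A caller would look up each step by name in the returned map, for example steps.get("Plural"), and apply the steps in whatever order its stemming algorithm requires.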