Java org.apache.lucene.analysis.miscellaneous.WordDelimiterGraphFilter: fields, constructors, methods

Introduction

This page lists the methods, fields and constructors of org.apache.lucene.analysis.miscellaneous.WordDelimiterGraphFilter.

The descriptions are taken from its open-source code.

Field

int GENERATE_WORD_PARTS
Causes parts of words to be generated:

"PowerShot" => "Power" "Shot"

int GENERATE_NUMBER_PARTS
Causes number subwords to be generated:

"500-42" => "500" "42"

int CATENATE_WORDS
Causes maximum runs of word parts to be catenated:

"wi-fi" => "wifi"

int CATENATE_NUMBERS
Causes maximum runs of number parts to be catenated:

"500-42" => "50042"

int CATENATE_ALL
Causes all subword parts to be catenated:

"wi-fi-4000" => "wifi4000"
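The three catenation flags differ only in which runs of subwords they join. The following is a simplified, standalone sketch of that idea in plain Java. It is not Lucene's implementation: the class name CatenateSketch and its methods are hypothetical, and it assumes subwords are separated by any non-alphanumeric character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class CatenateSketch {

    // Split a token on runs of non-alphanumeric characters, dropping empty pieces.
    static List<String> parts(String token) {
        List<String> out = new ArrayList<>();
        for (String p : token.split("[^A-Za-z0-9]+")) {
            if (!p.isEmpty()) {
                out.add(p);
            }
        }
        return out;
    }

    static boolean isNumber(String part) {
        return part.chars().allMatch(Character::isDigit);
    }

    // Analogue of CATENATE_ALL: join every subword into one string.
    static String catenateAll(String token) {
        return String.join("", parts(token));
    }

    // Analogue of CATENATE_WORDS combined with CATENATE_NUMBERS:
    // join each maximal run of same-type subwords (all words or all numbers).
    static List<String> catenateRuns(String token) {
        List<String> out = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        Boolean runIsNumber = null; // null until the first subword is seen
        for (String p : parts(token)) {
            boolean n = isNumber(p);
            if (runIsNumber != null && runIsNumber != n) {
                out.add(run.toString());
                run.setLength(0);
            }
            run.append(p);
            runIsNumber = n;
        }
        if (run.length() > 0) {
            out.add(run.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(catenateRuns("wi-fi"));
        System.out.println(catenateRuns("500-42"));
        System.out.println(catenateAll("wi-fi-4000"));
    }
}
```

In this sketch, "wi-fi-4000" yields the two runs "wifi" and "4000" when joined per type, while the catenate-all variant yields the single string "wifi4000".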

int PRESERVE_ORIGINAL
Causes original words to be preserved and added to the subword list (defaults to false):

"500-42" => "500" "42" "500-42"

int SPLIT_ON_CASE_CHANGE
Causes lowercase -> uppercase transition to start a new subword.

int SPLIT_ON_NUMERICS
If not set, causes numeric changes to be ignored (subwords will only be generated given SUBWORD_DELIM tokens).

int STEM_ENGLISH_POSSESSIVE
Causes trailing "'s" to be removed for each subword:

"O'Neil's" => "O" "Neil"
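The splitting rules described by the fields above can be illustrated with a small standalone sketch in plain Java. This is a simplified approximation under stated assumptions, not Lucene's actual algorithm: the class SplitSketch and its boolean parameters are hypothetical stand-ins for the corresponding flags, and any character that is not a letter or digit is treated as a delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class SplitSketch {

    static List<String> split(String token, boolean splitOnCaseChange,
                              boolean splitOnNumerics, boolean stemPossessive) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                // Delimiter. If it starts a trailing "'s" (at the end of the token
                // or right before another delimiter), optionally swallow the 's'.
                if (stemPossessive && c == '\'' && i + 1 < token.length()
                        && (token.charAt(i + 1) == 's' || token.charAt(i + 1) == 'S')
                        && (i + 2 == token.length()
                            || !Character.isLetterOrDigit(token.charAt(i + 2)))) {
                    i++; // skip the possessive 's'
                }
                flush(cur, out);
                continue;
            }
            if (cur.length() > 0) {
                char prev = cur.charAt(cur.length() - 1);
                // lowercase -> uppercase transition
                boolean caseChange = Character.isLowerCase(prev) && Character.isUpperCase(c);
                // letter <-> digit transition
                boolean numericChange = Character.isDigit(prev) != Character.isDigit(c);
                if ((splitOnCaseChange && caseChange) || (splitOnNumerics && numericChange)) {
                    flush(cur, out);
                }
            }
            cur.append(c);
        }
        flush(cur, out);
        return out;
    }

    // Emit the current subword, if any, and reset the buffer.
    private static void flush(StringBuilder cur, List<String> out) {
        if (cur.length() > 0) {
            out.add(cur.toString());
            cur.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("PowerShot", true, true, true));
        System.out.println(split("500-42", true, true, true));
        System.out.println(split("O'Neil's", true, true, true));
    }
}
```

Turning a parameter off shows what the matching flag contributes: with splitOnCaseChange set to false the sketch leaves "PowerShot" as a single subword, and with stemPossessive set to false the trailing "s" of "O'Neil's" becomes a subword of its own.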

Constructor

WordDelimiterGraphFilter(TokenStream in, int configurationFlags, CharArraySet protWords)
Creates a new WordDelimiterGraphFilter using WordDelimiterIterator#DEFAULT_WORD_DELIM_TABLE as its charTypeTable
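Because every option is an int field and the constructor takes a single int configurationFlags, the options are combined with bitwise OR. The sketch below shows that pattern in plain Java without depending on Lucene. The class FlagsSketch is hypothetical, and the numeric values are placeholders chosen as distinct powers of two; real code should reference the constants on WordDelimiterGraphFilter itself rather than copy numbers.

```java
// Hypothetical illustration only; not part of Lucene.
final class FlagsSketch {

    // Placeholder values (assumption): distinct bits so flags can be OR-ed together.
    static final int GENERATE_WORD_PARTS = 1;
    static final int GENERATE_NUMBER_PARTS = 1 << 1;
    static final int CATENATE_WORDS = 1 << 2;
    static final int PRESERVE_ORIGINAL = 1 << 5;
    static final int SPLIT_ON_CASE_CHANGE = 1 << 6;

    // True if the given flag bit is set in the combined configuration.
    static boolean has(int configurationFlags, int flag) {
        return (configurationFlags & flag) != 0;
    }

    public static void main(String[] args) {
        int configurationFlags = GENERATE_WORD_PARTS
                | GENERATE_NUMBER_PARTS
                | SPLIT_ON_CASE_CHANGE
                | PRESERVE_ORIGINAL;

        // With Lucene on the classpath, the combined value would be passed to the
        // constructor listed above, along these lines (in and protWords supplied by the caller):
        //   new WordDelimiterGraphFilter(in, configurationFlags, protWords);

        System.out.println(has(configurationFlags, PRESERVE_ORIGINAL));
        System.out.println(has(configurationFlags, CATENATE_WORDS));
    }
}
```

Keeping the configuration in one int makes it cheap to store and compare, at the cost of readability; naming the combined value after its purpose (for example an index-time versus query-time configuration) helps.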