Example usage for the org.apache.lucene.analysis.ngram.NGramTokenizer constructor

Introduction

This page collects usage examples for the constructor of org.apache.lucene.analysis.ngram.NGramTokenizer.

Prototype

NGramTokenizer(AttributeFactory factory, int minGram, int maxGram, boolean edgesOnly) 

Usage

From source file: NGramAnalyzer.java

@Override
protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // minGram == maxGram == n, so only n-grams of exactly n characters are emitted.
    // n is a field of the enclosing analyzer class.
    Tokenizer source = new NGramTokenizer(Version.LUCENE_46, reader, n, n);

    // No token filters are chained, so the tokenizer is also the end of the stream.
    return new TokenStreamComponents(source, source);
}
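
To make the effect of the minGram and maxGram arguments concrete, here is a minimal plain-Java sketch of character n-gram generation. It does not use Lucene and is not Lucene's implementation: the class NGramSketch and its ngrams method are hypothetical names chosen for illustration, and the sketch works on UTF-16 chars rather than code points. It assumes grams are listed by increasing start offset, then by increasing length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class NGramSketch {

    /**
     * Returns every substring of text whose length lies in [minGram, maxGram],
     * ordered by start offset and then by length.
     */
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            // Stop growing the gram once it would run past the end of the text.
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                grams.add(text.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Same shape as the analyzer above: minGram == maxGram == 2.
        System.out.println(ngrams("abcd", 2, 2)); // prints [ab, bc, cd]
    }
}
```

Passing the same value for both bounds, as the analyzer above does with n, yields grams of a single fixed length.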

From source file: org.elasticsearch.index.analysis.NGramTokenizerFactory.java

License: Apache License

@SuppressWarnings("deprecation")
@Override
public Tokenizer create(Reader reader) {
    if (version.onOrAfter(Version.LUCENE_43) && esVersion.onOrAfter(org.elasticsearch.Version.V_0_90_2)) {
        /*
         * The new tokenizer was added in 0.90.2, but 0.90.1 already used LUCENE_43, so we cannot rely on the Lucene version alone.
         * If somebody uses 0.90.2 or higher with an earlier Lucene version, we should still use the deprecated tokenizer.
         */
        final Version version = this.version == Version.LUCENE_43 ? Version.LUCENE_44 : this.version; // always use 4.4 or higher
        if (matcher == null) {
            return new NGramTokenizer(version, reader, minGram, maxGram);
        } else {
            return new NGramTokenizer(version, reader, minGram, maxGram) {
                @Override
                protected boolean isTokenChar(int chr) {
                    return matcher.isTokenChar(chr);
                }
            };
        }
    } else {
        return new Lucene43NGramTokenizer(reader, minGram, maxGram);
    }
}
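
The anonymous subclass above overrides isTokenChar so that only characters accepted by matcher can appear in a gram. The following plain-Java sketch illustrates that idea under stated assumptions: it does not use Lucene, TokenCharNGramSketch and its methods are hypothetical names, and it assumes that rejected characters act as separators, so no gram spans one of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration class; not part of Lucene or Elasticsearch.
class TokenCharNGramSketch {

    /**
     * Splits text into runs of characters accepted by isTokenChar and
     * collects the n-grams of each run separately.
     */
    static List<String> ngrams(String text, int minGram, int maxGram, IntPredicate isTokenChar) {
        List<String> grams = new ArrayList<>();
        int runStart = 0;
        for (int i = 0; i <= text.length(); i++) {
            boolean atEnd = i == text.length();
            if (atEnd || !isTokenChar.test(text.charAt(i))) {
                // A rejected character (or the end of input) closes the current run.
                addGrams(text.substring(runStart, i), minGram, maxGram, grams);
                runStart = i + 1;
            }
        }
        return grams;
    }

    // Appends all grams of one run, ordered by start offset and then by length.
    private static void addGrams(String run, int minGram, int maxGram, List<String> out) {
        for (int start = 0; start < run.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= run.length(); len++) {
                out.add(run.substring(start, start + len));
            }
        }
    }

    public static void main(String[] args) {
        // With a letters-only predicate, the space splits the input into "ab" and "cd".
        System.out.println(ngrams("ab cd", 2, 2, Character::isLetter)); // prints [ab, cd]
    }
}
```

With a predicate that accepts every character, the same input also yields the grams that include the space, which corresponds to the matcher == null branch where isTokenChar is not overridden.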