Example usage for the org.apache.lucene.analysis.pattern.PatternTokenizer constructor

Introduction

This page collects usage examples for the PatternTokenizer constructor in the org.apache.lucene.analysis.pattern package.

Prototype

public PatternTokenizer(AttributeFactory factory, Pattern pattern, int group) 

Document

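Creates a new PatternTokenizer that returns tokens from the given capture group. Passing -1 as the group selects split behavior: the pattern matches the delimiters, and the text between matches becomes the tokens.

The usage examples on this page were written against Lucene 4.x, where the constructor took a Reader as its first argument: PatternTokenizer(Reader input, Pattern pattern, int group). The prototype above is the later form, which takes an AttributeFactory and receives its input through setReader instead.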
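To make the meaning of the group argument concrete, here is a minimal sketch that uses only java.util.regex. It does not call Lucene at all: the class GroupSemanticsSketch and its tokenize method are hypothetical names introduced for illustration, and the sketch only approximates the documented behavior (split mode for -1, capture-group mode otherwise, empty tokens skipped).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Lucene: mimics what the `group` argument means.
public class GroupSemanticsSketch {

    public static List<String> tokenize(String input, Pattern pattern, int group) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        if (group >= 0) {
            // Match mode: each token is the text captured by the requested group.
            while (m.find()) {
                String t = m.group(group);
                if (t != null && !t.isEmpty()) {
                    tokens.add(t);
                }
            }
        } else {
            // Split mode: the pattern matches delimiters; tokens are the text in between.
            int start = 0;
            while (m.find()) {
                if (m.start() > start) {
                    tokens.add(input.substring(start, m.start()));
                }
                start = m.end();
            }
            if (start < input.length()) {
                tokens.add(input.substring(start));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Split on '|' (group -1); the empty token between "||" is dropped.
        System.out.println(tokenize("red|green||blue", Pattern.compile("\\|"), -1));
        // prints [red, green, blue]

        // Extract the contents of single-quoted strings (group 1).
        System.out.println(tokenize("'red' 'green'", Pattern.compile("'([^']+)'"), 1));
        // prints [red, green]
    }
}
```

The first call corresponds to the pattern and group used in the CustomAnalyzer example under Usage; the second shows the alternative mode, where the pattern describes the tokens themselves rather than the separators.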

Usage

From source file: capisco.lucene.CustomAnalyzer.java

License: Apache License

@Override
protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {

    // Split the input on literal '|' characters; group -1 selects split mode.
    final PatternTokenizer pat = new PatternTokenizer(reader, Pattern.compile("\\|"), -1);
    //final StandardTokenizer src = new StandardTokenizer(getVersion(), reader);
    //src.setMaxTokenLength(maxTokenLength);
    TokenStream tok = new StandardFilter(getVersion(), pat);
    tok = new LowerCaseFilter(getVersion(), tok);
    tok = new StopFilter(getVersion(), tok, stopwords);
    return new TokenStreamComponents(pat, tok) {
        @Override
        protected void setReader(final Reader reader) throws IOException {
            //pat.setMaxTokenLength(CustomAnalyzer.this.maxTokenLength);
            super.setReader(reader);
        }
    };
}

From source file: org.elasticsearch.index.analysis.PatternTokenizerFactory.java

License: Apache License

@Override
public Tokenizer create(Reader reader) {
    return new PatternTokenizer(reader, pattern, group);
}