Example usage of org.apache.lucene.analysis.util.CharTokenizer subclasses

Introduction

This page lists examples of classes that extend org.apache.lucene.analysis.util.CharTokenizer.

Usage

From source file cn.edu.scut.patent.ICTCLASAnalyzer.ICTCLASTokenizer.java

/**
 * WhiteSpaceTokenizer.java?
 */
public final class ICTCLASTokenizer extends CharTokenizer {

    /**

From source file com.b2international.index.analyzer.CharMatcherTokenizer.java

/**
 * A variant of {@link CharTokenizer} which splits tokens according to the specified {@link CharMatcher},
 * converting characters to lower case in the normalization step.
 */
public class CharMatcherTokenizer extends CharTokenizer {

From source file com.b2international.index.analyzer.DelimiterTokenizer.java

/**
 * A character-oriented tokenizer which splits tokens on whitespace and delimiters enumerated in
 * {@link IndexUtils#DELIMITERS}, and also converts characters to lower case in the normalization phase.
 * 
 */
public class DelimiterTokenizer extends CharTokenizer {
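
The snippets on this page are truncated before their method bodies. As a rough illustration of the pattern the Javadoc describes (a per-character test that decides where tokens end, plus a per-character normalization step), here is a minimal plain-Java sketch. It is a stand-in that does not depend on Lucene and is not the code of any class listed here. The names SketchCharTokenizer and SketchDelimiterTokenizer are hypothetical, and so is the delimiter string, because the actual value of IndexUtils.DELIMITERS is not shown in the source.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java stand-in for the two hooks described above; not Lucene code. */
abstract class SketchCharTokenizer {

    /** Returns true if the code point belongs inside a token. */
    protected abstract boolean isTokenChar(int c);

    /** Per-character normalization hook; identity by default. */
    protected int normalize(int c) {
        return c;
    }

    /** Splits text into maximal runs of token characters, normalizing each one. */
    public final List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (isTokenChar(c)) {
                current.appendCodePoint(normalize(c));
            } else if (current.length() > 0) {
                // A non-token character closes the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}

/** Splits on whitespace and a hypothetical delimiter set, lower-casing as it goes. */
class SketchDelimiterTokenizer extends SketchCharTokenizer {

    // Hypothetical delimiters; the real IndexUtils.DELIMITERS value is not shown in the source.
    private static final String DELIMITERS = ",.;:-_/()[]";

    @Override
    protected boolean isTokenChar(int c) {
        return !Character.isWhitespace(c) && DELIMITERS.indexOf(c) < 0;
    }

    @Override
    protected int normalize(int c) {
        return Character.toLowerCase(c);
    }
}
```

With this sketch, new SketchDelimiterTokenizer().tokenize("Heart-Attack (Acute), NOS") returns [heart, attack, acute, nos].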

From source file com.berico.clavin.index.WhitespaceLowerCaseTokenizer.java

/**
 * LowerCaseTokenizer performs the function of WhitespaceTokenizer
 * and LowerCaseFilter together. It divides text at whitespace and
 * converts the tokens to lower case. While it is functionally equivalent to
 * a combination of WhitespaceTokenizer and LowerCaseFilter, there is a
 * performance advantage to doing the two tasks at once, hence this

From source file com.berico.clavin.resolver.impl.lucene.WhitespaceLowerCaseTokenizer.java

/**
 * LowerCaseTokenizer performs the function of WhitespaceTokenizer
 * and LowerCaseFilter together. It divides text at whitespace and
 * converts the tokens to lower case. While it is functionally equivalent to
 * a combination of WhitespaceTokenizer and LowerCaseFilter, there is a
 * performance advantage to doing the two tasks at once, hence this

From source file com.bericotech.clavin.index.WhitespaceLowerCaseTokenizer.java

/**
 * LowerCaseTokenizer performs the function of WhitespaceTokenizer
 * and LowerCaseFilter together. It divides text at whitespace and
 * converts the tokens to lower case. While it is functionally equivalent to
 * a combination of WhitespaceTokenizer and LowerCaseFilter, there is a
 * performance advantage to doing the two tasks at once, hence this
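
The Javadoc above is cut off, but its claim that splitting at whitespace and lower-casing can be done in one step or two with the same result can be illustrated with a small plain-Java sketch. This is an assumption-laden stand-in, not the Clavin or Lucene implementation: the class WhitespaceLowerCaseSketch and its methods are hypothetical, whitespace is decided by Character.isWhitespace, and lower-casing is done per code point with Character.toLowerCase. It shows only that both routes yield the same tokens; it does not measure the performance advantage the comment mentions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; compares one-pass and two-pass tokenization. */
final class WhitespaceLowerCaseSketch {

    /** Two passes: split at whitespace, then lower-case every token. */
    static List<String> splitThenLowerCase(String text) {
        List<String> lowered = new ArrayList<>();
        for (String token : splitAtWhitespace(text, false)) {
            StringBuilder sb = new StringBuilder();
            token.codePoints().forEach(c -> sb.appendCodePoint(Character.toLowerCase(c)));
            lowered.add(sb.toString());
        }
        return lowered;
    }

    /** One pass: lower-case each character while it is appended to the current token. */
    static List<String> splitAndLowerCase(String text) {
        return splitAtWhitespace(text, true);
    }

    /** Collects maximal runs of non-whitespace code points, optionally lower-casing them. */
    private static List<String> splitAtWhitespace(String text, boolean lowerCase) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (!Character.isWhitespace(c)) {
                current.appendCodePoint(lowerCase ? Character.toLowerCase(c) : c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

For the input "New  York\tCITY", both splitThenLowerCase and splitAndLowerCase return [new, york, city].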

From source file com.globalsight.ling.lucene.analysis.ru.RussianLetterTokenizer.java

/**
 * A RussianLetterTokenizer is a tokenizer that extends
 * LetterTokenizer by additionally looking up letters in a given
 * "russian charset". The problem with LetterTokenizer is that it uses
 * Character.isLetter() method, which doesn't know how to detect
 * letters in encodings like CP1252 and KOI8 (well-known problems with

From source file com.searchcode.app.util.CodeAnalyzer.java

final class CodeTokenizer extends CharTokenizer {
    public CodeTokenizer() {
    }

    public CodeTokenizer(AttributeFactory factory) {
        super(factory);

From source file de.uop.code.disambiguation.lucene.DoserStandardTokenizer.java

public final class DoserStandardTokenizer extends CharTokenizer {

    /**
     * Construct a new WhitespaceTokenizer.
     *
     * @param matchVersion Lucene version to match. See
     *        {@link <a href="#version">above</a>}
     * 

From source file doser.lucene.analysis.DoserStandardTokenizer.java

public final class DoserStandardTokenizer extends CharTokenizer {

    /**
     * Construct a new WhitespaceTokenizer using a given
     * {@link org.apache.lucene.util.AttributeSource.AttributeFactory}.
     *