List of usage examples for org.apache.lucene.analysis.cn.smart WordType NUMBER
int NUMBER
To view the source code for org.apache.lucene.analysis.cn.smart WordType NUMBER.
Click Source Link
From source file:com.churvey.graduate.chinese.WordSegmenter.java
License:Apache License
/** * Process a {@link SegToken} so that it is ready for indexing. * //from ww w. j av a 2s .co m * This method calculates offsets and normalizes the token with {@link SegTokenFilter}. * * @param st input {@link SegToken} * @param sentence associated Sentence * @param sentenceStartOffset offset into sentence * @return Lucene {@link SegToken} */ public SegToken convertSegToken(SegToken st, String sentence, int sentenceStartOffset) { switch (st.wordType) { case WordType.STRING: case WordType.NUMBER: case WordType.FULLWIDTH_NUMBER: case WordType.FULLWIDTH_STRING: st.charArray = sentence.substring(st.startOffset, st.endOffset).toCharArray(); break; default: break; } st = tokenFilter.filter(st); st.startOffset += sentenceStartOffset; st.endOffset += sentenceStartOffset; return st; }