Example usage for edu.stanford.nlp.ie.machinereading.common SimpleTokenize tokenize

List of usage examples for edu.stanford.nlp.ie.machinereading.common SimpleTokenize tokenize

Introduction

In this page you can find the example usage for edu.stanford.nlp.ie.machinereading.common SimpleTokenize tokenize.

Prototype

public static ArrayList<String> tokenize(String line) 

Source Link

Document

Basic string tokenization, skipping over white spaces

Usage

From source file:edu.washington.phrasal.feature.SentenceIdPhrasalVerbId.java

public SentenceIdPhrasalVerbId(Integer sentence_id, Integer phrasal_verb_id, FeatureGenerator fg) {
    this.sentence_id = sentence_id;
    this.sentence = fg.sentenceList.getSentence(this.sentence_id);
    this.phrasal_verb_id = phrasal_verb_id;
    this.fg = fg;
    this.sentenceTokens = SimpleTokenize.tokenize(sentence);
    this.phrasalVerbTokens = SimpleTokenize.tokenize(fg.getPhrasalVerbById(phrasal_verb_id));

    sentencePOS = tagWordTokens(sentenceTokens);
    phraseVerbPOS = tagWordTokens(phrasalVerbTokens);

    /* start index of phrase in sentence, first occurance */
    pvStartIndex = Collections.indexOfSubList(sentenceTokens, phrasalVerbTokens);
    /* end index of phrase in sentence */
    pvEndIndex = pvStartIndex + phrasalVerbTokens.size();

    /*Compute character based offsets*/
    characterOffsets = new ArrayList<Integer>();
    int offset = 0;
    for (String token : sentenceTokens) {
        characterOffsets.add(offset);//from www.  j a  v a2s.com
        offset = offset + token.length() + 1;
    }
}