Example usage for edu.stanford.nlp.process WordToSentenceProcessor WordToSentenceProcessor

Introduction

This page shows example usage of the WordToSentenceProcessor constructor from the edu.stanford.nlp.process package.

Prototype

public WordToSentenceProcessor(Set<String> boundaryToDiscard) 

Document

Sets the strings that mark the end of a sentence. Each such boundary token ends the current sentence and is then discarded rather than kept in the output.
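To make the "mark the end of a sentence and then discard" behaviour concrete, here is a minimal sketch in plain Java that does not use CoreNLP at all. The class name BoundaryDiscardSketch, the method splitAndDiscard, and the marker string "*NL*" are illustrative assumptions, not part of the library; the real WordToSentenceProcessor applies additional rules that are not modelled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not the CoreNLP implementation.
public class BoundaryDiscardSketch {

    // Splits tokens into sentences at every token found in boundaryToDiscard.
    // The boundary token itself is dropped and never appears in the output.
    static List<List<String>> splitAndDiscard(List<String> tokens, Set<String> boundaryToDiscard) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            if (boundaryToDiscard.contains(token)) {
                // Close the current sentence, skipping empty ones.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(token);
            }
        }
        // Flush any trailing tokens that were not followed by a boundary.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        // "*NL*" is an assumed placeholder marker chosen for this example.
        List<String> tokens = List.of("Hello", "world", "*NL*", "Second", "line", "*NL*");
        System.out.println(splitAndDiscard(tokens, Set.of("*NL*")));
        // prints [[Hello, world], [Second, line]]
    }
}
```

The boundary markers end a sentence but are absent from the result, which is the behaviour the description above attributes to the boundaryToDiscard set.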

Usage

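From source file: org.exist.xquery.corenlp.Tokenize.java

Note: this example calls the constructor overload that takes a NewlineIsSentenceBreak value, not the Set<String> prototype shown above.

License: Open Source License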

private void tokenizeString(String text, final OutDocType outputFormat) {
    PTBTokenizer<CoreLabel> tokenizer = PTBTokenizer.newPTBTokenizer(new StringReader(text), tokenizeNLs, true);
    cachedTokenizer = tokenizer;
    List<CoreLabel> tokens = tokenizer.tokenize();
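    // Group the tokens into sentences, treating two consecutive newlines as a sentence break.
    // The explicit type argument avoids a raw-type (unchecked) warning.
    List<List<CoreLabel>> sentences = new WordToSentenceProcessor<CoreLabel>(
            WordToSentenceProcessor.NewlineIsSentenceBreak.TWO_CONSECUTIVE).wordsToSentences(tokens);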
    createSpreadsheet(sentences, tokens, outputFormat);
}