Java edu.stanford.nlp.process.PTBTokenizer: constructors and methods

Introduction

This page lists the constructors and methods of edu.stanford.nlp.process.PTBTokenizer.

The descriptions are taken from the comments in its open-source code.

Constructor

PTBTokenizer(final Reader r, final LexedTokenFactory<T> tokenFactory, final String options)
Constructs a new PTBTokenizer with a custom LexedTokenFactory.
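
The constructor above can be exercised with a short sketch like the following. It assumes the Stanford CoreNLP jar is on the classpath, uses CoreLabelTokenFactory as the LexedTokenFactory, and passes an empty options string to get the default behavior. The class name PtbConstructorSketch and the helper tokenize are hypothetical names chosen for illustration.

```java
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class PtbConstructorSketch {

    // Hypothetical helper: tokenizes a string and collects the token text.
    static List<String> tokenize(String text) {
        // An empty options string requests the tokenizer's defaults.
        PTBTokenizer<CoreLabel> tokenizer =
            new PTBTokenizer<>(new StringReader(text), new CoreLabelTokenFactory(), "");
        List<String> words = new ArrayList<>();
        // PTBTokenizer is an Iterator over its tokens.
        while (tokenizer.hasNext()) {
            words.add(tokenizer.next().word());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world."));
    }
}
```

Because the token type is fixed by the factory argument, swapping in a different LexedTokenFactory changes the element type of the tokenizer without changing the loop.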

Method

TokenizerFactory<Word> factory()
This is a historical factory method whose tokenizers return Word tokens.
TokenizerFactory<CoreLabel> factory(boolean tokenizeNLs, boolean invertible)
<T extends HasWord> TokenizerFactory<T> factory(LexedTokenFactory<T> factory, String options)
Get a TokenizerFactory that does Penn Treebank tokenization.
String getNewlineToken()
Returns the string literal inserted for newlines when the -tokenizeNLs option is set.
PTBTokenizer<Word> newPTBTokenizer(Reader r)
Constructs a new PTBTokenizer that returns Word tokens and which treats carriage returns as normal whitespace.
PTBTokenizer<CoreLabel> newPTBTokenizer(Reader r, boolean tokenizeNLs, boolean invertible)
Constructs a new PTBTokenizer that makes CoreLabel tokens.
String ptb2Text(String ptbText)
Returns a presentable version of the given PTB-tokenized text.
String ptb2Text(List<String> ptbWords)
Returns a presentable version of the given PTB-tokenized words.
String ptbToken2Text(String ptbText)
Returns a presentable version of a given PTB token.
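
The static methods listed above can be combined as in the sketch below. It assumes the Stanford CoreNLP jar is on the classpath and that the listed methods are static, as they are in the open-source class. The class name PtbStaticMethodsSketch and both helper methods are hypothetical names used only for illustration, and no particular output is claimed for the text-restoring calls.

```java
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;
import edu.stanford.nlp.process.TokenizerFactory;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class PtbStaticMethodsSketch {

    // Hypothetical helper: tokenizes via factory(LexedTokenFactory, String).
    static List<String> tokenizeWithFactory(String text) {
        TokenizerFactory<CoreLabel> factory =
            PTBTokenizer.factory(new CoreLabelTokenFactory(), "");
        Tokenizer<CoreLabel> tokenizer = factory.getTokenizer(new StringReader(text));
        List<String> words = new ArrayList<>();
        for (CoreLabel token : tokenizer.tokenize()) {
            words.add(token.word());
        }
        return words;
    }

    // Hypothetical helper: keeps newlines as tokens (tokenizeNLs = true).
    static List<String> tokenizeKeepingNewlines(String text) {
        PTBTokenizer<CoreLabel> tokenizer =
            PTBTokenizer.newPTBTokenizer(new StringReader(text), true, false);
        List<String> words = new ArrayList<>();
        while (tokenizer.hasNext()) {
            words.add(tokenizer.next().word());
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = tokenizeWithFactory("Hello, world.");
        System.out.println(words);

        // Turn the token list back into presentable text.
        System.out.println(PTBTokenizer.ptb2Text(words));

        // Convert a single PTB-escaped token back to its surface form.
        System.out.println(PTBTokenizer.ptbToken2Text("-LRB-"));

        // With tokenizeNLs enabled, each newline shows up as the newline token.
        List<String> withNewlines = tokenizeKeepingNewlines("First line.\nSecond line.");
        System.out.println(withNewlines.contains(PTBTokenizer.getNewlineToken()));
    }
}
```

Going through a TokenizerFactory is mainly useful when the same configuration has to be applied to many readers; for a single input, newPTBTokenizer or the constructor is shorter.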