Example usage for the Java class org.apache.mahout.vectorizer.DocumentProcessor: fields, constructors, methods, and how to implement or subclass it
The descriptions below are taken from its open-source code.
String | TOKENIZED_DOCUMENT_OUTPUT_FOLDER |
String | ANALYZER_CLASS |
void | tokenizeDocuments(Path input, Class<? extends Analyzer> analyzerClass, Path output, Configuration baseConf) Converts the input documents into token arrays using StringTuple. The input documents must be in the org.apache.hadoop.io.SequenceFile format. |
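To make the "documents into token arrays" step concrete, here is a minimal JDK-only sketch of the same data shape: each document id maps to raw text on the way in and to an ordered list of tokens on the way out. It is not Mahout code. The names `TokenizeSketch`, `tokenize`, and the map-based `tokenizeDocuments` are hypothetical, the in-memory maps stand in for the SequenceFile input and output, and the lowercase-and-split rule is only an assumed stand-in for whatever Analyzer class is passed as `analyzerClass`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: mimics the input/output shape of
// tokenizeDocuments with plain collections instead of SequenceFiles.
final class TokenizeSketch {

    // Assumed stand-in for an Analyzer: lowercase the text, then split on
    // every run of characters that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!piece.isEmpty()) { // skip the empty piece a leading delimiter produces
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Maps each document id to its token list, preserving input order.
    // The token list plays the role that StringTuple plays in the real output.
    static Map<String, List<String>> tokenizeDocuments(Map<String, String> documents) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : documents.entrySet()) {
            result.put(entry.getKey(), tokenize(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "Mahout tokenizes documents.");
        docs.put("doc2", "Each value becomes a token list!");
        System.out.println(tokenizeDocuments(docs));
    }
}
```

With the real class, the equivalent step would be a call to `tokenizeDocuments` with the four arguments listed in the signature above: an input `Path` pointing at SequenceFile data, an Analyzer class, an output `Path`, and a Hadoop `Configuration`.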