Example usage for opennlp.tools.doccat DoccatModel serialize

Introduction

This page collects example usage of DoccatModel.serialize from the opennlp.tools.doccat package.

Prototype

@SuppressWarnings("unchecked")
public final void serialize(OutputStream out) throws IOException 

Document

Serializes the model to the given OutputStream.

Usage

From source file: com.tamingtext.classifier.maxent.TrainMaxent.java

public void train(String source, String destination) throws IOException {
    //<start id="maxent.examples.train.setup"/> 
    File[] inputFiles = FileUtil.buildFileList(new File(source));
    File modelFile = new File(destination);

    Tokenizer tokenizer = SimpleTokenizer.INSTANCE; //<co id="tm.tok"/>
    CategoryDataStream ds = new CategoryDataStream(inputFiles, tokenizer);

    int cutoff = 5;
    int iterations = 100;
    NameFinderFeatureGenerator nffg //<co id="tm.fg"/>
            = new NameFinderFeatureGenerator();
    BagOfWordsFeatureGenerator bowfg = new BagOfWordsFeatureGenerator();

    DoccatModel model = DocumentCategorizerME.train("en", ds, cutoff, iterations, nffg, bowfg); //<co id="tm.train"/>
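    // Write the model inside try-with-resources so the file stream is always closed.
    try (FileOutputStream out = new FileOutputStream(modelFile)) {
        model.serialize(out);
    }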

    /*<calloutlist>
    <callout arearefs="tm.tok">Create data stream</callout>
    <callout arearefs="tm.fg">Set up features generators</callout> 
    <callout arearefs="tm.train">Train categorizer</callout>  
    </calloutlist>*/
    //<end id="maxent.examples.train.setup"/>
}
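
Because serialize takes a caller-supplied OutputStream, the caller decides where the bytes go and when the stream is closed. The following is a minimal sketch of that pattern using only the Java standard library. StreamSerializable and ModelFiles are hypothetical names introduced for illustration and are not part of OpenNLP; the interface simply mirrors the shape of the serialize(OutputStream) prototype shown above.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for any object that exposes serialize(OutputStream). */
interface StreamSerializable {
    void serialize(OutputStream out) throws IOException;
}

/** Hypothetical helper; not part of OpenNLP. */
final class ModelFiles {
    private ModelFiles() {}

    /** Writes the model to target and closes the stream even if serialize throws. */
    static void save(StreamSerializable model, Path target) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            model.serialize(out);
        }
    }
}
```

With OpenNLP on the classpath, a call along the lines of ModelFiles.save(model::serialize, modelFile.toPath()) should fit this helper, since the method reference matches the prototype's signature. Treat that as an assumption to verify against the OpenNLP version in use.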