Example usage for opennlp.tools.doccat DocumentCategorizerME train

Introduction

This page shows an example of using the train method of opennlp.tools.doccat.DocumentCategorizerME.

Prototype

public static DoccatModel train(String languageCode, ObjectStream<DocumentSample> samples,
            TrainingParameters mlParams, DoccatFactory factory) throws IOException 

Usage

From source file: io.learningbox.controller.APIController.java

@RequestMapping(value = "/categorize/{area}", method = RequestMethod.POST)
public SortedMap<Double, Set<String>> categorize(@PathVariable final String area, @RequestBody String input)
        throws IOException {
    // Load the training samples stored for the requested area.
    List<LearningSet> l = repository.findByArea(area);
    final Iterator<LearningSet> sets = l.iterator();

    ObjectStream<DocumentSample> stream = new ObjectStream<DocumentSample>() {

        @Override
        public DocumentSample read() throws IOException {
            if (sets.hasNext()) {
                LearningSet s = sets.next();

                return new DocumentSample(s.getCategory(), s.getText());
            }
            return null; // null signals that the stream is exhausted
        }

        @Override
        public void reset() throws IOException, UnsupportedOperationException {
            throw new UnsupportedOperationException();
        }

        @Override
        public void close() throws IOException {
            //Do nothing
        }
    };

    // Start from the default training parameters, then override the
    // iteration count and the feature cutoff.
    TrainingParameters trainingParameters = TrainingParameters.defaultParams();
    trainingParameters.put(TrainingParameters.ITERATIONS_PARAM, Integer.toString(1000));
    trainingParameters.put(TrainingParameters.CUTOFF_PARAM, Integer.toString(1));

    // Train a new model from the stored samples on every request, then score the input.
    DoccatModel model = DocumentCategorizerME.train("en", stream, trainingParameters, new DoccatFactory());
    DocumentCategorizerME myCategorizer = new DocumentCategorizerME(model);
    return myCategorizer.sortedScoreMap(input);
}
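
The anonymous stream in the example throws from reset(). Because the samples are already held in a List, a resettable variant is easy to write if a consumer ever needs a second pass. The following is only a minimal sketch of that idea: SampleStream is a hypothetical stand-in interface defined here so the code compiles without any library, not OpenNLP's ObjectStream, and the names ListStreamDemo and ListStream are likewise invented for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ListStreamDemo {

    /** Hypothetical stand-in for a read-until-null stream; not the OpenNLP interface. */
    interface SampleStream<T> {
        T read();      // next element, or null when the stream is exhausted
        void reset();  // rewind to the first element
    }

    /** Adapts an in-memory list, so reset() can simply start a new iteration. */
    static final class ListStream<T> implements SampleStream<T> {
        private final List<T> items;
        private Iterator<T> cursor;

        ListStream(List<T> items) {
            this.items = items;
            this.cursor = items.iterator();
        }

        @Override
        public T read() {
            return cursor.hasNext() ? cursor.next() : null;
        }

        @Override
        public void reset() {
            // The backing list is still in memory, so rewinding is just a fresh iterator.
            cursor = items.iterator();
        }
    }

    public static void main(String[] args) {
        ListStream<String> stream = new ListStream<>(Arrays.asList("sports", "politics"));

        // First pass: read until null.
        for (String s = stream.read(); s != null; s = stream.read()) {
            System.out.println(s);
        }

        // Second pass after reset() starts again from the first element.
        stream.reset();
        System.out.println(stream.read());
    }
}
```

The same shape could be applied to the anonymous class above by keeping a reference to the list and reassigning the iterator in reset(), assuming the list contains no null elements.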