Example usage for edu.stanford.nlp.ie AbstractSequenceClassifier classifyFile

Introduction

This page collects usage examples for the classifyFile method of edu.stanford.nlp.ie.AbstractSequenceClassifier.

Prototype

public List<List<IN>> classifyFile(String filename) 

Document

Classify the contents of a file. The result is a list of sentences, each of which is a list of labeled tokens.

Usage

From source file:edu.rpi.tw.linkipedia.search.test.MyNLP.java

License:Open Source License

public static void main(String[] args) throws IOException {

    String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";

    if (args.length > 0) {
        serializedClassifier = args[0];
    }

    AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier
            .getClassifierNoExceptions(serializedClassifier);

    if (args.length > 1) {
        String fileContents = IOUtils.slurpFile(args[1]);
        List<List<CoreLabel>> out = classifier.classify(fileContents);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }
        out = classifier.classifyFile(args[1]);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }

    } else {
        String s1 = "AT&T has a cute TV commercial for its U-verse TV app in which the co-pilot of a space capsule rattling through a difficult re-entry tells the pilot, This is very exciting! But I'm at my stop. The flummoxed pilot responds, Come again? and the co-pilot explains, I'm watching this on the train. It's so hard to leave. Good luck with everything! The co-pilot, now dressed in business attire, emerges from a train holding the smartphone he was using to watch the movie while on his way to work. The key is that he was holding a phone. Why not a tablet, which would seem the logical choice for watching movies while on the go? The commercial underscores what the market is telling us - that the tablet - a product introduced by Apple barely four years ago - may be a high-tech hula hoop. Some industry watchers think tablets have a bright future. Research company Gartner said last week that it expects tablet sales to eclipse PC sales in 2015 and predicted that tablet sales will grow by 25 percent from this year to next. Tablets could indeed overtake PCs, which have suffered from lackluster sales for years. But the tablet market itself appears to be flattening out, as evidenced by recent reports from big tablet makers. Apple, the market leader, reported a decline in tablet shipments in its latest fiscal quarter, while No. 2 Samsung called its tablet sales sluggish.";
        String s2 = "Mississippi, has Hillary's Time Come? I read a fascinating piece in the WSJ today. The author spoke about how Obama's approval ratings are at a dismal low. He apparently appears to be unable to pull Congeress together to accomplish much, and in the author's opinion, it would only get worse if he got elected to a second term. He would like Obama to bow out for a 2nd term, and have Hillary Clinton run for president. He believes that Hillary has the best chance of bringing the country together. I don't know how she feels about running, but, IMO, I think that it is a great idea. She certainly has the smarts, the experience, and the approval of the majority of citizens. I think that the people on the Republican side of the fence, are a poor bunch. There is not a one in the bunch who would capture the imagination of the American people. The voters who will decide this election, IMO are the moderates and the independents. THEY certainly will not vote for any of the anti-gay, pro-life bible bangers. What do you think? Should Obama step down for the good of the country? Should Hillary run? @Phoenix32890, Hillary's time has come, and gone... She would make her self no friend of the Democratic party by dividing them and losing the black vote... No one is going to vote for her that did not vote for Mr. Obama, and a whole lot less Obama voters will vote for her... Don't worry though.. She has enough sense to not try to run against a sitting president from her own party... @Phoenix32890, The voters who will decide this election, IMO are the moderates and the independents. THEY certainly will not vote for any of the anti-gay, pro-life bible bangers. Not sure how you determined this factoid. Clearly most who vote for Democratic candidates would follow this voting pattern but not any Tea Baggers or typical Republicans (even those who claim the moderate social mantel). In the next election with record lows in voting, the most... "
                + "passionate voters will vote their allegedly principled candidates in. What does that mean? Independents and so-called moderates WILL LIKELY not be turning up to the polling booths in any large numbers to authenticate your thesis. @tsarstepan, Best case scenario: Obama gets reelected and the country continues to slide backwards even if the Democrats gain a greater foothold of the US Senate and increase their collective mass in the US Congress. Worst case scenario: George W. Bush II (any Republican candidate other then Huntsman, Romney, or Ron Paul) will take the election. Then that means the Republicans would likely further gain in Congress and even the Senate perhaps. Then the country slides right back into the deepest of recessions. @tsarstepan, I'm not saying I don't like the idea. The theory behind a Hilary Clinton candidacy would a great improvement over a stalled and uninspiring Obama presidency. I'm not too enthusiastic over it actually be plausible even in the best of scenarios. @tsarstepan, I suspect you are wrong on the voting inclinations of the moderates, and especially the ones you call tea baggers. Actually, I think you are wrong on the moderate/independent voter turnout, too, but I could be wrong. I'm expecting a fairly good anti Obama turnout amongst that group. I could accept Hillary. Like Phoenix, I'm not greatly impressed by the Republican field. Not crazy about the presumed Democratic candidate either. On the other hand, Obama stepping aside seems even less likely than Cain giving up. I guess that in summary, I'm not optimistic about the future of the country for the next five years.";
        //System.out.println(classifier.classifyToString(s1));
        String content = classifier.classifyWithInlineXML(s2);
        System.out.println(content);
        ArrayList<String> mentions = getMentions(content);
        int currentPosition = 0;
        ArrayList<Mention> mention_list = new ArrayList<Mention>();
        //String context = "";

        for (int i = 0; i < mentions.size(); i++) {
            String mention = mentions.get(i);
            Mention current_mention = new Mention();
            int start = s2.indexOf(mention);
            if (start < 0)
                continue; // mention text not found in the remaining input
            int end = start + mention.length();
            int globalStart = start + currentPosition;
            int globalEnd = end + currentPosition;
            //System.out.println(mention+": "+globalStart+" "+globalEnd);
            // Consume the input up to the end of this mention so the next search starts after it.
            s2 = s2.substring(end);
            currentPosition = globalEnd;
            current_mention.setMention(mention, globalStart, globalEnd);

            int contextStart = i - 3;
            int contextEnd = i + 3;

            if (contextStart < 0)
                contextStart = 0;
            if (contextEnd > mentions.size())
                contextEnd = mentions.size();

            for (int j = contextStart; j < contextEnd; j++) {
                if (i != j)
                    current_mention.addContext(mentions.get(j));
            }
            mention_list.add(current_mention);
            //System.out.println(currentPosition);
        }

        int queryId = 0;
        String docId = "DF-199-193696-586_5767";
        for (Mention m : mention_list) {
            String queryIdString = getQueryId(queryId);
            System.out.println("EL14_ENG_" + queryIdString + "," + docId + "," + m);
            queryId++;
        }

        //System.out.println(classifier.classifyToString(s2, "xml", true));
        //              int i=0;
        //              for (List<CoreLabel> lcl : classifier.classify(s2)) {
        //                 System.out.println(lcl);
        //                for (CoreLabel cl : lcl) {
        //                  //System.out.println(i++ + ":");
        //                  System.out.println(cl+" "+cl.category()+" "+cl.value());
        //                }
        //              }
        //            }
    }
}
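The example above calls a `getMentions` helper that is not shown on this page, then recovers character offsets by repeatedly slicing the input string. As a rough sketch of the same idea, the class below pulls tagged spans out of inline XML with a regular expression and locates each one in the original text by searching forward from a cursor. The class name `InlineXmlMentions`, the `Span` type, and the `extract` method are all hypothetical and are not part of Stanford NER or the linkipedia source. The sketch also assumes that entity labels are upper-case tags such as `PERSON` and that the mention text appears verbatim (unescaped) in the original string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineXmlMentions {

    /** A tagged mention with half-open character offsets [start, end) into the original text. */
    public static final class Span {
        public final String text;
        public final String label;
        public final int start;
        public final int end;

        Span(String text, String label, int start, int end) {
            this.text = text;
            this.label = label;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + " \"" + text + "\" [" + start + ", " + end + ")";
        }
    }

    // Matches <LABEL>text</LABEL>; the back-reference \1 requires the closing tag to match the opening one.
    private static final Pattern TAGGED = Pattern.compile("<([A-Z_]+)>(.*?)</\\1>", Pattern.DOTALL);

    /**
     * Extracts mentions from inline-XML output and maps each one back to the untagged original.
     * Mentions are searched for in order, starting after the previous match, so a repeated
     * mention is assigned to its next occurrence rather than its first.
     */
    public static List<Span> extract(String inlineXml, String original) {
        List<Span> spans = new ArrayList<>();
        Matcher m = TAGGED.matcher(inlineXml);
        int cursor = 0;
        while (m.find()) {
            String mention = m.group(2);
            int start = original.indexOf(mention, cursor);
            if (start < 0) {
                continue; // assumption violated: mention text not found verbatim, so skip it
            }
            int end = start + mention.length();
            spans.add(new Span(mention, m.group(1), start, end));
            cursor = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        String original = "I go to school at Stanford University, which is located in California.";
        String tagged = "I go to school at <ORGANIZATION>Stanford University</ORGANIZATION>, "
                + "which is located in <LOCATION>California</LOCATION>.";
        for (Span s : extract(tagged, original)) {
            System.out.println(s);
        }
    }
}
```

Searching with `indexOf(mention, cursor)` leaves the original string untouched, so the offsets are absolute and no running position has to be added back in.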

From source file:fire.NERDemo.java

public static void main(String[] args) throws Exception {

    String serializedClassifier = "C:\\Users\\DIPANAKR\\Desktop\\Satanu\\fire\\stanford-ner-2015-04-20\\stanford-ner-2015-04-20\\classifiers\\english.all.3class.distsim.crf.ser.gz";

    if (args.length > 0) {
        serializedClassifier = args[0];
    }

    AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier.getClassifier(serializedClassifier);

    /* For either a file to annotate or for the hardcoded text example, this
       demo file shows several ways to process the input, for teaching purposes.
    */

    if (args.length > 1) {

        /* For the file, it shows (1) how to run NER on a String, (2) how
           to get the entities in the String with character offsets, and
           (3) how to run NER on a whole file (without loading it into a String).
        */

        String fileContents = IOUtils.slurpFile(args[1]);
        List<List<CoreLabel>> out = classifier.classify(fileContents);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }

        System.out.println("---");
        out = classifier.classifyFile(args[1]);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }

        System.out.println("---");
        List<Triple<String, Integer, Integer>> list = classifier.classifyToCharacterOffsets(fileContents);
        for (Triple<String, Integer, Integer> item : list) {
            System.out.println(item.first() + ": " + fileContents.substring(item.second(), item.third()));
        }
        System.out.println("---");
        System.out.println("Ten best entity labelings");
        DocumentReaderAndWriter<CoreLabel> readerAndWriter = classifier.makePlainTextReaderAndWriter();
        classifier.classifyAndWriteAnswersKBest(args[1], 10, readerAndWriter);

        System.out.println("---");
        System.out.println("Per-token marginalized probabilities");
        classifier.printProbs(args[1], readerAndWriter);

        // -- This code prints out the first order (token pair) clique probabilities.
        // -- But that output is a bit overwhelming, so we leave it commented out by default.
        // System.out.println("---");
        // System.out.println("First Order Clique Probabilities");
        // ((CRFClassifier) classifier).printFirstOrderProbs(args[1], readerAndWriter);

    } else {

        /* For the hard-coded String, it shows how to run it on a single
           sentence, and how to do this and produce several formats, including
           slash tags and an inline XML output format. It also shows the full
           contents of the {@code CoreLabel}s that are constructed by the
           classifier. And it shows getting out the probabilities of different
           assignments and an n-best list of classifications with probabilities.
        */

        String[] example = { "Good afternoon Rajat Raina, how are you today?",
                "I go to school at Stanford University, which is located in California." };
        for (String str : example) {
            System.out.println(classifier.classifyToString(str));
        }
        System.out.println("---");

        for (String str : example) {
            // This one puts in spaces and newlines between tokens, so just print not println.
            System.out.print(classifier.classifyToString(str, "slashTags", false));
        }
        System.out.println("---");

        for (String str : example) {
            // This one is best for dealing with the output as a TSV (tab-separated column) file.
            // The first column gives entities, the second their classes, and the third the remaining text in a document
            System.out.print(classifier.classifyToString(str, "tabbedEntities", false));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyWithInlineXML(str));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyToString(str, "xml", true));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.print(classifier.classifyToString(str, "tsv", false));
        }
        System.out.println("---");

        // This gets out entities with character offsets
        int j = 0;
        for (String str : example) {
            j++;
            List<Triple<String, Integer, Integer>> triples = classifier.classifyToCharacterOffsets(str);
            for (Triple<String, Integer, Integer> trip : triples) {
                System.out.printf("%s over character offsets [%d, %d) in sentence %d.%n", trip.first(),
                        trip.second(), trip.third(), j);
            }
        }
        System.out.println("---");

        // This prints out all the details of what is stored for each token
        int i = 0;
        for (String str : example) {
            for (List<CoreLabel> lcl : classifier.classify(str)) {
                for (CoreLabel cl : lcl) {
                    System.out.print(i++ + ": ");
                    System.out.println(cl.toShorterString());
                }
            }
        }

        System.out.println("---");

    }
}
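The file-processing loops above print each token as `word/LABEL`, which leaves multi-token entities such as "Rajat Raina" split across tokens. The following is a minimal sketch of one way to merge them again from that printed form, assuming the background label is `O` and that tokens are whitespace-separated. The class `SlashTagGrouper` and its `group` method are hypothetical names introduced for illustration and are not part of Stanford NER. Note that two adjacent but distinct entities with the same label would be merged into one by this approach.

```java
import java.util.ArrayList;
import java.util.List;

public class SlashTagGrouper {

    /**
     * Groups consecutive tokens that share a non-background label into one entity.
     * Input looks like "Rajat/PERSON Raina/PERSON ,/O"; each result looks like "PERSON: Rajat Raina".
     */
    public static List<String> group(String slashTagged) {
        List<String> entities = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentLabel = "O";
        for (String token : slashTagged.trim().split("\\s+")) {
            // Split on the last slash so that words containing '/' keep their text intact.
            int slash = token.lastIndexOf('/');
            if (slash < 0) {
                continue; // not a word/LABEL token
            }
            String word = token.substring(0, slash);
            String label = token.substring(slash + 1);
            if (!label.equals(currentLabel)) {
                // The label changed, so any entity being built is complete.
                flush(entities, current, currentLabel);
                currentLabel = label;
            }
            if (!label.equals("O")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(word);
            }
        }
        flush(entities, current, currentLabel);
        return entities;
    }

    // Emits the entity accumulated so far, if any, and resets the buffer.
    private static void flush(List<String> out, StringBuilder current, String label) {
        if (current.length() > 0) {
            out.add(label + ": " + current);
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String line = "Good/O afternoon/O Rajat/PERSON Raina/PERSON ,/O how/O are/O you/O today/O ?/O";
        for (String entity : group(line)) {
            System.out.println(entity);
        }
    }
}
```

When the classifier object is available, `classifyToCharacterOffsets`, used later in the same demo, avoids this round trip through printed text.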

From source file:graphene.augment.snlp.NERDemo.java

License:Apache License

public static void main(String[] args) throws Exception {

    String serializedClassifier = "src/main/resources/edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz";

    if (args.length > 0) {
        serializedClassifier = args[0];
    }

    AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier.getClassifier(serializedClassifier);

    /*
     * For either a file to annotate or for the hardcoded text example, this
     * demo file shows two ways to process the output, for teaching
     * purposes. For the file, it shows both how to run NER on a String and
     * how to run it on a whole file. For the hard-coded String, it shows
     * how to run it on a single sentence, and how to do this and produce an
     * inline XML output format.
     */
    if (args.length > 1) {
        String fileContents = IOUtils.slurpFile(args[1]);
        List<List<CoreLabel>> out = classifier.classify(fileContents);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }
        System.out.println("---");
        out = classifier.classifyFile(args[1]);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }

    } else {
        String[] example = { "Good afternoon Rajat Raina, how are you today?",
                "I go to school at Stanford University, which is located in California." };
        for (String str : example) {
            System.out.println(classifier.classifyToString(str));
        }
        System.out.println("---");

        for (String str : example) {
            // This one puts in spaces and newlines between tokens, so just
            // print not println.
            System.out.print(classifier.classifyToString(str, "slashTags", false));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyWithInlineXML(str));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyToString(str, "xml", true));
        }
        System.out.println("---");

        int i = 0;
        for (String str : example) {
            for (List<CoreLabel> lcl : classifier.classify(str)) {
                for (CoreLabel cl : lcl) {
                    System.out.print(i++ + ": ");
                    System.out.println(cl.toShorterString());
                }
            }
        }
    }
}

From source file:nlidb.NLIDB.java

public static void main(String[] args) throws Exception {

    //String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";
    //String serializedClassifier = "nlidbtraining.ser.gz";
    String serializedClassifier = "classifiers/ner-phd-test.ser.gz";

    if (args.length > 0) {
        serializedClassifier = args[0];
    }

    AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier.getClassifier(serializedClassifier);

    /* For either a file to annotate or for the hardcoded text example,
       this demo file shows two ways to process the output, for teaching
       purposes.  For the file, it shows both how to run NER on a String
       and how to run it on a whole file.  For the hard-coded String,
       it shows how to run it on a single sentence, and how to do this
       and produce an inline XML output format.
    */
    if (args.length > 1) {
        String fileContents = IOUtils.slurpFile(args[1]);
        List<List<CoreLabel>> out = classifier.classify(fileContents);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }
        System.out.println("---");
        out = classifier.classifyFile(args[1]);
        for (List<CoreLabel> sentence : out) {
            for (CoreLabel word : sentence) {
                System.out.print(word.word() + '/' + word.get(CoreAnnotations.AnswerAnnotation.class) + ' ');
            }
            System.out.println();
        }

    } else {
        //String[] example = {"Good afternoon Rajat Raina, how are you today?",
        //                    "I go to school at Stanford University, which is located in California." };

        String[] example = { "which customer has the postcode CM13" };
        for (String str : example) {
            System.out.println(classifier.classifyToString(str));
        }
        System.out.println("---");

        for (String str : example) {
            // This one puts in spaces and newlines between tokens, so just print not println.
            System.out.print(classifier.classifyToString(str, "slashTags", false));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyWithInlineXML(str));
        }
        System.out.println("---");

        for (String str : example) {
            System.out.println(classifier.classifyToString(str, "xml", true));
        }
        System.out.println("---");

        int i = 0;
        for (String str : example) {
            for (List<CoreLabel> lcl : classifier.classify(str)) {
                for (CoreLabel cl : lcl) {
                    System.out.print(i++ + ": ");
                    System.out.println(cl.toShorterString());
                }
            }
        }
    }
}