Example usage for org.apache.poi.hwpf.extractor Word6Extractor getText

Introduction

This page shows an example of how org.apache.poi.hwpf.extractor Word6Extractor getText is used.

Prototype

public String getText() 

Usage

From source file: com.jaeksoft.searchlib.parser.DocParser.java

License: Open Source License

private void oldWordExtraction(ParserResultItem result, InputStream inputStream) throws IOException {
    Word6Extractor word6 = null;
    try {
        word6 = new Word6Extractor(inputStream);
        SummaryInformation si = word6.getSummaryInformation();
        if (si != null) {
            result.addField(ParserFieldEnum.title, si.getTitle());
            result.addField(ParserFieldEnum.author, si.getAuthor());
            result.addField(ParserFieldEnum.subject, si.getSubject());
        }

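        // getText() returns the extracted document text as a single string.
        String text = word6.getText();
        // Add each line of the text as its own content field.
        String[] frags = text.split("\\n");
        for (String frag : frags) {
            result.addField(ParserFieldEnum.content, StringUtils.replaceConsecutiveSpaces(frag, " "));
        }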
    } finally {
        IOUtils.close(word6);
    }
}
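
The post-processing applied to the result of getText() above (split on newlines, then normalize spacing in each fragment) can be tried without POI. The following is a minimal sketch using only the JDK. FragmentSplitter, collapseSpaces and toFragments are hypothetical names introduced for illustration, and collapseSpaces is only an assumed stand-in for the project's StringUtils.replaceConsecutiveSpaces, whose exact behavior is not shown in the source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of POI or of the original project.
class FragmentSplitter {

    // Assumed stand-in for StringUtils.replaceConsecutiveSpaces(frag, " "):
    // replaces every run of whitespace with a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Splits extracted text on newlines, as the example does with the
    // result of getText(), and normalizes each fragment.
    static List<String> toFragments(String text) {
        List<String> fragments = new ArrayList<>();
        for (String frag : text.split("\\n")) {
            fragments.add(collapseSpaces(frag));
        }
        return fragments;
    }
}
```

Note that String.split drops trailing empty strings, so a final newline in the text does not produce an empty last fragment, while blank lines in the middle of the text are kept as empty fragments.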