tika « index « Java Lucene Q&A

1. Indexing PDF with page numbers with Solr    stackoverflow.com

I'm indexing PDFs with Solr using the ExtractingRequestHandler. I would like to display the page number along with hits in a document, e.g. "term foo was found in bar.pdf on pages ...

2. How to parse & index different portions of an HTML page using Tika & Lucene?    stackoverflow.com

I have been trying to parse and index different portions of an HTML page using Lucene and Tika. For example, I would like to index the text within Title, H1, H2, A ...

3. Solr 3.1 doesn't index the file    stackoverflow.com

I have successfully configured Solr 3.1 with Apache Tika 0.9. I have not changed schema.xml (the default schema) or solrconfig.xml. I passed this command to the browser:

http://localhost:8080/solr/update/extract?literal.id=post1&commit=true%20-F%20%22myfile=@D:\code.txt%22
Output :
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">593</int>
</lst>
</response>
But whenever I search ...
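
One possible reading of the URL above is that curl's `-F "myfile=@..."` option was pasted into the query string, so the browser sends no file at all. A minimal sketch, assuming Java 11 or later and the same host, port, and `myfile` field name as in the question, of sending the file as a multipart POST body instead. The class and method names (`ExtractUpload`, `extractUri`, `multipartBody`) are hypothetical, and this is not confirmed as the fix for the original poster's setup.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Solr or SolrJ.
class ExtractUpload {
    // Only literal.id and commit belong in the query string; the file goes in the body.
    static URI extractUri(String solrBase, String id) {
        String query = "literal.id=" + URLEncoder.encode(id, StandardCharsets.UTF_8) + "&commit=true";
        return URI.create(solrBase + "/update/extract?" + query);
    }

    // Wraps the file bytes in a one-part multipart/form-data body, as curl -F would.
    static byte[] multipartBody(String boundary, String field, String fileName, byte[] content) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + field + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        String tail = "\r\n--" + boundary + "--\r\n";
        byte[] h = head.getBytes(StandardCharsets.UTF_8);
        byte[] t = tail.getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[h.length + content.length + t.length];
        System.arraycopy(h, 0, body, 0, h.length);
        System.arraycopy(content, 0, body, h.length, content.length);
        System.arraycopy(t, 0, body, h.length + content.length, t.length);
        return body;
    }

    public static void main(String[] args) throws Exception {
        Path file = Path.of(args[0]); // path to the document to index
        String boundary = "----solr-extract-" + System.nanoTime();
        HttpRequest request = HttpRequest.newBuilder(extractUri("http://localhost:8080/solr", "post1"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(multipartBody(
                        boundary, "myfile", file.getFileName().toString(), Files.readAllBytes(file))))
                .build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Running `java ExtractUpload.java D:\code.txt` would post the file to the handler, provided Solr is listening at that address.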

4. Doesn't index or extract documents (.pdf, .doc) remotely    stackoverflow.com

I have used Solr 3.1, Apache Tika 0.9 and SolrNet 0.3.1 to index documents such as .doc and .pdf files. I have successfully indexed and extracted documents locally using this code:

Startup.Init<Article>("http://k9server:8080/solr");
  ...

5. Indexing not working for Solr and Tika integration    stackoverflow.com

Hi, I am using curl to index an HTML document with Solr 3.1:

curl 'http://localhost:8080/solr1/update/extract?literal.id=tutorial.htm&commit=true' -F "myfile=@tutorial.html"

On submitting the request, I get this error: ERROR: unknown field 'ignored_link' description The request sent by ...

6. XML parser + Indexing data    stackoverflow.com

I need to index some XML documents with Lucene, but before that, I need to parse the XML and extract some information from inside the tags. The XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<tt ...
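
The parse-then-index step described above can be sketched with the JDK's built-in DOM parser. The snippet below is only an illustration under stated assumptions: the sample XML (`record`, `title`, `body`) is made up because the original document is truncated, and the class and method names (`XmlFieldExtractor`, `extractFields`) are hypothetical. Each entry of the returned map could then be added to a Lucene document as a field.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: collects tag name -> text for each element directly under the root.
class XmlFieldExtractor {
    static Map<String, String> extractFields(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and comments; keep only elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; the XML from the question is not reproduced here.
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<record><title>Hello</title><body>Some text</body></record>";
        extractFields(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

If two child elements share a tag name, the later one overwrites the earlier one in this sketch; a multi-valued map would be needed to keep both.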

7. Solr: index PDF documents and post them to a remote server    stackoverflow.com

Hi, I am a naive user when it comes to Solr. Please guide me on the following hurdles. 1) Solr: index PDF documents. Solution tried: I used tika-app 0.9.jar to extract the ...
