Java: converting PDF to TXT for full-text document search

Reader submission 556 2022-05-29

To do

https://pdfbox.apache.org/

https://stackoverflow.com/questions/18098400/how-to-get-raw-text-from-pdf-file-using-Java

https://stackoverflow.com/questions/50692771/multiple-pdf-file-to-txt-in-java

https://stackoverflow.com/questions/30570196/how-to-convert-pdf-into-text-file-using-itext-liberary

https://stackoverflow.com/questions/23813727/how-to-extract-text-from-a-pdf-file-with-apache-pdfbox

https://stackoverflow.com/questions/583615/pdf-to-text-tool-or-java-library

https://stackoverflow.com/questions/17986305/how-can-i-convert-pdf-file-to-word-file-using-java
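As a rough sketch of what the PDFBox links above describe, the snippet below pulls the text out of one PDF and writes it to a .txt file. It assumes PDFBox 2.x is on the classpath (in 3.x the loading call is `Loader.loadPDF` rather than `PDDocument.load`). The class name `PdfToText` and its method names are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class; assumes PDFBox 2.x on the classpath.
final class PdfToText {

    // Returns the text of all pages of the given PDF as one string.
    static String extract(File pdf) throws IOException {
        try (PDDocument doc = PDDocument.load(pdf)) {
            PDFTextStripper stripper = new PDFTextStripper();
            // Order text by its position on the page rather than by its
            // order in the content stream.
            stripper.setSortByPosition(true);
            return stripper.getText(doc);
        }
    }

    // Writes the extracted text to a UTF-8 encoded file.
    static void convert(File pdf, Path txt) throws IOException {
        Files.write(txt, extract(pdf).getBytes(StandardCharsets.UTF_8));
    }

    // Usage: java PdfToText input.pdf output.txt
    public static void main(String[] args) throws IOException {
        convert(new File(args[0]), Paths.get(args[1]));
    }
}
```

PDFBox performs no OCR, so a PDF that contains only scanned page images will produce little or no text this way.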

Lucene full-text search

https://www.toptal.com/database/full-text-search-of-dialogues-with-apache-lucene (https://github.com/dougsparling/lucene-testbed)

https://stackoverflow.com/questions/6807701/lucene-full-text-search

https://medium.com/@wkrzywiec/full-text-search-with-hibernate-search-lucene-part-1-e245b889aa8e (https://github.com/wkrzywiec/Library-Spring/tree/163fbbac65750b199cc665a2ba61fd4b80fc2ff6)

https://blog.csdn.net/forfuture1978/article/details/4711308

https://blog.csdn.net/yerenyuan_pku/article/details/72582979

https://blog.csdn.net/u014704496/article/details/40408387

https://www.baeldung.com/lucene-file-search (https://github.com/eugenp/tutorials/tree/master/lucene)

https://github.com/tantivy-search/tantivy

https://www.wave-access.com/public_en/blog/2014/october/02/full-text-search-by-using-apache-lucene.aspx
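The core idea behind the Lucene articles listed above is the inverted index: a map from each term to the documents that contain it. The following is only a toy illustration of that idea using the Java standard library. It is not Lucene's API or its implementation, and the class name `TinyIndex` and its methods are hypothetical. Tokenization here is a naive lowercase split on anything that is not a letter or digit, so it will not segment Chinese text into words.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for illustration only; not how Lucene is implemented.
final class TinyIndex {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lowercases the text and splits it on runs of non-letter, non-digit characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexes one document, e.g. the text previously extracted from a PDF.
    void add(String docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Returns the ids of documents that contain every term of the query (AND semantics).
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }
}
```

For example, after `add("a.txt", "Apache PDFBox extracts text")` and `add("b.txt", "Apache Lucene indexes text")`, a call to `search("apache text")` returns both ids, while `search("pdfbox lucene")` returns an empty set. Lucene adds analyzers, relevance scoring, and on-disk index segments on top of this basic structure.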

Extracting the table of contents (outline) from a PDF:

https://pdfbox.apache.org/docs/2.0.2/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html
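Starting from the `PDDocument` class linked above, a PDF's bookmarks (its outline) can be reached through the document catalog. The sketch below walks that tree and collects the titles, indented by nesting depth. It assumes PDFBox 2.x. The class name `PdfOutline` is hypothetical, and a PDF without bookmarks simply yields an empty list.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDDocumentOutline;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

// Hypothetical helper class; assumes PDFBox 2.x on the classpath.
final class PdfOutline {

    // Returns the bookmark titles in document order, indented two spaces per level.
    static List<String> titles(PDDocument doc) {
        List<String> out = new ArrayList<>();
        PDDocumentOutline outline = doc.getDocumentCatalog().getDocumentOutline();
        if (outline != null) { // null when the PDF has no bookmarks
            collect(outline, 0, out);
        }
        return out;
    }

    // Depth-first walk over the outline tree.
    private static void collect(PDOutlineNode node, int depth, List<String> out) {
        for (PDOutlineItem item : node.children()) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) {
                line.append("  ");
            }
            out.add(line.append(item.getTitle()).toString());
            collect(item, depth + 1, out);
        }
    }

    // Usage: java PdfOutline input.pdf
    public static void main(String[] args) throws IOException {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            titles(doc).forEach(System.out::println);
        }
    }
}
```

This only reads bookmarks stored in the PDF. A table of contents that exists solely as printed page text has to be recovered from the extracted text instead.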
