Download page on the official site: https://pdfbox.apache.org/download.html
Download pdfbox-app-3.0.3.jar
cd D:\pdfbox
Run the jar with no arguments to print the usage:
java -jar pdfbox-app-3.0.3.jar
Usage: pdfbox [COMMAND] [OPTIONS]
Commands:
  debug          Analyzes and inspects the internal structure of a PDF document
  decrypt        Decrypts a PDF document
  encrypt        Encrypts a PDF document
  decode         Writes a PDF document with all streams decoded
  export:images  Extracts the images from a PDF document
  export:xmp     Extracts the xmp stream from a PDF document
  export:text    Extracts the text from a PDF document
  export:fdf     Exports AcroForm form data to FDF
  export:xfdf    Exports AcroForm form data to XFDF
  import:fdf     Imports AcroForm form data from FDF
  import:xfdf    Imports AcroForm form data from XFDF
  overlay        Adds an overlay to a PDF document
  print          Prints a PDF document
  render         Converts a PDF document to image(s)
  merge          Merges multiple PDF documents into one
  split          Splits a PDF document into number of new documents
  fromimage      Creates a PDF document from images
  fromtext       Creates a PDF document from text
  version        Gets the version of PDFBox
  help           Display help information about the specified command.
See 'pdfbox help <command>' to read about a specific subcommand
Run java -jar pdfbox-app-3.0.3.jar debug to analyze and inspect the internal structure of a PDF document.
# Export the image on each page of a scanned PDF
java -jar pdfbox-app-3.0.3.jar export:images -prefix=test -i your_book.pdf
Writing image: test-1.jpg
Writing image: test-2.jpg
Writing image: test-3.png
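The exported file names in the output above carry page numbers without zero padding, so a plain alphabetical sort would put test-10.jpg before test-2.jpg. Below is a minimal Java sketch of sorting the names by page number before passing them to another command. It assumes the prefix-number.extension naming shown above; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImageOrder {
    // Matches the trailing number in names like "test-12.jpg" (assumed naming scheme).
    private static final Pattern NUMBER = Pattern.compile("-(\\d+)\\.[A-Za-z0-9]+$");

    // Returns the page number embedded in the name, or Integer.MAX_VALUE if there is none,
    // so that unnumbered names sort last.
    public static int pageNumber(String name) {
        Matcher m = NUMBER.matcher(name);
        return m.find() ? Integer.parseInt(m.group(1)) : Integer.MAX_VALUE;
    }

    // Returns a new list sorted by page number, falling back to the name for ties.
    public static List<String> sortByPage(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingInt(ImageOrder::pageNumber)
                .thenComparing(Comparator.<String>naturalOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = List.of("test-10.jpg", "test-2.jpg", "test-1.jpg", "test-3.png");
        // prints [test-1.jpg, test-2.jpg, test-3.png, test-10.jpg]
        System.out.println(sortByPage(names));
    }
}
```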
# Combine multiple images into a single PDF with fromimage
java -jar pdfbox-app-3.0.3.jar fromimage -o=book1.pdf -i=test-1.jpg -i=test-2.jpg -i=test-3.png -i=test-4.jpg
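The resulting book1.pdf looks poor, and the command-line length limit caps how many image files can be passed in one call (a scanned book usually runs to several hundred pages).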
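One possible way around the length limit is to build the PDF in batches and then join the parts with the merge command from the usage listing. Below is a minimal Java sketch that only constructs the command lines. It assumes the images are already in page order. The class name, the batch size, and the part-N.pdf file names are hypothetical, and the -i/-o options for merge are assumed to mirror those of fromimage, so check the output of java -jar pdfbox-app-3.0.3.jar help merge before relying on them.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchCommands {
    static final String JAR = "pdfbox-app-3.0.3.jar";

    // Splits the image list into consecutive batches of at most batchSize names.
    public static List<List<String>> batches(List<String> images, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < images.size(); i += batchSize) {
            out.add(images.subList(i, Math.min(i + batchSize, images.size())));
        }
        return out;
    }

    // Builds one fromimage command per batch, followed by a single merge command.
    public static List<List<String>> commands(List<String> images, int batchSize, String output) {
        List<List<String>> cmds = new ArrayList<>();
        List<String> merge = new ArrayList<>(List.of("java", "-jar", JAR, "merge"));
        List<List<String>> parts = batches(images, batchSize);
        for (int p = 0; p < parts.size(); p++) {
            String partName = "part-" + (p + 1) + ".pdf"; // hypothetical intermediate file name
            List<String> cmd = new ArrayList<>(
                    List.of("java", "-jar", JAR, "fromimage", "-o=" + partName));
            for (String image : parts.get(p)) {
                cmd.add("-i=" + image);
            }
            cmds.add(cmd);
            merge.add("-i=" + partName); // assumed merge option, verify with 'help merge'
        }
        merge.add("-o=" + output); // assumed merge option, verify with 'help merge'
        cmds.add(merge);
        return cmds;
    }

    public static void main(String[] args) {
        List<String> images = List.of(
                "test-1.jpg", "test-2.jpg", "test-3.png", "test-4.jpg", "test-5.jpg");
        // Print each command; a real batch size would be much larger than 2.
        for (List<String> cmd : commands(images, 2, "book1.pdf")) {
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

Each returned list can be handed to java.lang.ProcessBuilder as is, which avoids pasting a long line into the shell. This only works around the file-count limit; it does nothing for the visual quality of the result.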