Apache PDFBox
Developer(s) | Apache Software Foundation |
---|---|
Stable release | 1.8.6 / June 22, 2014 |
Repository | |
Written in | Java |
Operating system | Cross-platform |
Type | Portable Document Format (PDF) |
License | Apache License 2.0 |
Website | https://pdfbox.apache.org |
Apache PDFBox is a pure-Java library for creating, rendering, printing, splitting, merging, altering, and verifying PDF files, and for extracting their text and metadata.
Structure
Apache PDFBox has these components:
- PDFBox: the core library
- FontBox: handles font information
- JempBox: handles XMP metadata
- Preflight (optional): checks PDF files for PDF/A conformance
History
PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to extract text from PDF files for Lucene. It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[1]
Preflight was originally named PaDaF and was developed by Atos Worldline, which donated it to the project in 2011.[2]