Apache PDFBox

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Tilman at 19:15, 22 June 2014. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

PDFBox
Developer(s): Apache Software Foundation
Stable release: 1.8.6 / June 22, 2014
Written in: Java
Operating system: Cross-platform
Type: Portable Document Format (PDF)
License: Apache License 2.0
Website: https://pdfbox.apache.org

Apache PDFBox is a pure-Java library for creating, rendering, printing, splitting, merging, altering, and verifying PDF files, and for extracting their text and metadata.

Structure

Apache PDFBox has these components:

  • PDFBox: the main part
  • FontBox: handles font information
  • JempBox: handles XMP metadata
  • Preflight (optional): checks PDF files for PDF/A conformance

History

PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to extract text from PDF files for Lucene. It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[1]

Preflight, originally named PaDaF, was developed by Atos Worldline and donated to the project in 2011.[2]

References