
Apache PDFBox: Difference between revisions

From Wikipedia, the free encyclopedia
Added {{notability}} and {{third-party}} tags to article (TW)
Line 27: Line 27:


==History==
PDFBox was started in 2002 in [[SourceForge]] by Ben Litchfield who wanted to be able to extract text of PDF files for [[Lucene]]. It became an [[Apache Incubator]] project in 2008, and an Apache top level project in 2009. <ref>[https://incubator.apache.org/projects/pdfbox.html PDFBox Project Incubation Status]</ref>
PDFBox was started in 2002 in [[SourceForge]] by Ben Litchfield who wanted to be able to extract text of PDF files for [[Lucene]].<ref>[http://www.h-online.com/open/news/item/Apache-PDFBox-and-FontBox-1-0-0-released-932436.html Apache PDFBox and FontBox 1.0.0 released], The H Open, 16 February 2010</ref> It became an [[Apache Incubator]] project in 2008, and an Apache top level project in 2009. <ref>[https://incubator.apache.org/projects/pdfbox.html PDFBox Project Incubation Status]</ref>


Preflight was originally named PaDaF and developed by [[Atos|Atos worldline]], and donated to the project in 2011.<ref>[https://incubator.apache.org/ip-clearance/pdfbox-padaf.html PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status]</ref>

Revision as of 05:49, 23 June 2014

PDFBox
Developer(s): Apache Software Foundation
Stable release: 1.8.6 / June 22, 2014
Written in: Java
Operating system: Cross-platform
Type: Portable Document Format (PDF)
License: Apache License 2.0
Website: https://pdfbox.apache.org

Apache PDFBox is a pure-Java library for creating, rendering, printing, splitting, merging, altering, and verifying PDF files, and for extracting their text and metadata.

Structure

Apache PDFBox has these components:

  • PDFBox: the main part
  • FontBox: handles font information
  • JempBox: handles XMP metadata
  • Preflight (optional): checks PDF files for PDF/A conformity.

History

PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to extract text from PDF files for Lucene.[1] It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[2]

Preflight, originally named PaDaF, was developed by Atos Worldline and donated to the project in 2011.[3]

See also

References

[1] "Apache PDFBox and FontBox 1.0.0 released", The H Open, 16 February 2010. http://www.h-online.com/open/news/item/Apache-PDFBox-and-FontBox-1-0-0-released-932436.html
[2] "PDFBox Project Incubation Status". https://incubator.apache.org/projects/pdfbox.html
[3] "PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status". https://incubator.apache.org/ip-clearance/pdfbox-padaf.html

External links