Apache PDFBox
Developer(s) | Apache Software Foundation |
---|---|
Stable release | 1.8.6 / June 22, 2014 |
Written in | Java |
Operating system | Cross-platform |
Type | Portable Document Format (PDF) |
License | Apache License 2.0 |
Website | https://pdfbox.apache.org |
Apache PDFBox is an open-source, pure-Java library for working with PDF files. It can create, render, print, split, merge, modify, and validate PDF documents, and extract their text and metadata.
Ohloh reports almost 2,000 commits by 17 contributors since the project's start at Apache, representing more than 100,000 lines of code. It characterizes PDFBox as having a well-established, mature codebase maintained by a large development team with increasing year-over-year commits.[1]
Using the COCOMO model, Ohloh estimates the development effort at about 30 person-years.[2]
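The arithmetic behind such an estimate can be sketched as follows. This is a minimal illustration, not Ohloh's actual calculation: it assumes the textbook Basic COCOMO formula in organic mode (effort in person-months = 2.4 × KLOC^1.05) and uses the rounded figure of 100,000 lines as input. The class and method names are hypothetical.

```java
import java.util.Locale;

// Basic COCOMO effort estimate: effort = A * KLOC^B, in person-months.
// ASSUMPTION: organic-mode coefficients A = 2.4 and B = 1.05 from the textbook
// model; the article does not say which coefficients Ohloh applied.
class CocomoEstimate {
    static final double A = 2.4;
    static final double B = 1.05;

    // Estimated effort in person-months for a code base of `kloc` thousand lines.
    static double personMonths(double kloc) {
        if (kloc <= 0) {
            throw new IllegalArgumentException("kloc must be positive");
        }
        return A * Math.pow(kloc, B);
    }

    // The same estimate expressed in person-years (12 person-months each).
    static double personYears(double kloc) {
        return personMonths(kloc) / 12.0;
    }

    public static void main(String[] args) {
        // ASSUMPTION: 100 KLOC, rounded down from "more than 100,000 lines of code".
        double kloc = 100.0;
        System.out.printf(Locale.ROOT,
                "%.0f KLOC: %.1f person-months, %.1f person-years%n",
                kloc, personMonths(kloc), personYears(kloc));
    }
}
```

Under these assumptions, 100,000 lines correspond to roughly 25 person-years, the same order of magnitude as the reported figure. Because the exponent is greater than 1, a somewhat larger line count raises the estimate more than proportionally.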
Structure
Apache PDFBox has these components:
- PDFBox: the main part
- FontBox: handles font information
- JempBox: handles XMP metadata
- Preflight (optional): checks PDF files for PDF/A conformance
History
PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to extract text from PDF files for indexing with Lucene.[3] It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[4]
Preflight was originally named PaDaF and was developed by Atos Worldline, which donated it to the project in 2011.[5]
References
1. "The Apache PDFBox Open Source Project on Ohloh". Ohloh.net. Retrieved 2014-06-25.
2. "Ohloh Estimated development cost". Ohloh.net. Retrieved 2014-06-25.
3. "Apache PDFBox and FontBox 1.0.0 released". The H Open. 16 February 2010.
4. "PDFBox Project Incubation Status".
5. "PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status".