
Apache PDFBox: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
it's not clear what Y-O-Y means
update version, links; PDFA partnership
Line 5: Line 5:
| collapsible = yes
| collapsible = yes
| developer = [[Apache Software Foundation]]
| developer = [[Apache Software Foundation]]
| latest release version = 1.8.8
| latest release version = 1.8.9
| latest release date = {{release date and age|2014|12|13}}
| latest release date = {{release date and age|2015|03|28}}
| latest preview version =
| latest preview version =
| latest preview date =
| latest preview date =
Line 19: Line 19:
'''Apache PDFBox''' is an open source pure-[[Java (software platform)|Java]] library that can be used to create, render, print, split, merge, alter, verify and extract text and meta-data of [[PDF]] files.
'''Apache PDFBox''' is an open source pure-[[Java (software platform)|Java]] library that can be used to create, render, print, split, merge, alter, verify and extract text and meta-data of [[PDF]] files.


[[Ohloh]] reports over 2,000 commits (since the start as an Apache project) by 17 contributors representing more than 100,000 lines of code. PDFBox has a well established, mature codebase maintained by an average size development team with increasing [[Year Over Year|Y-O-Y]] commits. <ref>{{cite web|author=&nbsp;|url=https://www.ohloh.net/p/pdfbox/ |title=The Apache PDFBox Open Source Project on Ohloh |publisher=Ohloh.net |date=2014-06-25 |accessdate=2014-06-25}}</ref>
[[Open Hub]] reports over 3,000 commits (since the start as an Apache project) by 17 contributors representing more than 100,000 lines of code. PDFBox has a well established, mature codebase maintained by an average size development team with increasing [[Year Over Year|Y-O-Y]] commits. Using the [[COCOMO]] model, it took an estimated 31 [[person-year]]s of effort. <ref>{{cite web|author=&nbsp;|url=https://www.openhub.net/p/pdfbox/ |title=The Apache PDFBox Open Source Project on Open Hub |publisher=openhub.net |date=2015-05-13 |accessdate=2015-05-13}}</ref>

Using the [[COCOMO]] model, it took an estimated 30 [[person-year]]s of effort.<ref>{{cite web|author=&nbsp; |url=https://www.ohloh.net/p/pdfbox/estimated_cost |title=Ohloh Estimated development cost |publisher=Ohloh.net |date=2014-06-25 |accessdate=2014-06-25}}</ref>


==Structure==
==Structure==
Line 28: Line 26:
* FontBox: handles font information
* FontBox: handles font information
* JempBox: handles [[Extensible Metadata Platform|XMP metadata]]
* JempBox: handles [[Extensible Metadata Platform|XMP metadata]]
* Preflight (optional): checks PDF files for PDF/A conformity.
* Preflight (optional): checks PDF files for [[PDF/A]]-1b conformity.


==History==
==History==
Line 34: Line 32:


Preflight was originally named PaDaF and developed by [[Atos|Atos worldline]], and donated to the project in 2011.<ref>[https://incubator.apache.org/ip-clearance/pdfbox-padaf.html PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status]</ref>
Preflight was originally named PaDaF and developed by [[Atos|Atos worldline]], and donated to the project in 2011.<ref>[https://incubator.apache.org/ip-clearance/pdfbox-padaf.html PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status]</ref>

In February 2015, Apache PDFBox was named an Open Source Partner Organization of the [[PDF Association]]. <ref>[http://www.pdfa.org/news/apache-pdfbox-named-an-open-source-partner-organization-of-the-pdf-association/ Apache™ PDFBox™ named an Open Source Partner Organization of the PDF Association], February 3, 2015</ref>


== See also ==
== See also ==

Revision as of 16:39, 13 May 2015

PDFBox
Developer(s): Apache Software Foundation
Stable release: 1.8.9 / March 28, 2015
Written in: Java
Operating system: Cross-platform
Type: Portable Document Format (PDF)
License: Apache License 2.0
Website: https://pdfbox.apache.org

Apache PDFBox is an open source pure-Java library that can be used to create, render, print, split, merge, alter, verify and extract text and meta-data of PDF files.

Open Hub reports over 3,000 commits (since the start as an Apache project) by 17 contributors representing more than 100,000 lines of code. PDFBox has a well-established, mature codebase maintained by an average-sized development team with increasing year-over-year commits. Using the COCOMO model, the codebase represents an estimated 31 person-years of effort.[1]
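The COCOMO figure can be roughly reproduced from the line count. The sketch below applies the basic COCOMO "organic" formula, effort = 2.4 × KLOC^1.05 person-months. Using these particular coefficients is an assumption, since the article does not state which COCOMO variant or parameters Open Hub applies. The class and method names are illustrative only.

```java
import java.util.Locale;

public class CocomoEstimate {
    // Basic COCOMO "organic" coefficients; assumed here, not confirmed by the article.
    static final double A = 2.4;
    static final double B = 1.05;

    /** Estimated effort in person-months for a code size given in thousands of lines. */
    static double personMonths(double kloc) {
        return A * Math.pow(kloc, B);
    }

    /** Same estimate converted to person-years (12 person-months per year). */
    static double personYears(double kloc) {
        return personMonths(kloc) / 12.0;
    }

    public static void main(String[] args) {
        // 100 KLOC is the lower bound quoted above ("more than 100,000 lines of code").
        double kloc = 100.0;
        System.out.println(String.format(Locale.ROOT,
                "%.0f KLOC -> %.1f person-years", kloc, personYears(kloc)));
    }
}
```

Under these assumptions, 100,000 lines come out at roughly 25 person-years, and the quoted 31 person-years would correspond to a codebase of about 120,000 lines, which is consistent with "more than 100,000 lines of code".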

Structure

Apache PDFBox has these components:

  • PDFBox: the main part
  • FontBox: handles font information
  • JempBox: handles XMP metadata
  • Preflight (optional): checks PDF files for PDF/A-1b conformity

History

PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to be able to extract text from PDF files for Lucene.[2] It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[3]

Preflight, originally named PaDaF, was developed by Atos Worldline and donated to the project in 2011.[4]

In February 2015, Apache PDFBox was named an Open Source Partner Organization of the PDF Association.[5]

See also

References

  1. "The Apache PDFBox Open Source Project on Open Hub". openhub.net. 2015-05-13. Retrieved 2015-05-13.
  2. Apache PDFBox and FontBox 1.0.0 released, The H Open, 16 February 2010
  3. PDFBox Project Incubation Status
  4. PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status
  5. Apache™ PDFBox™ named an Open Source Partner Organization of the PDF Association, February 3, 2015