PDF/A

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 134.99.112.66 (talk) at 13:55, 18 October 2011 (External links). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

PDF/A
Filename extension: .pdf
Type code: 'PDF ' (including a single space)
Magic number: %PDF
Developed by: ISO
Initial release: 2005
Extended from: PDF
Standard: ISO 19005-1:2005[1], ISO 19005-2:2011[2]

PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for the digital preservation of electronic documents.

PDF/A differs from PDF by omitting features ill-suited to long-term archiving, such as font linking (as opposed to font embedding). (Similarly, the PDF/X file format is specially adapted to digital printing and graphic arts.)

The ISO requirements for PDF/A file viewers include color management guidelines, support for embedded fonts, and a user interface for reading embedded annotations.

PDF/A-1 is based on the PDF Reference Version 1.4 from Adobe Systems Inc. (implemented in Adobe Acrobat 5 and later versions) and is defined by ISO 19005-1:2005, an ISO standard published on October 1, 2005 under the formal name Document management -- Electronic document file format for long-term preservation -- Part 1: Use of PDF 1.4 (PDF/A-1).[1]

PDF/A-2 is based on PDF 1.7 (ISO 32000-1) and is defined by ISO 19005-2:2011, published on June 20, 2011 under the formal name Document management -- Electronic document file format for long-term preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2).[2] As of 2011, PDF/A-2 is a recent standard and is not yet widely used.

Description

The standard does not define an archiving strategy or the goals of an archiving system. It identifies a "profile" for electronic documents that ensures the documents can be reproduced in exactly the same way in years to come. A key element of this reproducibility is the requirement that PDF/A documents be fully self-contained: all of the information necessary for displaying the document in the same manner every time is embedded in the file. This includes, but is not limited to, all content (text, raster images and vector graphics), fonts, and color information. A PDF/A document is not permitted to rely on information from external sources (e.g. font programs and hyperlinks).

Other key elements to PDF/A compatibility include:[3][4][5]

  • Audio and video content are forbidden.
  • JavaScript and executable file launches are forbidden.
  • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering. This also applies to the so-called PostScript standard fonts such as Times or Helvetica.
  • Color spaces must be specified in a device-independent manner.
  • Encryption is forbidden.
  • Use of standards-based metadata is mandated.
  • External content references are forbidden.
  • LZW and JPEG 2000 image compression are forbidden in PDF/A-1, but JPEG 2000 compression is allowed in PDF/A-2.
  • Transparent objects and layers (Optional Content Groups) are forbidden in PDF/A-1, but they are supported in PDF/A-2.
  • Provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard are supported in PDF/A-2.
  • Embedded files are forbidden in PDF/A-1, but PDF/A-2 offers the possibility to embed PDF/A files, allowing archiving of sets of documents in a single file.
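The restrictions above can be read as a small lookup table. The following is only a rough sketch: the feature labels and function names are hypothetical and are not identifiers from ISO 19005, and LZW compression is left out because the list states its status for PDF/A-1 only.

```python
# Hypothetical feature labels chosen for illustration; not taken from ISO 19005.
FORBIDDEN_IN_ALL_PARTS = {
    "audio_video",
    "javascript",
    "executable_launch",
    "encryption",
    "external_content_reference",
}

# Features the list describes as forbidden in PDF/A-1 but permitted in PDF/A-2.
FORBIDDEN_IN_PART_1_ONLY = {
    "jpeg2000",
    "transparency",
    "layers",
    "embedded_pdfa_file",
}


def is_forbidden(feature: str, part: int) -> bool:
    """Return True if the feature is forbidden in the given PDF/A part (1 or 2)."""
    if part not in (1, 2):
        raise ValueError(f"unsupported PDF/A part: {part}")
    if feature in FORBIDDEN_IN_ALL_PARTS:
        return True
    if feature in FORBIDDEN_IN_PART_1_ONLY:
        return part == 1
    raise KeyError(f"feature not covered by this sketch: {feature}")


def violations(features, part: int):
    """Return the sorted subset of features that the given part forbids."""
    return sorted(f for f in features if is_forbidden(f, part))


used = {"jpeg2000", "transparency", "javascript"}
print(violations(used, 1))
print(violations(used, 2))
```

A real validator inspects the PDF object structure rather than a set of labels; the sketch only mirrors how the two parts differ according to the list.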

Conformance levels and versions

PDF/A-1

The standard specifies two levels of compliance for PDF files:

  • PDF/A-1a - Level A compliance in Part 1
  • PDF/A-1b - Level B compliance in Part 1

PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document. PDF/A-1a includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being "tagged"/"Tagged PDF"), with the objective of ensuring that document content can be searched and repurposed. PDF/A-1a also requires Unicode character maps.

The requirements for Level A conformance place greater responsibilities on writers preparing conforming files, but these requirements allow for a higher level of document preservation service and confidence over time. Level A conformance also facilitates the accessibility of conforming files for physically impaired users.

According to the specification, the following short names are recommended for referring to ISO 19005-1:2005 when the full ISO name is not used:

  • PDF/A – a synonym for the ISO 19005 family of standards
  • PDF/A-1 – a synonym for ISO 19005-1
  • PDF/A-1a – a synonym for ISO 19005-1 Level A conformance
  • PDF/A-1b – a synonym for ISO 19005-1 Level B conformance

PDF/A-2

PDF/A-2 is the second part of the standard. PDF/A-2 addresses some of the new features added with versions 1.5, 1.6 and 1.7 of the PDF Reference. PDF/A-2 should be backwards compatible, i.e. all valid PDF/A-1 documents should also be compliant with PDF/A-2. However, PDF/A-2 compliant files will not necessarily be PDF/A-1 compliant.

Part 2 of the PDF/A standard is based on a more recent version of PDF, PDF 1.7 (ISO 32000-1), rather than PDF 1.4, and offers a number of new features: JPEG 2000 image compression; support for transparency effects and layers; embedding of OpenType fonts; provisions for digital signatures in accordance with the PAdES (PDF Advanced Electronic Signatures) standard; and the possibility to embed PDF/A files in a PDF/A-2 file, so that sets of documents can be archived as individual documents in a single file.[4]

Identification

A PDF/A document can be identified as such through PDF/A-specific metadata located in the "http://www.aiim.org/pdfa/ns/id/" namespace. However, claiming to be PDF/A and actually being PDF/A are not necessarily the same:

  • A PDF document can be PDF/A-compliant, except for its lack of PDF/A metadata. This may happen for instance with documents that were generated before the definition of the PDF/A standard, by authors aware of features that present long-term preservation issues.
  • A PDF document can be identified as PDF/A, but may incorrectly contain PDF features not allowed in PDF/A; hence, documents which claim to be PDF/A-compliant should be tested for PDF/A compliance.
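As a rough illustration of reading this identification metadata, the sketch below parses an XMP packet and reports the claimed PDF/A part and conformance level. The property names pdfaid:part and pdfaid:conformance come from the PDF/A identification schema rather than from the text above, the sample packet is invented, and extracting the XMP packet from a PDF file is assumed to have been done already. The result is only the claim made by the file, not a compliance check.

```python
import xml.etree.ElementTree as ET

PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Invented sample packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def claimed_pdfa_level(xmp_xml: str):
    """Return e.g. 'PDF/A-1b' if the XMP claims PDF/A, otherwise None."""
    root = ET.fromstring(xmp_xml)
    part_tag = f"{{{PDFAID_NS}}}part"
    conf_tag = f"{{{PDFAID_NS}}}conformance"
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as child elements or as attributes.
        part = desc.findtext(part_tag) or desc.get(part_tag)
        conf = desc.findtext(conf_tag) or desc.get(conf_tag)
        if part:
            return f"PDF/A-{part.strip()}{(conf or '').strip().lower()}"
    return None


print(claimed_pdfa_level(SAMPLE_XMP))
```

A file for which this returns a value should still be run through a PDF/A validator, for the reasons given in the second bullet above.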

Drawbacks

As a PDF/A document must embed all fonts that it uses, a PDF/A file will often be bigger than an equivalent PDF file that does not have the fonts embedded. This may be undesirable when archiving large numbers of small files that all use the same fonts, since a separate copy of each font will be embedded in each file.
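The size effect can be estimated with simple arithmetic. The numbers below are hypothetical and chosen only to show the scale; actual font sizes vary widely.

```python
def embedded_font_overhead(num_files: int, font_sizes_bytes: list[int]) -> int:
    """Bytes of font data stored when every file embeds its own copy of each font."""
    return num_files * sum(font_sizes_bytes)


# Hypothetical archive: 10,000 small documents that all embed the same two fonts.
fonts = [60_000, 45_000]  # assumed embedded sizes in bytes
per_file = embedded_font_overhead(1, fonts)
total = embedded_font_overhead(10_000, fonts)
print(per_file, total)  # 105000 1050000000
```

Under these assumptions, about 1 GB of the archive consists of repeated copies of roughly 105 KB of font data.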

The use of transparency is forbidden in PDF/A-1. The majority of PDF generation tools that allow for PDF/A document compliance, such as the PDF export in OpenOffice.org or PDF export tool in Microsoft Office 2007 suites, will also make any transparent images in a given document non-transparent. That restriction was removed in PDF/A-2.[3]

Background

PDF/A began as a joint activity between NPES (The Association for Suppliers of Printing, Publishing and Converting Technologies) and the Association for Information and Image Management (AIIM) to develop an international standard that defines the use of the Portable Document Format (PDF) for archiving and preserving documents. The goal was to address the growing need to archive documents electronically in a way that would preserve their contents over an extended period of time, and would further ensure that those documents could be retrieved and rendered with a consistent and predictable result in the future. This need exists in a growing number of international government and industry segments, including legal systems, libraries, newspapers, regulated industries, and others.

References

  1. ISO (2005). "ISO 19005-1:2005 - Document management -- Electronic document file format for long-term preservation -- Part 1: Use of PDF 1.4 (PDF/A-1)". Retrieved 2011-07-06.
  2. ISO (2011-06-20). "ISO 19005-2:2011 - Document management -- Electronic document file format for long-term preservation -- Part 2: Use of ISO 32000-1 (PDF/A-2)". Retrieved 2011-07-06.
  3. "PDF/A – A Look at the Technical Side". Retrieved 2011-07-06.
  4. "PDF/A-2 Standard Published by ISO! The New Standard Includes Great Technical Enhancements". 2011-07-01. Retrieved 2011-07-06.
  5. "Frequently Asked Questions (FAQs) - ISO 19005-1:2005 - PDF/A-1" (PDF). 2006-07-10. Retrieved 2011-07-06.