The technical side of the PDF/A standard

PDF/A in a Nutshell 2.0 – PDF for long-term archiving

After the first part of PDF/A was published, two more parts arrived. These are not replacements for part 1, however; rather, they offer additional options for archiving PDF documents. All existing PDF/A files remain fully valid.

PDF/A-1: The first archiving standard

PDF/A-1 is based on PDF version 1.4, which first appeared in 2001. All resources (images, graphics, fonts) must be embedded within the PDF/A document itself. A PDF/A file must specify colour precisely and platform-independently using ICC profiles, and must store its document metadata as XMP. Transparent elements, some forms of compression (LZW, JPEG2000), PDF layers, JavaScript and certain actions are forbidden. A PDF/A file must not be password-protected. PDF/A-1 expressly supports embedded digital signatures and the use of hyperlinks.
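The XMP metadata is also where a file states which PDF/A part and conformance level it claims to meet, using the `pdfaid:part` and `pdfaid:conformance` properties. The following is a minimal sketch of reading that claim with the Python standard library. The function name `read_pdfa_claim` and the sample packet are illustrative assumptions, and the sketch only reports what the file declares; it does not validate the file against the standard.

```python
import xml.etree.ElementTree as ET

# Namespaces of the PDF/A identification schema and of RDF, as used in XMP.
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def read_pdfa_claim(xmp_packet):
    """Return the declared (part, conformance) pair, or None if absent.

    Hypothetical helper: it reads the claim only and performs no validation.
    """
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes or as child elements.
        part = desc.get(f"{{{PDFAID_NS}}}part") or desc.findtext(
            f"{{{PDFAID_NS}}}part"
        )
        conformance = desc.get(f"{{{PDFAID_NS}}}conformance") or desc.findtext(
            f"{{{PDFAID_NS}}}conformance"
        )
        if part:
            return int(part), (conformance or "").upper()
    return None


# Illustrative packet, trimmed to the PDF/A identification properties.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(read_pdfa_claim(SAMPLE_XMP))  # (1, 'B')
```

Extracting the XMP packet from the PDF file itself is left out here, since that step depends on the PDF library in use.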

PDF/A-2: Based on PDF 1.7

PDF/A-2 was published in 2011 as “ISO 19005-2”. It is based on PDF version 1.7, which has since been standardised as “ISO 32000-1”, and makes use of that version’s new features. PDF/A-2 therefore allows JPEG2000 compression, transparent elements and PDF layers. It also allows OpenType fonts to be embedded and supports digital signatures that comply with PAdES (PDF Advanced Electronic Signatures). One particularly important innovation is the “container” function: PDF/A files can be embedded within a PDF/A-2 document.

PDF/A-3: One more feature

PDF/A-3 has been available since October 2012. A PDF/A-3 document can embed files in any format, not just PDF/A documents. For example, a PDF/A-3 file can contain the original file from which it was generated. The PDF/A standard does not regulate whether these embedded files are themselves suitable for archiving.
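The differences between the three parts can be summarised as a small lookup table. The sketch below is illustrative only: the feature names and the function `lowest_part_for` are hypothetical, the table covers just the features mentioned in the text, and it assumes that PDF/A-3 permits everything PDF/A-2 does and that PDF/A-1 permits no embedded files.

```python
# Features mentioned in the text, mapped to the PDF/A parts that permit them.
# Assumption: PDF/A-3 inherits every permission of PDF/A-2.
ALLOWED_IN = {
    "transparency": {2, 3},
    "jpeg2000": {2, 3},
    "layers": {2, 3},
    "embedded_pdfa_files": {2, 3},
    "embedded_arbitrary_files": {3},
    "encryption": set(),  # password protection is not allowed in any part
}


def lowest_part_for(features):
    """Return the lowest PDF/A part (1, 2 or 3) that permits all features.

    Returns None if no part permits them. An unknown feature name raises
    KeyError, because the table above is deliberately not exhaustive.
    """
    for part in (1, 2, 3):
        if all(part in ALLOWED_IN[name] for name in features):
            return part
    return None


print(lowest_part_for([]))                                      # 1
print(lowest_part_for(["transparency", "layers"]))              # 2
print(lowest_part_for(["layers", "embedded_arbitrary_files"]))  # 3
print(lowest_part_for(["encryption"]))                          # None
```

A table like this can help when choosing a target part for a conversion, but it is no substitute for validating the resulting file.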

Conformance levels: A, B, U

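The conformance levels describe how much of a document is guaranteed to be preserved: its visual appearance only, or also its text and logical structure. Which level can be achieved depends on the input material and the document’s purpose.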

  • Level A (Accessible) meets all requirements of the standard, including those for the document’s logical structure and correct reading order. Text must be extractable, the logical structure must match the natural reading order, and the fonts used must meet stringent requirements. This level can usually only be achieved by converting born-digital documents.
  • Level B (Basic) guarantees that the visual appearance of the document can be unambiguously reproduced. Level B files are easier to create than Level A files, but Level B guarantees neither complete text extraction or searchability nor that the content can be reused without problems. Scanned paper documents can usually be converted to PDF/A Conformance Level B without any extra work.
  • Level U (Unicode) was introduced along with PDF/A-2. It expands Conformance Level B to specify that all text can be mapped to standard Unicode character codes.

Nomenclature: the PDF/A part and the conformance level are simply written one after the other. A PDF/A-1b file, for example, is a PDF file for long-term archiving that conforms to the first part of the standard (1) and whose content can be visually reproduced (Level B).
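The naming scheme is regular enough to be parsed mechanically. Here is a minimal sketch, assuming designations of the form "PDF/A-" followed by a part number and a level letter. The names `PdfaDesignation`, `parse_designation` and `describe` are hypothetical, and the table covers only the three parts and three levels described above.

```python
import re
from typing import NamedTuple


class PdfaDesignation(NamedTuple):
    part: int   # 1, 2 or 3
    level: str  # "a", "b" or "u"


# Level U was introduced with PDF/A-2, so it is not valid for part 1.
LEVELS_BY_PART = {1: {"a", "b"}, 2: {"a", "b", "u"}, 3: {"a", "b", "u"}}
LEVEL_NAMES = {"a": "Accessible", "b": "Basic", "u": "Unicode"}

_PATTERN = re.compile(r"PDF/A-(\d)([a-z])", re.IGNORECASE)


def parse_designation(text):
    """Split a designation such as 'PDF/A-2u' into part and level."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a PDF/A designation: {text!r}")
    part = int(match.group(1))
    level = match.group(2).lower()
    if level not in LEVELS_BY_PART.get(part, set()):
        raise ValueError(f"level {level!r} is not defined for part {part}")
    return PdfaDesignation(part, level)


def describe(designation):
    """Render a designation as a short human-readable phrase."""
    name = LEVEL_NAMES[designation.level]
    return f"PDF/A part {designation.part}, Level {designation.level.upper()} ({name})"


print(describe(parse_designation("PDF/A-1b")))  # PDF/A part 1, Level B (Basic)
print(describe(parse_designation("PDF/A-2u")))  # PDF/A part 2, Level U (Unicode)
```

A string such as "PDF/A-1u" is rejected with a `ValueError`, because Level U does not exist in the first part of the standard.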


About Alexandra Oettler

Alexandra Oettler is the (co-)author of the books PDF/A kompakt and PDF/A kompakt 2.0.
