PDF/A 101 – Introduction to PDF/A

PDF/A is a new PDF-based standard designed to guarantee that documents can be reproduced reliably, whether they need to be archived or printed in large volumes. How do you make your documents compatible with this new standard?

Using PDF (Portable Document Format)

Ever since its launch by Adobe in 1993, PDF has proved to be the most user-friendly format for managing electronic documents of any kind, from accounting documents, maps and designs to books. Easy interchange, built-in search functions, free viewers (not only from Adobe), easy migration and legibility across operating systems (Microsoft Windows, Apple, Unix, Linux, Sun etc.) are just some of the reasons why PDF has become the “de facto” standard worldwide for exchanging electronic documents.

The majority of document management, ERP and image management systems now have their own internal layout systems with native output in PDF. PDF files are compact, easy to archive, and readable on virtually any system by anyone.

However, there are still some areas of document management where PDF has not yet become the dominant format: high-volume printing, where AFP is still the benchmark format, and scanning, where TIFF still plays an important role.

On the other hand, as a result of technological progress, most companies which need to manage their own documents are becoming independent when it comes to creating PDF documents. This also makes them independent in terms of managing these documents, whether this involves long-term archiving or sending the documents electronically to the recipients.

This is also the trend for the future. It is no coincidence that hardware vendors have been adapting to this “de facto” standard for a few years now. The result has been high-volume printers whose management software accepts PDF as a native input format alongside AFP, high-volume scanners which generate PDF files directly, and other similar developments.

Why use PDF/A?

Can we say with certainty that PDF is the right document management standard to meet every need? The reality is not that simple. PDF offers a host of benefits, but is not, unfortunately, a standard. Anyone who has worked with PDF knows that the format by itself does not guarantee that a document can be displayed, archived long term or printed in the correct form.

Font handling, colour profile handling and the document’s actual dimensions are all variables which are difficult to control through PDF alone. The continuing growth of PDF’s capabilities (features for managing layers, embedding images, video clips etc.) means that a file’s compatibility, and its ability to be displayed properly over time, are not guaranteed.

In the particular case of high-volume printing (which is still a fundamental part of the document management process), characters or images that are not handled properly, or colours that are not managed correctly, mean that the output sometimes does not match what the document’s creator intended. Original fonts may also be substituted by others. These are just some of the problems which people managing PDF have had to resolve.

Typical examples include documents sent to print where some characters disappear, letterheads which overlap because the viewer has substituted a font when displaying the file on screen, documents with abnormal dimensions, and files ported from Windows to Mac, Linux, AS400 etc., where font handling is still an issue today. These are a few of the problems with using PDF which remain unresolved at the moment.

But what is PDF/A?

PDF/A attempts to provide a solution to all these problems, and its goal can be summarised in a single statement: the certainty of reproducing the document correctly, irrespective of the hardware and software used. To this end, the international community created an ISO specification which defines a restricted subset of PDF. Initiated in 2002 and published in 2005, PDF/A sets out the basic rules for a standard which can guarantee long-term archiving of documents.

The PDF/A initiative was kicked off in 2002 by AIIM (Association for Information and Image Management), NPES (National Printing Equipment Association) and the Administrative Office of the U.S. Courts. By 2005, PDF/A had been published as ISO 19005-1, making it the cornerstone standard for electronic document file formats for long-term archiving and preservation. Today, AIIM provides the lead on the PDF/A ISO Standard, and the PDF/A Competence Center is the major industry association supporting PDF/A, especially in Europe, where adoption rates are higher than in North America. With all this in mind, it is easy to understand why the PDF/A standard is increasingly being required by governments and implemented by industries around the world.

The PDF/A standard is “a file format based on PDF which provides a mechanism for representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems used for creating, storing or rendering the files”.

The current PDF/A specification, PDF/A-1, is based on the PDF 1.4 specification and has two conformance levels. Adoption of the first level (PDF/A-1a) ensures the preservation of a document’s logical structure and content text stream in natural reading order. This is critical when the document is displayed on a mobile device (for example a PDA) or other devices. This feature is commonly known as “Tagged PDF”. Some PDFs are created with sufficient information to meet this requirement; many PDFs created by production business processes do not contain this information and so fall into the second level.

The second level of compliance is referred to as PDF/A-1b. This level is the minimal standard that ensures the rendered visual appearance of the file is reproducible over the long term. Specifically, PDF/A-1b ensures that the text (and additional content) can be correctly displayed (e.g. on a computer monitor or in hard copy), but does not guarantee that extracted text will maintain the same structure as presented in the original document.
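A file states which of these two levels it claims in its XMP metadata, through the PDF/A identification schema: the pdfaid:part property holds the part number and pdfaid:conformance holds the level letter. The following is a minimal sketch in Python of reading that claim. The function name claimed_pdfa_level is hypothetical, and the sketch assumes the metadata is stored as plain, uncompressed UTF-8 text using the conventional pdfaid prefix. It only reports what the file declares; it does not check that the file actually satisfies the standard.

```python
import re
from typing import Optional, Tuple

# Namespace of the PDF/A identification schema used in XMP metadata.
PDFA_ID_NAMESPACE = b"http://www.aiim.org/pdfa/ns/id/"

# XMP may write a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are accepted here.
_PART = re.compile(
    rb"<pdfaid:part>\s*(\d)\s*</pdfaid:part>|pdfaid:part\s*=\s*[\"'](\d)[\"']"
)
_CONFORMANCE = re.compile(
    rb"<pdfaid:conformance>\s*([A-Za-z])\s*</pdfaid:conformance>"
    rb"|pdfaid:conformance\s*=\s*[\"']([A-Za-z])[\"']"
)


def claimed_pdfa_level(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (part, level) if the bytes declare a PDF/A level, else None.

    Assumption: the XMP packet is uncompressed UTF-8, so a plain byte
    search can find it. This reads the declaration only; it is not a
    conformance check.
    """
    if PDFA_ID_NAMESPACE not in data:
        return None
    part = _PART.search(data)
    conformance = _CONFORMANCE.search(data)
    if part is None or conformance is None:
        return None
    part_value = part.group(1) or part.group(2)
    level_value = conformance.group(1) or conformance.group(2)
    return int(part_value), level_value.decode("ascii").upper()
```

Passing the raw bytes of a file (for example, the result of open(path, "rb").read()) returns a pair such as (1, "A") or (1, "B"), or None when no declaration is found. A file that claims PDF/A-1b may still fail validation, so a result from this sketch is a starting point for verification rather than proof of conformance.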

Although it was created for long-term archiving, this standard may also serve as the basis for all delivery and reproduction requirements, including high-volume printing. Companies using this technology naturally want a guarantee that their documents can be reproduced correctly, and a standard of this kind eliminates all the issues mentioned above. In other words, it is a standard allowing documents to be printed, archived and displayed in the same way every time, whether now or in 100 years’ time, regardless of the type of medium (paper, DVD etc.), the viewer, or the type of computer used to display them (Windows, Mac, Linux etc.).

How to become PDF/A compliant

Most companies which use PDF nowadays do not generate PDF/A files, and many are unaware of the aims and obligations associated with using an ISO standard format like PDF/A. However, processes and software are now becoming available on the market which can verify a PDF file and convert it into a PDF/A file. This gives companies generating these files the guarantee that their documents will not change over time and can be printed or displayed in the same way every time, regardless of the computer or printer used to reproduce them.
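As one illustration of such a conversion step, the sketch below assembles a command line for Ghostscript, an open-source tool whose pdfwrite device can be asked for PDF/A-1 output. The choice of Ghostscript is an assumption made for this example, not a tool named in this article, and the helper name build_pdfa_command and the file names are hypothetical. The options shown follow the basic invocation in the Ghostscript documentation; they vary between versions, and the PDFA_def.ps definition file normally has to be edited to point at an ICC colour profile.

```python
from typing import List


def build_pdfa_command(source_pdf: str, target_pdf: str,
                       definition_file: str = "PDFA_def.ps") -> List[str]:
    """Assemble a Ghostscript command line that requests PDF/A-1 output.

    Assumption: Ghostscript is installed as "gs" and definition_file is
    a copy of the PDFA_def.ps template shipped with it, adjusted to
    reference a valid ICC profile.
    """
    return [
        "gs",
        "-dPDFA=1",                       # ask for PDF/A-1 output
        "-dBATCH", "-dNOPAUSE",           # run unattended, then exit
        "-sDEVICE=pdfwrite",              # write a new PDF file
        "-sColorConversionStrategy=RGB",  # convert colours to one colour space
        f"-sOutputFile={target_pdf}",
        definition_file,                  # output intent and metadata setup
        source_pdf,
    ]
```

The resulting list can be passed to subprocess.run(command, check=True). Whatever converter is used, the output should then be checked with an independent validation tool, since a successful run does not in itself prove that the result conforms.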

There are a variety of software packages on the market for converting PDF to PDF/A, but at the moment none of them can guarantee that it will resolve every PDF document management issue. Fonts and colour profiles are particularly difficult, especially when the document was generated on a different platform from the one used for the conversion process. Nevertheless, this is the way to go. The aim is to arrive at a universally recognised standard which makes vendors, software houses, service providers etc. follow the same rules, thereby providing a guarantee both for anyone who creates purely electronic documents and for anyone who has to print them.

There is still a long way to go, but the guidelines have now been set out. There is definitely still a great deal to be done in providing information about the benefits of such standardisation. The new standard needs to be promoted at every level, among those involved in this area, as well as to both public and private companies.

About Raffaele Bernardinello

ICT Director, C.M. Trading S.r.l.
