The year 2011 is going to be an interesting one. Not only will PDF/A see the addition of a second part – “PDF/A-2” – to catch up with technological developments; some related standards will also enter the stage.
A new standard – called PDF/UA – will address the accessibility of PDF documents. It has been in development for about five years and is due to be finalised and published in 2011.
The world of metadata for file formats like PDF will see a very important step as well: Adobe is currently releasing their XMP (Extensible Metadata Platform) specification to ISO, and if all goes well, XMP will be an official ISO standard in 2011.
The PDF/A committee in ISO always saw the need not only to cover archiving in the sense of visual reproducibility, but also to capture and preserve as much of the semantics and content structure as possible. To achieve this, conformance level “A” was introduced (although some say the letter “A” stands for “advanced”, linking it to “accessible” may be more apt). Level “A” requires that text can be mapped to Unicode and that the semantic structure of the content – for example its reading order – is reflected in the tags used to structure the PDF. What the PDF/A committee was not able to do (and actually never attempted) was to offer strict rules, or even guidance, on how to ensure or enforce good-quality tagging. Nor does PDF/A contain any provisions on how best to take advantage of the structure information inside a tagged PDF.
This is exactly what PDF/UA takes care of. The preparation of the PDF/UA standard has taken quite some time for a reason: it was very important to the PDF/UA committee to get as much as possible right the first time around. Not only should the use of PDF/UA lead to better-structured PDF; getting there should also remain feasible and cost-efficient, whether for those producing tagged PDF content, for software vendors developing tools for the creation of well-structured PDF, or for makers of assistive technology.
The two accessibility presentations in this track will bring you up to speed regarding accessibility in PDF, as well as PDF/A, and beyond. Both speakers – David Hook, Director Product Management at Crawford Technologies and Duff Johnson, CEO, Appligent Document Solutions – have been involved in accessibility for a very long time, and both have actively contributed to the development of PDF/UA.
During the past decade, several PDF-related standards began to use XMP metadata instead of the simpler and less powerful “Document Information” mechanism, which is essentially a list of key-value pairs in PDF syntax. Even PDF itself is moving towards using XMP exclusively for any general-purpose metadata. The next version of PDF, to be called PDF 2.0 and to be published as ISO 32000-2 in late 2011 or in 2012, will deprecate the use of the Document Information mechanism.
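To make the difference between the two mechanisms more concrete, the following minimal Python sketch takes a handful of Document Information style key-value pairs and expresses them as an XMP (RDF/XML) fragment. It is an illustration only, not a complete or validating implementation: the function name `info_to_xmp` and the sample values are hypothetical, only four keys are handled, the key-to-property mapping shown is the commonly used one (for example, Title to `dc:title`), and the `xpacket` wrapper that surrounds a real XMP packet inside a PDF is omitted.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP for the properties in this sketch.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Ask ElementTree to serialise with the conventional prefixes.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Build a namespace-qualified tag name in ElementTree's {uri}local form."""
    return "{%s}%s" % (NS[prefix], local)


def info_to_xmp(info):
    """Hypothetical helper: map a few Document Information keys to XMP.

    `info` is a plain dict standing in for a PDF Document Information
    dictionary such as << /Title (Annual Report) /Author (Jane Doe) >>.
    """
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    if "Title" in info:
        # dc:title is a language alternative: one entry per language.
        alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
        li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
        li.text = info["Title"]

    if "Author" in info:
        # dc:creator is an ordered list of names.
        seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
        ET.SubElement(seq, q("rdf", "li")).text = info["Author"]

    # Simple text properties in the Adobe PDF namespace.
    for key in ("Keywords", "Producer"):
        if key in info:
            ET.SubElement(desc, q("pdf", key)).text = info[key]

    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    sample = {
        "Title": "Annual Report",
        "Author": "Jane Doe",
        "Keywords": "archive, PDF/A",
        "Producer": "ExampleWriter",  # made-up producer name
    }
    print(info_to_xmp(sample))
```

Even in this small example the XMP side carries structure that a flat key-value list cannot: the title is tagged with a language, and the author is one entry in an ordered list that could hold several names.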
While some communities, such as photographers, make very active use of XMP for embedding information like copyright notices, keywords or descriptions, XMP metadata inside PDF files has not yet become as prominent or as widely used in the world of document management and archiving. A number of companies that switched to XMP from other ways of associating metadata with their documents have nevertheless found substantial advantages in doing so. In today’s world, where the exchange of structured data using XML-based formats is more widely understood than it was ten years ago, XMP turns out to be a natural fit for most metadata needs. The metadata sessions track will help organisations get started with metadata inside PDF and PDF/A, and will illustrate the power and cost efficiency of XMP compared with other approaches.