About the contributor
Olaf Drümmer

Olaf Drümmer is founder and managing director of callas software, a company based in Berlin, Germany, that specializes in PDF analysis and processing, and of axaio software, a developer of software extensions for Adobe InDesign and QuarkXPress. In addition, he has been actively involved in the development of PDF standards within ISO since 1999.

Session Intro – Track 3: Metadata – Data about PDF/A Data


When, around 2002, the committee that created the PDF/A standard decided to rely on XMP (Extensible Metadata Platform) for the inclusion of metadata in PDF/A files, its members probably had no clue which developments this would kick off in the years to come. XMP was a brand new technology at the time: invented by Adobe, first presented in 2001, and published as an open specification for anyone to use. Very few fully understood the implications.

When developers – especially those who work together in the Technical Working Group of the PDF/A Competence Center – realized the impact of these “metadata in PDF/A” provisions, not all of them had a smile on their face. At first glance these provisions appeared to be overly strict and unnecessarily convoluted. Metadata has rarely been fun to deal with, but now for many it seemed to turn from inconvenient to painful.

But don’t be fooled! By now support for XMP metadata has been implemented in numerous applications, many of which offer very elegant ways to review or enter XMP metadata. Quite frequently a user wouldn’t even realize that metadata are stored in XMP format; it just works. Software vendors, too, have developed a much better understanding of how best to deal with metadata, both in the form of XMP and in the context of previously introduced formats like IPTC or EXIF.

Admittedly the matter of metadata is slightly more demanding in PDF/A, and for a good reason: the PDF/A standard manages to make the metadata side of PDF-based archiving as reliable as the content side. While this comes at a price, it does not pose insurmountable problems. Some familiarity with the principles of metadata in general, and those of XMP in particular, will be a prerequisite though.

The same way everyone in the software industry has by now had to “learn” Unicode in order to deal with text intelligently, everyone now has to become fluent in metadata in the form of XMP, at least when dealing with PDF, and specifically PDF/A for archiving. And for the same reason we do not want to go back to “before Unicode”, we will not want to go back to “before XMP”: too much valuable information would be lost, and too much ambiguity would persist.

Of course there is no rose without a thorn. Organisations that have embarked on the journey to “XMP metadata in PDF/A land” have had to learn a lesson or two, though I dare say it has always been worth the effort. At least that’s the conclusion that both Jan-Michael Rehn from Bosch and Klaus Baumeister from the European Patent Office reach when they explain their respective projects using XMP metadata in PDF/A files.

Those who always thought there is no fun ever to be had with metadata may want to revisit their position after taking in the thoughts of Frank Biederich from Adobe, who provides the introduction to the metadata track.
Guidance towards “XMP Metadata in PDF/A” on a more technical level is offered by Thomas Merz from PDFlib, who has succeeded many times on previous occasions in making it easy for non-geeks to understand the essentials of the XMP format and its use in PDF/A.


Tags: 3rd International PDF/A Conference, Proceedings, XMP, metadata
Categories: Government, PDF/A, XMP