Olaf Drümmer is founder and managing director of callas software, a Berlin, Germany-based company specializing in PDF analysis and processing, and of axaio software, developer of software extensions for Adobe InDesign and QuarkXPress. In addition, he has been actively involved in the development of PDF standards within ISO since 1999.
Over 80 attendees at PDF/A seminar in Oslo
PDF/A was the topic of the day at the PDF/A seminar in Oslo on April 17, 2012. Organized by Per Arne Flatberg from Palografen, in cooperation with the PDF Association, and with support from the Riksarkivet, it attracted over 80 attendees from all over Norway.
The seminar venue could hardly have been better chosen: the seminar was held at the main building of the Norwegian National Archives in Oslo, built to preserve all important documents of Norway for eternity.
The seminar agenda
The agenda covered all important aspects of PDF/A and its use in the real world:
PDF/A in law and in the archives (Anthony Lærdahl, Digital Format Manager at the National Archives of Norway)
From paper to PDF/A (Thomas Zellmann, Managing Director, PDF Association)
Born Digital: PDF/A as the storage format for digital files (Olaf Drümmer, Chairman of the PDF Association and CEO of callas software)
Get it to work. Principles of PDF/A workflows (panel discussion with Olaf Drümmer, David van Driessche and Thomas Zellmann)
Attendees were keen to learn about PDF/A and asked numerous questions after each presentation. Of special interest were how PDF copes with «bit rot», how well PDF/A works with digital signatures, how to archive formulas and macros in spreadsheets, when to validate PDF/A files for compliance, and what options exist in engineering to accommodate not only two-dimensional views of CAD drawings but also three-dimensional models for interactive inspection:
Some attendees appreciated that TIFF files are not very vulnerable to «bit rot»: in the rare case where, for whatever reason, a single bit gets corrupted, a TIFF file will still be usable in most cases, as probably only one pixel will be affected. It was pointed out, though, that PDF is an object-oriented format, so if one object is damaged by a flipped bit, it is usually still possible to reconstruct everything else in the PDF file. No conclusion was reached on whether PDF is as robust as TIFF when it comes to «bit rot».
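The TIFF side of that argument can be illustrated with a small sketch. It assumes a hypothetical 4x4 grayscale image stored uncompressed with one byte per pixel; the function names and the chosen bit position are illustrative only. Compressed image data, or damage to a file header, would behave differently, so this shows only the simplest case that attendees had in mind.

```python
def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def changed_pixels(original: bytes, damaged: bytes) -> list[int]:
    """Indices of pixels (one byte per pixel) whose value differs."""
    return [i for i, (a, b) in enumerate(zip(original, damaged)) if a != b]


# Hypothetical 4x4 grayscale image: 16 pixels, uncompressed, one byte each.
pixels = bytes(range(0, 160, 10))

# Simulate «bit rot»: invert bit 42, which lies in byte 5 (pixel 5).
damaged = flip_bit(pixels, bit_index=42)

# Only the pixel containing the flipped bit has changed.
affected = changed_pixels(pixels, damaged)
print(affected)
```

In this uncompressed layout, one flipped bit alters exactly one pixel and leaves the rest of the image untouched.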
While PDF/A fully supports digital signatures, it became obvious that in order to archive digitally signed PDFs as PDF/A, it is important to ensure that the file to be signed already conforms to PDF/A. If it does not, which unfortunately is often the case in the real world, it is by definition not feasible to convert a digitally signed PDF into PDF/A without breaking the signature. One fallback approach could be to take advantage of PDF/A-3 and embed the digitally signed PDF inside a PDF/A-conforming rendition of the same file.
Archiving spreadsheets from programs such as Microsoft Excel or OpenOffice Calc can be a challenge in itself: by default each table ends up as a separate PDF on export, and it can be quite cumbersome to take the different sizes of the required print areas into account, both between files and between tables inside the same spreadsheet file. It is even less straightforward to also archive information that would usually not appear on a printout but is as important for archival as the table content itself: formulas and macros. Three options were discussed. One is to turn all formulas and macros into text and associate them with the respective part of the tables as annotations, where the contents field of the annotation contains the formula or macro. Another is to create a virtual printout of the formulas and macros and include it as additional pages in the PDF/A file. Last but not least, it might be useful in some organisations to create a PDF/A-3 file from the spreadsheet and embed the original spreadsheet file, with all formulas and macros intact, as an associated file. Associated files are a mechanism specific to part 3 of PDF/A that makes it possible to include non-PDF/A files inside a PDF/A file. It has to be pointed out, though, that the archival quality of arbitrary associated files is undefined: in this example, nobody knows how well a Microsoft Excel 2010 file can be processed 10 or 50 years from now.
An important topic in all PDF/A discussions is validation. In all cases where an archive has no control over how incoming files are prepared, it is mandatory that each and every incoming file is validated against the PDF/A standard. Several vendors in the PDF Association offer validation tools, and these vendors have worked together over the last few years to achieve a high degree of inter-instrument agreement. In other words, in almost all cases these validation tools arrive at the same result when validating PDF/A files. It is important, though, to use the most recent version of these tools. Where organisations have highly standardized processes for creating PDF/A files, as is typical for scanning paper documents to PDF/A, it is usually sufficient to carry out process validation on an ongoing basis, that is, each time anything in the process is changed or a component in the scanning solution is updated, together with validation of samples at regular intervals.
Archiving engineering documents can be covered quite well by PDF/A as long as no 3D models are to be archived. For archiving 3D models, and not just selected 2D views of the model, users will have to wait for the upcoming PDF/E-2 standard, currently being developed by ISO TC 171 SC 2. Expected for release in late 2013 or early 2014, it will enable engineering users to also archive 3D models encoded as either U3D or PRC. U3D is an ECMA standard that is supported by PDF 1.7. PRC is about to be approved as an ISO standard in summer 2012 and will be supported by the upcoming PDF 2.0 standard.
About the National Archives (Riksarkivet)
The National Archives of Norway (Riksarkivet) is responsible for keeping records from government agencies, making material available for use, supervising the work of the archives at the state, county and municipal level, and making sure that contributions to private archives are preserved. The National Archives safeguards Norway’s collective memory and continuously receives large amounts of data.
Palografen is a Bergen-based knowledge center that works extensively with PDF formats and PDF-based workflows. Palografen is the first Norwegian organization to become a member of the PDF Association.