Olaf Drümmer is founder and managing director of callas software, a Berlin, Germany-based company specializing in PDF analysis and processing, and of axaio software, a developer of software extensions for Adobe InDesign and QuarkXPress. In addition, he has been actively involved in the development of PDF standards within ISO since 1999.
Over 80 attendees at PDF/A seminar in Oslo
PDF/A was the topic of the day at the PDF/A seminar in Oslo on April 17, 2012. Organized by Per Arne Flatberg from Palografen, in cooperation with the PDF Association, and with support from the Riksarkivet, it attracted over 80 attendees from all over Norway.
The seminar venue could not have been better chosen: the seminar was held at the main building of the Norwegian National Archives in Oslo, built to preserve all important documents of Norway for eternity.
The seminar agenda
The agenda covered all important aspects of PDF/A and its use in the real world:
PDF/A in law and in the archives (Anthony Lærdahl, Digital Format Manager at the National Archives of Norway)
From paper to PDF/A (Thomas Zellmann, Managing Director, PDF Association)
Born Digital: PDF/A as the storage format for digital files (Olaf Drümmer, Chairman of the PDF Association and CEO of callas software)
Get it to work: principles of PDF/A workflows (panel discussion with Olaf Drümmer, David van Driessche and Thomas Zellmann)
Attendees were keen to learn about PDF/A and asked numerous questions after each presentation. Of special interest were how PDF copes with «bit rot», how well PDF/A goes together with digital signatures, how to archive formulas and macros in spreadsheets, when to validate PDF/A files for compliance, and what options exist in the field of engineering to accommodate not only two-dimensional views of CAD drawings but also three-dimensional models for interactive inspection.
Some attendees appreciated that TIFF files are not very vulnerable to «bit rot»: in the rare case where, for whatever reason, a single bit gets corrupted, a TIFF file will still be usable in most cases, as probably only one pixel will be affected. It was pointed out, though, that PDF is an object-oriented format: if one object gets damaged through a flipped bit, it is usually still possible to reconstruct everything else in the PDF file. No conclusion was reached as to whether PDF is as robust as TIFF when it comes to «bit rot».
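The two arguments can be illustrated with a toy model. The sketch below is not a TIFF or PDF parser: it assumes an uncompressed 8-bit raster for the image case and a hypothetical container of independently compressed objects for the PDF-like case, and the names `flip_bit` and `try_decompress` are invented for this illustration.

```python
import zlib


def flip_bit(data: bytes, byte_index: int, bit: int = 0) -> bytes:
    """Return a copy of data with a single bit inverted."""
    damaged = bytearray(data)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


def try_decompress(blob: bytes):
    """Return the decompressed bytes, or None if the stream is unreadable."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


# Image case (assumption: uncompressed raster, one byte per pixel).
raster = bytes(range(100))
damaged_raster = flip_bit(raster, 42)
changed_pixels = sum(a != b for a, b in zip(raster, damaged_raster))

# Object case (assumption: every object is compressed on its own,
# so damage to one stream cannot spread to its neighbours).
originals = [f"content of object {i} ".encode() * 50 for i in range(5)]
stored = [zlib.compress(original) for original in originals]
stored[2] = flip_bit(stored[2], len(stored[2]) // 2)
intact = [
    try_decompress(blob) == original
    for blob, original in zip(stored, originals)
]

print("pixels changed in raster:", changed_pixels)
print("objects still intact:", intact)
```

In this toy setup the flipped bit changes exactly one pixel of the raster, while in the container it costs the whole third object but leaves the other four recoverable. Which behaviour is preferable depends on the content, which is consistent with the seminar not reaching a conclusion.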
While PDF/A fully supports digital signatures, it became obvious that in order to archive digitally signed PDFs as PDF/A, it is important to first ensure that the file to be signed already conforms to PDF/A. If it does not, which unfortunately is often the case in the real world, it is by definition not feasible to convert a digitally signed PDF into PDF/A without breaking the signature. One fallback approach could be to take advantage of PDF/A-3 and to embed the digitally signed PDF inside a PDF/A-conforming rendition of the same file.
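Why conversion breaks a signature can be shown with a plain digest. This is a simplified sketch, not an implementation of PDF signatures: a real signature is a cryptographic signature over a byte range of the file, whereas the code below only models the digest over the covered bytes. The byte strings and the `convert_to_pdfa` function are hypothetical stand-ins.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest of the bytes a signature would cover."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for a signed, non-PDF/A file.
signed_pdf = b"%PDF-1.4\n1 0 obj << /BaseFont /Helvetica >> endobj\n%%EOF\n"
digest_at_signing = digest(signed_pdf)


def convert_to_pdfa(data: bytes) -> bytes:
    """Hypothetical conversion: rewrites bytes inside the signed range,
    as embedding a font or adding an output intent would."""
    return data.replace(b"/Helvetica", b"/Helvetica-Embedded")


converted = convert_to_pdfa(signed_pdf)
converted_matches = digest(converted) == digest_at_signing

# PDF/A-3 style fallback: keep the signed file untouched as an embedded
# file next to the converted rendition (modelled here as a dict).
container = {"pdfa_rendition": converted, "embedded_original": signed_pdf}
embedded_matches = digest(container["embedded_original"]) == digest_at_signing

print("converted file still matches:", converted_matches)
print("embedded original still matches:", embedded_matches)
```

The converted rendition no longer matches the digest taken at signing time, while the embedded original does, which is the reasoning behind the PDF/A-3 fallback.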
Archiving spreadsheets from programs such as Microsoft Excel or OpenOffice Calc can already be a challenge in itself: by default each table ends up as a separate PDF on export, and it can be quite cumbersome to take the different sizes of the required print areas into account between files and even between tables inside the same spreadsheet file. It is even less straightforward to also archive information that would usually not appear on a printout but is as important for archival as the table content itself: formulas and macros. Three options were discussed. One is to turn all formulas and macros into text and associate them with the respective parts of the tables as annotations, where the contents field of the annotation contains the formula or macro. Another is to create a virtual printout of formulas and macros and include it as additional pages in the PDF/A file. Last but not least, it might be useful in some organisations to create a PDF/A-3 file from the spreadsheet and to embed the original spreadsheet file, with all formulas and macros intact, as an associated file. Associated files are a mechanism specific to part 3 of PDF/A which makes it possible to include non-PDF/A files inside a PDF/A file. It has to be pointed out, though, that the archival quality of arbitrary associated files is undefined: in this example, nobody would know how well a Microsoft Excel 2010 file could be processed 10 or 50 years from now.
An important topic in all PDF/A discussions is the question of validation. In all cases where an archive has no control over how incoming files are prepared, it is mandatory that each and every incoming file is validated against the PDF/A standard. Several vendors in the PDF Association offer validation tools, and these vendors have worked together over the last few years in order to achieve a high degree of inter-instrument agreement. In other words, in almost all cases these validation tools arrive at the same result when validating PDF/A files. It is important, though, to use the most recent version of these tools. Where organisations have highly standardized processes for creating PDF/A files, as is typical for scanning paper documents to PDF/A, it is usually sufficient to carry out process validation on an ongoing basis, that is, each time anything in the process is changed or a component in the scanning solution is updated, together with validation of samples at regular intervals.
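The validation policy described above can be sketched as a small selection rule. The function name, its parameters and the sampling interval are assumptions made for this illustration, and the sketch treats "process validation" after a change as validating the whole next batch, which is only one possible reading. The actual check would be delegated to a PDF/A validation tool, which is not modelled here.

```python
from typing import List


def files_to_validate(
    files: List[str],
    controlled_process: bool,
    process_changed: bool = False,
    sample_every: int = 100,
) -> List[str]:
    """Select which incoming files should be sent to a PDF/A validator.

    - No control over how files were produced: validate every file.
    - Controlled process that was just changed or updated: validate the
      whole batch (assumed reading of "process validation").
    - Controlled, unchanged process: validate a regular sample only.
    """
    if not controlled_process or process_changed:
        return list(files)
    return files[::sample_every]


batch = [f"scan_{i:04d}.pdf" for i in range(250)]

print(len(files_to_validate(batch, controlled_process=False)))
print(len(files_to_validate(batch, controlled_process=True)))
print(len(files_to_validate(batch, controlled_process=True, process_changed=True)))
```

With a batch of 250 files and a sampling interval of 100, the controlled and unchanged case selects three files, while the other two cases select all of them.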
Archiving engineering documents can be covered quite well by PDF/A as long as no 3D models are to be archived. For archiving 3D models, and not just selected 2D views of the model, users will have to wait for the upcoming PDF/E-2 standard, currently being developed by ISO TC 171 SC 2. Expected for release in late 2013 or early 2014, it will enable engineering users to also archive 3D models encoded as either U3D or PRC. U3D is an ECMA standard that is supported by PDF 1.7. PRC is about to be approved as an ISO standard in summer 2012, and will be supported by the upcoming PDF 2.0 standard.
About the National Archives (Riksarkivet)
The National Archives of Norway (Riksarkivet) is responsible for keeping records from government agencies, making material available for use, supervising the work of the archives at the state, county and municipal level, and ensuring that contributions to private archives are preserved. The National Archives safeguards Norway’s collective memory and continuously receives large amounts of data.
Palografen is a Bergen-based knowledge center that works extensively with PDF formats and PDF-based workflows. Palografen is the first Norwegian organization to have become a member of the PDF Association.