The Portable Document Format turns 30 in 2023, as some have already noticed.
In a world dominated by web technology, users continue to agree that PDF offers essential capabilities. PDF continues to trend upwards in Google searches, even though the original specification is older than many of its end users… or implementers.
As PDF technology approaches its fourth decade, PDF Association membership is at historic highs. This has happened even as the technology's ecosystem has experienced substantial consolidation, with companies (e.g., Allyant, Foxit, Kofax, PDFTron, PSPDFKit and Smallpdf) acquiring a wide variety of PDF companies and products, with more acquisitions on the horizon.
Some of these changes represent natural turnover as entrepreneurs leave the companies that they founded decades ago for new projects… or no projects. A larger factor is private equity, which has noticed that PDF plays a vital role in the global economy and believes that PDF's best days are ahead.
They are right about that.
Over the past year, PDF’s industry association was busier than ever, adding new technical working groups, features and projects to meet the requests of developers and other stakeholders seeking to understand and leverage standardized PDF technology. Notable projects in 2022 included the following:
The PDF Association’s technical and liaison working groups have developed and published test suites, specifications, best practice guides and more going back to 2006. Over the years, in response to members’ requests, the organization has devoted ever more resources to meeting stakeholders’ technical needs, and to providing a forum for developers to debate the future of PDF and its subsets, extensions, and other means of delivering on marketplace requirements.
In November the Board of Directors decided that the PDF Association will embark on a new phase in its evolution to better meet industry’s technical needs by “delivering a vendor-neutral platform for developing open specifications and standards for PDF technology”. This is, in fact, the organization’s new mission statement.
2006: The PDF/A Competence Center was launched in Germany with the mission of promoting shared understanding and adoption of PDF/A (ISO 19005).
2010: The organization broadened its scope to include all ISO standards for PDF technology and integrated a new system of technical committees.
2019: The PDF Association acquired the ISO standards program for TC 171 SC 2 from the 3D PDF Consortium.
2023: We will increase our focus on supporting PDF’s technical community, including as a standards development organization (SDO).
The Board of Directors has chosen to express this new focus with a new vision statement:
"Driving the world’s digital document format into the 21st century and beyond."
The next phase in the PDF Association's evolution will tell that story.
First and foremost, the PDF Association has always been about industry collaboration to support PDF's stakeholders, from developers and software publishers to corporations, government agencies and individual users. Starting in 2023, the PDF Association will revise its in-person meeting model to focus on technical value. The new format, PDF Week, will integrate in-person Technical and Liaison Working Groups (TWGs and LWGs) with meetings of ISO TC 171 SC 2, in which PDF Association members and other attendees may participate.
We’ll be expanding our use of GitHub to broaden the scope and improve the responsiveness of industry collaboration across a wide range of PDF-related interest areas, from specifications to test suites.
Much more information about these new initiatives will be forthcoming in the weeks and months ahead. It’s an exciting time for this globally significant technology.
Back in 2019 the Board considered that DARPA's SafeDocs fundamental research program was both an opportunity and a potential source of problems. The program might achieve outcomes of value to some PDF stakeholders… or it might cause issues for the PDF industry through a lack of understanding. As a result, the PDF Association chose to bid as a prime contractor on SafeDocs to guide researchers in their understanding of PDF so as to ensure net benefits for everyone.
Our speculation that the DARPA research could potentially result in useful “intermediate artifacts” for PDF stakeholders was proven correct. This isn’t the place to review SafeDocs’ artifacts and outcomes in detail (see Peter Wyatt’s presentation at PDF Days Europe 2022 and his other SafeDocs articles on pdfa.org for that), but for present purposes I’ll highlight the Arlington PDF Model, an open source, complete, machine-readable DOM, derived from the PDF 2.0 specification, as it is at the core of our SafeDocs Phase 3 research.
Thanks to SafeDocs, PDF Association CTO Peter Wyatt will spend 2023 leading a select team of PDF experts and developers to perform, if you will, a tear-down and rebuild of the core PDF specification under the banner of "revolutionizing trust means revolutionizing file format specifications".
Starting from the Arlington PDF Model, Peter’s team will comprehensively study every aspect of the extant specification as they prototype various approaches to systematizing the expression of facts, provisions and requirements, from simple TSV files (today’s Arlington PDF Model), to EBNF, interactive examples, and a much wider range of machine-readable assets and interactive, audience-sensitive delivery mechanisms. Of course, the “new look” PDF specification must also be capable of producing traditional “static” ISO-compliant documents as well as more modern and dynamic equivalents supporting easy “on ramps” to PDF technology for software developers and other stakeholders. This process will undoubtedly bring to light errors, ambiguities, and inconsistencies in the formal definition of PDF; these the PDF Association will continue to resolve via the PDF TWG.
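To give a feel for what "machine-readable" can mean in practice, here is a minimal Python sketch of requirements expressed as TSV rows and then checked automatically. Everything in it is an assumption made for illustration: the three-column layout (`Key`, `Type`, `Required`), the sample rows, and the helper names `load_model` and `check_dictionary` are hypothetical and deliberately much simpler than the real Arlington PDF Model files.

```python
import csv
import io

# Hypothetical, simplified TSV rows for one PDF dictionary.
# This is NOT the actual Arlington PDF Model schema; it only
# illustrates the idea of rules stored as tab-separated data.
MODEL_TSV = (
    "Key\tType\tRequired\n"
    "Type\tname\tTRUE\n"
    "Pages\tdictionary\tTRUE\n"
    "Lang\tstring\tFALSE\n"
)


def load_model(tsv_text):
    """Parse TSV text into {key: {"type": ..., "required": ...}}."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["Key"]: {"type": row["Type"], "required": row["Required"] == "TRUE"}
        for row in reader
    }


def check_dictionary(model, obj):
    """Compare obj, a mapping of key -> (type_name, value), against the model.

    Returns a list of human-readable problems, empty if none were found.
    """
    problems = []
    for key, rule in model.items():
        if key not in obj:
            if rule["required"]:
                problems.append(f"missing required key: {key}")
            continue
        actual_type = obj[key][0]
        if actual_type != rule["type"]:
            problems.append(
                f"wrong type for {key}: expected {rule['type']}, got {actual_type}"
            )
    return problems


if __name__ == "__main__":
    model = load_model(MODEL_TSV)
    # A made-up object that omits Pages and gives Lang the wrong type.
    sample = {"Type": ("name", "Catalog"), "Lang": ("name", "en-US")}
    for problem in check_dictionary(model, sample):
        print(problem)
```

The point of the sketch is only that once requirements live in data rather than prose, the same rows can drive validators, documentation, and tests without anyone re-interpreting the text by hand.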
In reflecting on the SafeDocs experience to date, Peter Wyatt says:
Although the originally stated SafeDocs ‘moonshot’ goal was the automatic generation of provably safe PDF parsers from a precise and formal data definition language, we have come to realize that such parsers can only ever be as good as the human interpretation of file format specifications that forms the ultimate “ground truth”.
By shining bright lights on the PDF spec from diverse perspectives, SafeDocs researchers highlighted over 110 issues in the PDF specification and helped move the industry forward by triggering open discussions on identifying and resolving ambiguities and assumptions, and on taking a stronger security posture. Phase 3 of SafeDocs intends to prototype new methods of delivering the core PDF specification to all stakeholders but with new modalities and “on ramps” supporting a better understanding with fewer ambiguities, supported by machine-readable assets, regardless of how much a priori PDF knowledge or expertise a reader might have.
PDF is a quiet technology; it’s not “cool”. Few want to think about it at all. Some developers, some software publishers, some organizations, and some end users think about PDF a lot because they have realized their reliance on, or have seen opportunities for, this technology… in fact, usually, both. We provide a home for this interest.
Most end users simply assume that PDF works… and they are right. PDF works everywhere, every day. Trillions in existence; hundreds of millions of new PDF documents every hour. Publications, contracts, receipts, resumes, parking tickets, artwork, food packaging, labels, brochures, engineering drawings, books, tax forms, laws … the world runs on PDF.
It’s a good time to be a member of the PDF Association.