PDF/UA is the ISO standard published as ISO 14289 in July 2012 defining the creation and processing of accessible PDF. This article is directed primarily at implementers, quality assurance (QA) and technical product managers interested in supporting accessibility in PDF. It describes the purpose and function of the Matterhorn Protocol, and explains how developers may use this document to address PDF/UA conformance in a systematic and reliable manner.
As a format, PDF was designed to provide reliable, high-quality visual representation of any two-dimensional content, regardless of peculiarity of design, source format or viewing environment. PDF does this better than any other technology; indeed, it has no serious competition.
Accessibility, however, is a broad and complex subject. In addition, the flexibility of PDF forces developers to cover an exceptionally wide range of use cases. At the same time, the relative vagueness of Tagged PDF's definition in ISO 32000-1 has not encouraged third-party development.
The PDF/UA standard provides developers with a clear road map to understanding how to do Tagged PDF right, but nonetheless requires substantial research for those who aren't already familiar with accessibility requirements. Most major PDF software developers are willing to implement Tagged PDF, but they want to know what's really important. That's where the Matterhorn Protocol fits in.
A PDF Association publication, the Matterhorn Protocol specifies all possible ways to fail PDF/UA. As such, it's a set of algorithms providing the practical rules for implementing software that creates, processes or presents accessible PDF.
The Matterhorn Protocol helps set priorities in both research and execution. It enables developers not yet fully familiar with every detail in PDF/UA to get to work right away, accelerating all aspects of code and product development.
The basic approach for implementing the Matterhorn Protocol is to map the Failure Conditions to the various tasks implied by the specific word-processing, content extraction or other context.
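As a rough illustration of that mapping, the sketch below assumes each Failure Condition is labelled as either checkable by software or requiring human judgement. The identifiers (`FC-A` and so on), the descriptions and the task names are hypothetical placeholders, not entries quoted from the Protocol; a real implementation would load the published list.

```python
from dataclasses import dataclass
from enum import Enum


class How(Enum):
    MACHINE = "machine"  # software can decide the condition on its own
    HUMAN = "human"      # a person has to judge the condition


@dataclass(frozen=True)
class FailureCondition:
    index: str        # placeholder identifier, not a real Protocol index
    description: str  # paraphrased for illustration only
    how: How


# Illustrative entries only (assumption); not the Protocol's actual wording.
CONDITIONS = [
    FailureCondition("FC-A", "Figure has no alternative text", How.MACHINE),
    FailureCondition("FC-B", "Alternative text does not describe the figure", How.HUMAN),
    FailureCondition("FC-C", "Table header cell is not tagged as a header", How.HUMAN),
    FailureCondition("FC-D", "Reading order does not match logical order", How.HUMAN),
]

# Hypothetical mapping from a product's tasks to the conditions each task can address.
TASK_MAP = {
    "export_images": ["FC-A", "FC-B"],
    "export_tables": ["FC-C"],
    "export_text_flow": ["FC-D"],
}


def plan_for(task: str) -> dict:
    """Split a task's conditions into automated checks and human review items."""
    by_index = {c.index: c for c in CONDITIONS}
    plan = {"machine": [], "human": []}
    for index in TASK_MAP.get(task, []):
        plan[by_index[index].how.value].append(index)
    return plan


print(plan_for("export_images"))
```

Keeping the mapping as data rather than scattering checks through the code makes it easier to see which tasks still lack coverage and which items will always land in a human review queue.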
PDF creation is the ideal place to implement PDF/UA conformance for many reasons, not least because so many checkpoints requiring human validation may be inferred from the structures created by the author. The PDF generator must ensure that semantic tables in the source, for example, are properly tagged with table tags and attributes in the output PDF.
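The table case can be sketched as follows. This is a minimal illustration, assuming the source application already knows which rows are header rows; `StructElem` is a hypothetical in-memory stand-in for a structure element, not the API of any particular PDF library. The tag names (`Table`, `TR`, `TH`, `TD`) and the `Scope` attribute are the standard ones for Tagged PDF.

```python
from dataclasses import dataclass, field


@dataclass
class StructElem:
    """Hypothetical stand-in for a PDF structure element."""
    tag: str
    attributes: dict = field(default_factory=dict)
    text: str = ""
    kids: list = field(default_factory=list)


def tag_table(rows, header_rows=1):
    """Build a Table > TR > TH/TD tree from a source table.

    `rows` is a list of rows, each a list of cell strings. The first
    `header_rows` rows are treated as column headers (an assumption
    about what the source document records).
    """
    table = StructElem("Table")
    for r, row in enumerate(rows):
        tr = StructElem("TR")
        for cell in row:
            if r < header_rows:
                # The author marked this row as a header, so the generator
                # can emit TH with column scope without asking anyone.
                tr.kids.append(StructElem("TH", {"Scope": "Column"}, cell))
            else:
                tr.kids.append(StructElem("TD", text=cell))
        table.kids.append(tr)
    return table


table = tag_table([["Item", "Quantity"], ["Pens", "12"]])
for tr in table.kids:
    print([(cell.tag, cell.text) for cell in tr.kids])
```

Because the header information comes from the author's own structure, the generator carries it into the output instead of leaving it for later human validation.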
Organizations that adopt accessibility standards want the capacity to check the accessibility status of their websites and PDF files. Implementers will need to consider how to make the human validation component as streamlined as possible while accommodating the variety of cases the software may encounter. See Access for All's PDF Accessibility Checker, PAC 2.0, for the first software implementation of a PDF/UA validator based on the Matterhorn Protocol.
It's always preferable to re-create a PDF rather than edit an existing file to ensure good tagging, but it's not always possible. In many cases, existing tagged PDF files that fail human validation must be corrected rather than re-created from the source application. Depending on the precise design objectives, this sort of implementation can range from trivial to challenging. It's relatively easy to let users efficiently check and correct alternative text attributes in Figure tags. It's much harder to produce a graphical user interface (GUI) that lets users easily and reliably change the set of content enclosed within Figure tags.
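The easier of those two tasks, finding Figure tags that need attention, might look like the sketch below. It assumes the structure tree has already been read into memory; `TagNode` is a hypothetical stand-in, not a real library class, and treating either alternative text (`Alt`) or replacement text (`ActualText`) as sufficient is an assumption of this example. Software can only detect that the text is absent; whether existing text is adequate remains a human judgement.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TagNode:
    """Hypothetical in-memory view of one structure element."""
    tag: str
    alt: Optional[str] = None          # value of the Alt entry, if any
    actual_text: Optional[str] = None  # value of the ActualText entry, if any
    kids: List["TagNode"] = field(default_factory=list)


def figures_missing_text(root: TagNode) -> List[str]:
    """Return the tree path of every Figure lacking Alt and ActualText."""
    missing = []

    def walk(node: TagNode, path: str) -> None:
        here = f"{path}/{node.tag}"
        has_text = (node.alt or "").strip() or (node.actual_text or "").strip()
        if node.tag == "Figure" and not has_text:
            missing.append(here)
        for kid in node.kids:
            walk(kid, here)

    walk(root, "")
    return missing


doc = TagNode("Document", kids=[
    TagNode("P"),
    TagNode("Figure", alt="Quarterly sales chart"),
    TagNode("Sect", kids=[TagNode("Figure")]),
])
print(figures_missing_text(doc))
```

A correction tool could present the returned paths as a work list, showing each figure next to a text field so the user can supply or revise the description.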
Perhaps the most challenging task in the world of accessible PDF would be that of bringing untagged PDF files into conformance with PDF/UA. For such cases, human validation of logical reading order and valid structure type selection is difficult to avoid. Here, the Matterhorn Protocol provides both a means of verifying conformance and a way to document the human effort required to achieve it.
When PDF viewing implementations can rely on files validated by the Matterhorn Protocol, end-users may be assured of a high-quality result in applications that use Tagged PDF. These include, besides the obvious case of Assistive Technology (AT), mobile devices, context extraction, search engines, business intelligence systems and other applications utilizing semantic information.
Designed by and for developers, the Matterhorn Protocol is a practical guide to achieving PDF/UA conformance you can start implementing today. Download the Matterhorn Protocol now.