A PDF Page Is a Painting

Why “reading order” in PDF is irrelevant to accessibility

This article attempts to explain the concept of “reading order” in PDF files. Why is this necessary?

  1. Users are often frustrated by inconsistent, sometimes illegible results when reading PDF files on mobile devices, searching for PDF content online, or using assistive technology (AT).
  2. Those tasked with ensuring accessibility or Section 508 compliance often focus on objects rather than logical structure, thus missing the mark.
  3. Software developers are (understandably) confused by “reading order” as presented in today’s PDF Reference (ISO 32000).

Many have come to use the term “reading order” as functionally synonymous with the logical order provided by PDF tags, but this interpretation is incorrect.

A screen-shot showing a simple example of how painting order and logical order may differ.

The PDF Paintbrush

When you create a PDF, you’re painting a picture. Your “paintbrush” is the combined effect of the software used to create the source document and the software you’ve chosen to convert your source document into PDF.

Like brushstrokes, each character, line and image is created independently, yet together they interact to produce particular visual effects. On a PDF page, objects are connected by a coordinate system and little else. There’s no logical connection between the letters comprising a word; characters are simply placed at a series of locations on the rendered page.
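
To make the brushstroke idea concrete, here is a minimal Python sketch. It is not how any real PDF library models a page: the `PaintedGlyph` class, the `place` helper, the sample words and the coordinates are all hypothetical, chosen only to illustrate that the order in which characters are painted need not match the order in which a reader sees them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaintedGlyph:
    """One character painted at a page coordinate (hypothetical model)."""
    char: str
    x: float  # distance from the left edge, in page units
    y: float  # distance from the bottom edge (PDF's default origin is bottom-left)


def place(text, x, y, advance=10.0):
    """Paint each character of `text` left to right, `advance` units apart."""
    return [PaintedGlyph(c, x + i * advance, y) for i, c in enumerate(text)]


# Assumed painting sequence: the second word of the top line is painted first,
# then a word on the line below, and only then the first word of the top line.
painted = (
    place("order", 130, 700)
    + place("below", 40, 680)
    + place("Painting", 40, 700)
)


def painting_order(glyphs):
    """Characters in the order they were painted."""
    return "".join(g.char for g in glyphs)


def visual_order(glyphs):
    """Characters sorted top to bottom, then left to right (a naive guess)."""
    ordered = sorted(glyphs, key=lambda g: (-g.y, g.x))
    return "".join(g.char for g in ordered)


print(painting_order(painted))  # orderbelowPainting
print(visual_order(painted))    # Paintingorderbelow
```

Neither result contains a space or a line break. Nothing in this model says where a word or a line ends; a consumer would have to guess from the gaps between coordinates.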

As originally designed, PDF is a system for painting on a page. There’s no innate concept of words, sentences, paragraphs, columns, headings, images, tables, lists, footnotes – any of the semantic structures that distinguish a “document” from a heap of letters, shapes and colors. PDF is fundamentally about how the document appears on the page, not how its content is structured once abstracted from the page.
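
The same point can be seen in the painting instructions themselves. The sketch below is illustrative only: the content-stream fragment is hand-written for this example, and the `describe` helper assumes one instruction per line with the operator as the last token, which real content streams do not guarantee. The operators shown (`BT`, `ET`, `Tf`, `Td`, `Tj`, `rg`, `re`, `f`) are standard PDF content-stream operators.

```python
# A hand-written, simplified content-stream fragment (an assumption for
# illustration): a large line of text, a smaller line, and a thin red rule.
content_stream = """\
BT
/F1 24 Tf
72 720 Td
(Chapter 1) Tj
ET
BT
/F1 11 Tf
72 690 Td
(It was a dark and stormy night.) Tj
ET
0.8 0.1 0.1 rg
72 680 200 1 re
f
"""

# Plain-language meaning of each operator used above.
OPERATORS = {
    "BT": "begin text object",
    "ET": "end text object",
    "Tf": "set font and size",
    "Td": "move text position",
    "Tj": "show a text string",
    "rg": "set fill color (RGB)",
    "re": "append a rectangle to the path",
    "f": "fill the path",
}


def describe(stream):
    """Return (operator, meaning) pairs, assuming one instruction per line."""
    pairs = []
    for line in stream.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        operator = tokens[-1]  # operands come first, the operator comes last
        pairs.append((operator, OPERATORS.get(operator, "unknown")))
    return pairs


for operator, meaning in describe(content_stream):
    print(f"{operator:>2}  {meaning}")
```

In this fragment the “heading” differs from the “body text” only in font size and position. No instruction says that one is a heading and the other a paragraph.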

