It’s a question that vexes vendors of web-based solutions everywhere: why do people still insist on PDF files? And why does PDF’s mindshare keep going up?
“PDF is such antediluvian technology!” they say. “It’s pre-web, are you kidding me? It’s so old-fashioned! Let’s talk about HTML5!”
I’ve spent 21 years in PDF technology, but I’ve wondered the same thing many times. If you’d asked me in 2007 whether I thought PDF’s footprint would grow even larger by 2017, I would have guessed “no”. I’d have been wrong.
PDF technology combines a set of capabilities that together make PDF unique. Of course, other technologies combine capabilities too, but PDF is different because it brings all of its capabilities together in a single, self-contained file format: a transactable object that's generally independent of bandwidth, servers, CSS, fonts, reader software and every other sort of dependency.
As it turns out, this specific feature is priceless, even though we have no juicy term to describe it. “Self-contained” sounds very dry.
Why does self-contained matter? There are, literally, tens of millions of examples, but one such is your monthly bank statement. You and your bank are very happy with current account information in HTML. Your monthly statement, however? The final and referenceable recording of your transactions is still a PDF. Because a PDF is a record; a web-page is just… a magic moment that happens between client and server for some small period of time.
HTML is an experience; PDF is a document.
The fact that PDF so ably serves as a document goes beyond PDF’s role as a record. PDF delivers something else: the lowest common denominator of communication (the printable page) with unmatched fidelity to the author’s intent. In the industry, we call that feature portability.
Let’s unpack that.
The choice confronting those with documents they need to share with others boils down to three basic options:
The case against (1) exchange of source documents is fairly straightforward. Some users may not have access to software that successfully opens and reads such files. Source files usually include dependencies of various sorts (fonts are a typical example), making it harder to count on effectively sharing source documents beyond those users who are known to be suitably equipped. Additionally, sharing source files may expose document history or other private information, original art files, and more. In practice, the ratio of PDF to Word files posted online is about 10:1.
Given the ubiquity of web-based resources (2), many wonder why paginated information persists. After all, they reason, HTML is super-flexible, and easy to manage in detail. What they miss is that for web-based content, nothing's really delivered; there is just whatever's happening between server and client at that moment. Who trusts a browser session like they trust a PDF? No-one.
C'mon, they say. HTML is going to replace PDF once the millennials are making the decisions, right?
Not so fast. The need for permanence and portability is baked into the concept of a “document”; these attributes just aren’t in HTML’s skillset, much as complete flexibility in representation is not in PDF’s skillset. HTML files are as close as the Save menu-item in every browser, but even millennials don’t keep or share HTML files (3) as documents. It’s just too flaky.
The persistent reality is that users share view-only links to their Google Docs all the time, and nonetheless keep making and sending more PDF files (4). Viewing a document on some remote server, after all, is not the same thing as distributing a final-form document. People trust cloud services, but when it comes to documenting something, they like to be sure as well.
When you have a PDF file, you know what you’ve got. Barring the very occasional corrupt file, there are no surprises. As it happens, tolerance for surprises in electronic document formats is extremely low.
Want to learn about how much PDF technology can do beyond the printable page? Come to PDF Day, January 29, 2018 in Washington DC!