PDF Association Newsletter: Issue 31

CONTENTS

  • FEATURE ARTICLE
    • Waiter, there’s a bug in my PDF!
  • PDF ASSOCIATION MEMBERS PRESENT THEMSELVES
    • Global Graphics
  • EVENTS
  • NEW MEMBERS

 

Martin Bailey

Dear Readers,

I spend most of my time working with the Harlequin RIP, and that hit a major milestone in 2013: 25 years of use in production print. That kind of event makes you think back over the last few years and marvel at how much everything has changed. Twenty-five years ago PostScript Level 1 was just gaining a lead against proprietary control languages and font formats. Try to imagine that situation in graphic arts today!

In 1991 I was present in San Jose when John Warnock announced what would become PDF as “editable PostScript”. Many of us didn’t really understand the goal. As it turned out, that’s because the goal didn’t seem to include production printing, but print companies assumed that anything from Adobe must be good for their needs, and persevered through the early PDF versions, struggling to fill the gaps between what was in the spec and their requirements.

With PDF 1.3 Adobe finally delivered a good foundation for print. It must have felt odd to be bludgeoned into developing a product for a market that didn’t seem to be in their original business plan. Everyone started talking about and building PDF workflows for print. CGATS in the USA also started developing a subset standard based on PDF, called PDF/X, although it took until 1999 to be first published.

I remember the hype at the time, that because “PDF didn’t need to be interpreted, unlike PostScript” it was “so robust that you wouldn’t even need to preflight”. Of course, we all know now that neither part of that statement is true, but that good tooling from a variety of vendors can streamline your workflows anyway.

And then came PDF 1.4. Suddenly PDF was complicated and difficult again because of live transparency. It seemed to take more than another decade before print buyers and print service providers were confident that transparent objects could be rendered and separated properly for print. Some haven’t yet reached that point …

In parallel with the development of PDF itself there was an explosion of interest and investment in print-related standards. Many are around colour, but there were more than ten ISO conformance levels of PDF/X, building on the initial efforts in CGATS. The success of PDF/X triggered work on other subset standards, leading to PDF/A, PDF/E, PDF/UA and PDF/VT … or so it seemed to me; I was working in various PDF/X committees from about 1996 onwards, so I may be slightly biased …

I remember being asked by a journalist, sometime around 2005, what I wanted to see in the next version of PDF. I replied that the best thing for the print industry, at least, would be a period of calm and stability with no significant new features in the format so that everyone could get caught up and comfortable with the way things worked. That surprised the journalist because they were used to everything being hyped up on new features in the latest version of both products and formats. But PDF always needed to be treated as a standard, as something that’s exchanged between thousands of different products from hundreds of different vendors.

And now, since 2008, PDF itself is an ISO standard, published as ISO 32000-1. That doesn’t mean that it’s stopped evolving, but it does mean that it’s no longer directed solely by the commercial demands of a single vendor. The involvement of many companies, either directly or through groups like the PDF Association, ensures a very high standard of review and guidance.
Long may that continue!

Martin Bailey
CTO, Global Graphics

 

FEATURE ARTICLE

Waiter, there’s a bug in my PDF!

Given the status of PDF (now 20 years old, and an ISO standard), it may be surprising to some people how many PDF files delivered in workflows for professional usage (in my case for professional print, but also in corporate and enterprise environments) don’t conform to the specification.

“If they’re broken, why don’t I see errors popping up all over the place?” I hear you cry. The answer is simple. For the vast majority of vendors and developers of tools that read PDF files it would be commercial suicide to respond to bad PDFs with an error. In my experience it tends to be the tool that reports an error that’s blamed as the ‘problem’, not whatever created the PDF file badly in the first place.

Tools intended only for on-screen viewing or relatively casual use may deliberately ignore errors or elements of a file that they don’t understand. The page displayed may be incomplete, but you’re getting as much of the meaning of the document as the tool could provide to you, without the irritation and distraction of an error message. Tools which need to produce accurate and complete renditions of the page, such as a RIP for production print, are more likely to raise an error message. But even there the vendor has probably invested a significant amount of developer time investigating errors seen in the field and providing workrounds for them.

Many of the ‘errors’ are pretty minor. A value that the PDF standard states should be an integer might be encoded as a real number with a “.0” at the end, for instance, or a PDF operator that is prohibited between BT (begin text) and ET (end text) commands is used in those places. Most PDF readers added code to ignore those problems and just do the obviously right thing years ago.
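The integer case above can be sketched in a few lines of Python. This is only an illustration of the kind of tolerance a reader might apply, not the code of any particular product; the function name `read_integer` and its return shape are assumptions made for the example.

```python
import re

# PDF integer syntax: an optional sign followed by one or more digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")
# A real number whose fractional part is empty or all zeros, e.g. "12." or "12.0".
_WHOLE_REAL = re.compile(r"([+-]?[0-9]+)\.0*")


def read_integer(token):
    """Return (value, conforming) for a token where an integer is required.

    Hypothetical helper: 'conforming' is False when the token was accepted
    only because its intended value was obvious.
    """
    if _INTEGER.fullmatch(token):
        return int(token), True
    match = _WHOLE_REAL.fullmatch(token)
    if match:
        # Not a valid integer, but the intended value is unambiguous.
        return int(match.group(1)), False
    # Anything else (e.g. "612.5") has no obviously right interpretation.
    raise ValueError(f"expected an integer, got {token!r}")


print(read_integer("612"))    # (612, True)
print(read_integer("612.0"))  # (612, False)
```

Returning the `conforming` flag alongside the value lets the caller carry on rendering while still having something to log or report.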

But there’s a constant escalation of difficulty going on. I spend a good proportion of my development resource constantly building workrounds for ever more creative ways of not following the standard. What is driving that situation?

Consider, for a moment, how a vendor of PDF creation tools tests that he’s making valid PDF files. It’s not reasonable to assume that every vendor will develop a PDF reader just so that they can use it to validate PDF files from their creation tools. If they did, many would probably have the same developers write both, so the reader would share the creation tool’s assumptions and misunderstandings instead of challenging them. The most common way to test is therefore to try to read the PDF files they make in a variety of readers from other vendors. Small companies may even short-cut this approach by doing most of their testing in Adobe Acrobat.

But those third-party reader applications aren’t designed as validation tools. In fact in most cases the vendors who produce them have spent years making sure that they will read every bad PDF file that is thrown at them as well as possible, only reporting errors for situations where no appropriate behaviour can be deduced at a suitable level of confidence.

So if the new creation tool is making bad PDF and it’s tested in a reading tool that’s been developed to accept bad PDFs, there’s no error, and the creation vendor thinks he’s making good PDFs. Until, that is, somebody tries to read a PDF file from it in another third-party tool that happens not to include a workround for that specific formatting error, or which sets a higher bar on correctness of rendering or other processing.

It’s a vicious circle. The consumer vendors are always trailing the creation vendors, and all of us are spending time on fixing up the resulting mess instead of working on the features that our customers really want.

So, my plea to all of you is:

  • If you develop reading tools, think about providing a configuration that is less accepting of bad PDF files to help in debugging poorly made PDF files.
  • If you develop creation tools, test with as wide a variety of PDF readers as you can, and use the most restrictive configuration that those tools provide to you.
  • When you find compatibility problems in PDF files, try opening communication with the vendor of the other tool(s) involved. You may be able to make long-term improvements to the quality of your tools and to your test procedures.
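As a rough illustration of the first two requests, here is a minimal Python sketch of a check with a lenient default and a stricter configuration. It assumes a content stream has already been split into a list of operator names. The function `check_text_objects`, the `strict` flag, and the `PdfSyntaxError` class are hypothetical, and the operator set is a deliberately small example (path construction operators), not a complete list of what is prohibited inside a text object.

```python
# Illustrative subset only: path construction operators, which do not
# belong between BT and ET.
PATH_OPERATORS = {"m", "l", "c", "v", "y", "h", "re"}


class PdfSyntaxError(Exception):
    """Raised in strict mode when a content stream breaks the rules (hypothetical)."""


def check_text_objects(operators, strict=False):
    """Scan operator names and report path operators used inside BT ... ET.

    Lenient mode (the default) returns a list of problem descriptions and
    keeps going; strict mode raises on the first problem found.
    """
    problems = []
    in_text = False
    for index, op in enumerate(operators):
        if op == "BT":
            in_text = True
        elif op == "ET":
            in_text = False
        elif in_text and op in PATH_OPERATORS:
            message = f"operator {op!r} at position {index} is not allowed between BT and ET"
            if strict:
                raise PdfSyntaxError(message)
            problems.append(message)
    return problems


stream = ["BT", "Tf", "re", "Tj", "ET", "re", "f"]
print(check_text_objects(stream))  # one problem reported, for the "re" at position 2
```

A creation vendor would run their output with `strict=True` during testing, while the same reader could ship to end users in its lenient default.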

Once we’ve got all that sorted, we can move on to the quality of embedded fonts. Don’t get me started on TrueType!

 

PDF ASSOCIATION MEMBERS PRESENT THEMSELVES

Global Graphics

Since 1988 Global Graphics’ Harlequin RIP has powered pre-press and digital print solutions around the world and continues to dominate in commercial and newsprint. Version 10.0 of the Harlequin RIP, launched in 2013, 25 years after Version 1.0, has been developed specifically to enable print shops to grow out into digital print or to expand into digital marketing. It is the ideal print engine to drive short run digital presses profitably alongside CtP.

Global Graphics Software’s other technologies include the Jaws RIP, used extensively in the wide format segment, and gDoc technology used by enterprise software vendors to develop productivity software applications.

Global Graphics Software’s RIPs offer outstanding performance, quality and reliability for high-volume, ultra high-volume and wide format digital printing applications. The company also has significant expertise and IP in color management, multi-level screening, imposition and trapping technologies. Its research and development team comprises international experts on Page Description Languages, document formats and color science, and the company’s patent portfolio covers many areas of printing and document technology.

Global Graphics has always taken an active role in industry standards-setting bodies and associations. Today Martin Bailey, the Chief Technology Officer, is the UK primary expert at the International Organization for Standardization (ISO) for PDF and for PDF/VT.

Global Graphics Software’s customers include leading brands such as HP, Corel, Quark, Kodak, Agfa, Wasatch and Onyx. The roots of the company go back to 1986 and to Cambridge University, and today the majority of the R&D team is still based near this university town. There are also offices near Boston, Massachusetts and in Tokyo. Global Graphics Software’s parent company, Global Graphics SE, is registered in France and listed on NYSE-Euronext (GLOG).

 

EVENTS

November 21, 2013: Webinar – Converting emails to PDF/A

In this webinar, you will learn strategies for implementing email management systems with PDF/A.

November 27, 2013: Seminar – Solving The E-Archiving Problem Using PDF/A

Join the PDF Association and Adobe as we discuss the fundamentals of the PDF/A format, teaching you the ins and outs of avoiding the e-archiving problem and making it both easier and more secure to apply PDF/A to your work.

 

NEW MEMBERS

About the PDF Association

The PDF Association is geared towards developers of PDF solutions, companies that work with PDF in document management systems (DMS) and electronic content management (ECM), interested individuals, and users who want to implement PDF technology in their organizations. Although the Association’s original members were predominantly from German-speaking countries, the PDF Association now boasts members from over 20 countries worldwide.

Contact

Association for Digital Document Standards e.V.
PDF Association

Thomas Zellmann
Neue Kantstr. 14
D-14057 Berlin

Phone: +49 30 39 40 50-0
Fax: +49 30 39 40 50-99

info@pdfa.org
http://www.pdfa.org

About PDF Association

Founded in 2006 as the PDF/A Competence Center, the PDF Association exists to promote the adoption and implementation of International Standards for PDF technology.
