Can computers understand PDF documents as humans, or better?

Alexey Subach, technical lead at Dual Lab, will be hosting a presentation titled “Can computers understand PDF documents as humans, or better?” at PDF Days Europe 2018.

Session Description: Back in 2015, computers beat the human record in image recognition for the first time.
Artificial intelligence has seen many successes recently, including beating the world’s best Go player in 2016, recognizing phone speech better than humans in 2016, the first self-driving taxis in Phoenix in 2017, and pneumonia detection at a level exceeding that of practicing radiologists.
Still, computers are not flawless and can struggle even with relatively simple tasks, not to mention the adversarial examples that have been developed against some artificial intelligence applications.
Understanding digital documents, and PDF in particular, is a complex yet very important topic for the fields of accessibility, automation and information retrieval.
What is the state of the art, and what are the short-term goals? How can we, as PDF producers and consumers, speed up the process? What are the limitations, and how can they be overcome? Let’s try to answer these questions together.
We propose an approach that includes:

  • building a database of PDF documents suitable for training
  • preparing the tagging tree to serve as ground truth
  • evaluating the performance of the trained model
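
To make the last step concrete, here is a minimal sketch of how predicted structure tags might be scored against a ground-truth tagging tree. It is not taken from the session itself. It assumes the tagging tree has already been flattened into one label per content block, and that the blocks line up one-to-one between ground truth and prediction. The function name per_tag_scores and the sample data are hypothetical; the labels (H1, P, L, Figure) are standard PDF structure types.

```python
from collections import Counter


def per_tag_scores(truth, predicted):
    """Return {tag: (precision, recall, f1)} for two aligned label sequences.

    Hypothetical helper: assumes truth[i] and predicted[i] describe the
    same content block.
    """
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(truth, predicted):
        if t == p:
            tp[t] += 1   # block tagged correctly
        else:
            fp[p] += 1   # predicted tag was wrong for this block
            fn[t] += 1   # true tag was missed for this block

    scores = {}
    for tag in sorted(set(truth) | set(predicted)):
        predicted_total = tp[tag] + fp[tag]
        actual_total = tp[tag] + fn[tag]
        precision = tp[tag] / predicted_total if predicted_total else 0.0
        recall = tp[tag] / actual_total if actual_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[tag] = (precision, recall, f1)
    return scores


# Made-up example: six content blocks from one page.
truth = ["H1", "P", "P", "L", "Figure", "P"]
predicted = ["H1", "P", "L", "L", "P", "P"]

scores = per_tag_scores(truth, predicted)
for tag, (precision, recall, f1) in scores.items():
    print(f"{tag:7s} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Per-tag scores are used here rather than a single accuracy figure because rare structure types, such as figures or lists, can be missed entirely while overall accuracy still looks good. A fuller evaluation would also need to compare the nesting of the tree, which this flat sketch ignores.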

Presenter: Alexey Subach is a technical lead at Dual Lab, a service provider known for its expertise in PDF, graphic arts and document workflow systems. Passionate about PDF, Alexey focuses on giving users straightforward APIs for the low-level features of the specification without losing the richness and flexibility of the format. He is curious about new areas of technology and uses his mathematical and analytical background to dig into their foundations and look for promising new applications.

Check out the detailed programme:
https://www.pdfa.org/pdf-days-europe-2018-schedule-of-sessions/
Direct link for registration:
https://en.xing-events.com/pdf-days-europe-2018.html