What is PDF/A, what does it have to offer, and how can I best use it? Answers to these and other reader questions dealing with long-term archiving and PDF/A are provided on this page. Do you have a question that is not yet addressed here? Just send us an email using the red Ask Us About button in the right-hand column, and you'll receive an answer from the experts at the PDF/A Competence Center.
This is a primary goal of PDF/A. Digital documents should remain in electronic format, giving the user a wide range of additional features, such as full-text searching instead of manually looking through paper dossiers or file cards.
PDF/A is a subset of PDF that eliminates certain risks threatening the one-to-one future reproducibility of the content. PDF/A forbids dynamic content to ensure that the user sees the exact same content both today and for years to come. Everything that is required to render the document the exact same way, every time, is contained in the PDF/A file: fonts, colour profiles, images etc. PDF/A is also an ISO standard, guaranteeing that future software generations will know how to open and render PDF/A files.
Open Document Format is a file format for office use, based on XML. ODF has some of the good properties also found in PDF/A: the specification is publicly available, and it is an international standard (ISO/IEC 26300:2006). However, ODF is not self-contained and is (currently) nowhere near as widely distributed as PDF.
PDF/A fares just as well as any other file format. More commonly, the data medium itself (CD-ROM, hard disk, etc.) is the reason that files cannot be read, and not the file format.
Yes. You might however have to take into consideration certain preparatory steps, for example ensuring that annotations are also carried over into the PDF/A file.
Yes. PDF/A gives you the possibility to save various different metadata (for example the copyright) in the document. Extensible Metadata Platform (XMP), a technology that unifies different metadata methods, is used for the metadata in a PDF/A file.
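As a rough illustration of what such XMP metadata looks like, here is a minimal Python sketch that reads the PDF/A identification and a copyright entry from an XMP packet. The sample packet, the function name `read_pdfa_claim` and the "Example Corp." value are made up for illustration; the namespaces are the standard RDF, Dublin Core and PDF/A identification namespaces. It only reads what the packet claims and is not a PDF/A validator.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFAID = "http://www.aiim.org/pdfa/ns/id/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical XMP packet, as it might be embedded in a PDF/A-1b file.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <pdfaid:part>1</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
   <dc:rights>
    <rdf:Alt><rdf:li xml:lang="x-default">(c) Example Corp.</rdf:li></rdf:Alt>
   </dc:rights>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_pdfa_claim(xmp_text):
    """Return the PDF/A part, conformance level and rights text found in an XMP packet."""
    root = ET.fromstring(xmp_text)
    result = {"part": None, "conformance": None, "rights": None}
    for desc in root.iter(f"{{{RDF}}}Description"):
        for key in ("part", "conformance"):
            # XMP allows simple properties as attributes or as child elements.
            value = desc.get(f"{{{PDFAID}}}{key}")
            if value is None:
                child = desc.find(f"{{{PDFAID}}}{key}")
                value = child.text if child is not None else None
            if value is not None:
                result[key] = value.strip()
        li = desc.find(f"{{{DC}}}rights/{{{RDF}}}Alt/{{{RDF}}}li")
        if li is not None and li.text:
            result["rights"] = li.text.strip()
    return result


print(read_pdfa_claim(SAMPLE_XMP))
```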
The PDF/A and PDF/X standards were created by the ISO so that they are largely compatible with each other; that is, a single PDF file can conform to both PDF/A and PDF/X.
The most important differences are:
PDF/A does not require the following aspects that are mandatory in PDF/X, or at least are very prevalent:
PDF/A has additional requirements for metadata not found in PDF/X-1a and PDF/X-3:
PDF/X-1 and PDF/X-3, on the other hand, forbid certain elements that are permitted in PDF/A:
In addition, PDF/A does not require that fonts for invisible text be embedded (invisible text is commonly used with scanned pages that have OCR-recognized text placed over them). PDF/X requires all fonts to be embedded, even for invisible text.
There is another important difference with respect to the so-called OutputIntent:
If you have a PDF/A file and want to make it PDF/X (without losing the PDF/A characteristics), it could be difficult in the following cases:
In all other cases it should work quite easily.
The other way around is usually simple. As a rule, every PDF/X-1a or PDF/X-3 file can be saved as PDF/A-1b.
This can be tested with Acrobat Professional 8. The Preflight function contained therein offers different possibilities, e.g. save a PDF/A as PDF/X, or save a PDF file as PDF/A and PDF/X in one step. The repair feature in Preflight can also help you perform some corrections to the files.
The ISO standard requires that future PDF viewing applications must be backward compatible, so that they are capable of correctly displaying older versions of PDF/A.
If a PDF/A file is created from a digital text document, the text will automatically be recognized. Normally you don't have to worry about a PDF/A file being text searchable, unless the file was created from a scanned paper document or image. But even then, OCR can be used to make it searchable. In this case, however, only PDF/A-1b is possible and not the more stringent PDF/A-1a.
Since PDF/A-1a places especially strict requirements on the fonts used and on the file structure, this level of conformance is recommended over PDF/A-1b for exact text searchability, text extraction, and reuse of content. PDF/A-1b supports text search but may not find every instance of a text if there are anomalies in the character encoding.
It should be ensured that the links in a document are still legible, even if they don't point to an active destination.
PDF files with dynamic objects like audio and video cannot be converted to PDF/A. PDF/A must guarantee an exact reproducibility, which is not possible with embedded objects like sound or movies. These types of objects usually require an external player (and quite often in a specific version). There is no guarantee that the player application will be available in the future.
A PDF/A file might be marginally larger than the original PDF file it was created from (provided they don't use different image resolutions or compression methods). Fonts are embedded in a PDF/A file (which is often also the case in normal PDF files) and more information is stored in the metadata. Some colour profiles could, in certain cases, lead to a much larger file size, but this is rare and is highly dependent on the particular case.
CAD drawings (Computer Aided Design) are electronic technical drawings that can be produced by several different software applications, each using its own proprietary file format. The long-term archiving of CAD documents can cause difficulties due to the native electronic formats. PDF/A is therefore very suitable for archiving CAD drawings, unless the drawings contain 3-D objects (which are not permitted in PDF/A). In this case, a possible solution is the PDF/E standard (E for Engineering).
Yes. ZIP file compression is permitted, and images can be compressed using JPEG compression. LZW compression is, however, forbidden.
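In PDF syntax, ZIP compression appears as the filter name FlateDecode, JPEG as DCTDecode and LZW as LZWDecode. The following Python sketch scans the raw bytes of a file for the forbidden filter name. It is a crude heuristic under the assumption that filter names appear in plain text in the file; the function name `find_forbidden_filters` is hypothetical, and a real PDF/A validator parses the object structure instead.

```python
import re

# Filter names that this answer describes as forbidden in PDF/A.
FORBIDDEN_FILTERS = {b"LZWDecode"}


def find_forbidden_filters(pdf_bytes):
    """Return the forbidden filter names that occur anywhere in the raw file bytes."""
    names = set(re.findall(rb"/([A-Za-z0-9]+Decode)\b", pdf_bytes))
    return sorted(name.decode("ascii") for name in names & FORBIDDEN_FILTERS)


# Usage with made-up fragments of PDF stream dictionaries:
ok_fragment = b"1 0 obj << /Filter /FlateDecode >> 2 0 obj << /Filter /DCTDecode >>"
bad_fragment = b"3 0 obj << /Filter [/ASCII85Decode /LZWDecode] >>"
print(find_forbidden_filters(ok_fragment))
print(find_forbidden_filters(bad_fragment))
```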
No. Encryption is not permitted in PDF/A files. If, for example, a file requires a password to open it, then either a person who knows the password or a digital key must be available. The content of a PDF/A file must however always be accessible, which is why encryption is not allowed. A possible way to protect sensitive data in a PDF/A file is to put access constraints on the storage medium where the file is located (e.g. password access to the folder).
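An encrypted PDF file declares its encryption through an /Encrypt entry in the trailer dictionary. The Python sketch below looks for that entry after the last "trailer" keyword. It assumes a classic cross-reference table, as used by PDF 1.4 files; the function name `looks_encrypted` and the sample bytes are hypothetical, and the check is a quick heuristic rather than a full parse.

```python
import re


def looks_encrypted(pdf_bytes):
    """Heuristically report whether the trailer dictionary contains an /Encrypt entry."""
    idx = pdf_bytes.rfind(b"trailer")
    # Fall back to scanning the whole file if no trailer keyword is found.
    trailer = pdf_bytes[idx:] if idx != -1 else pdf_bytes
    return re.search(rb"/Encrypt\b", trailer) is not None


# Usage with made-up minimal file tails:
plain = b"%PDF-1.4\ntrailer\n<< /Size 5 /Root 1 0 R >>\nstartxref\n0\n%%EOF"
locked = b"%PDF-1.4\ntrailer\n<< /Size 7 /Root 1 0 R /Encrypt 6 0 R >>\nstartxref\n0\n%%EOF"
print(looks_encrypted(plain), looks_encrypted(locked))
```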
This is quite possible. PDF/A is based on PDF 1.4. If the PDF file is using features that were first introduced with PDF 1.5, 1.6 or 1.7, these could be lost with the conversion to PDF/A.
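A quick first indication is the version stated in the file header. The Python sketch below reads the "%PDF-x.y" header and flags files that declare a version later than 1.4. The function names are hypothetical, and the result is only a hint: a later version number does not prove that newer features are actually used, and the header version can be overridden by a Version entry in the document catalog, which this sketch does not read.

```python
import re


def header_version(pdf_bytes):
    """Return the (major, minor) version from the %PDF-x.y header, or None if absent."""
    match = re.search(rb"%PDF-(\d)\.(\d)", pdf_bytes[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def may_lose_features_in_pdfa1(pdf_bytes):
    """True if the header declares a PDF version newer than 1.4, the basis of PDF/A-1."""
    version = header_version(pdf_bytes)
    return version is not None and version > (1, 4)


print(may_lose_features_in_pdfa1(b"%PDF-1.6\n"))
print(may_lose_features_in_pdfa1(b"%PDF-1.3\n"))
```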
No. Bookmarks are permitted in PDF/A.
Accessibility and PDF/A go hand-in-hand. Since both PDF/A-1a and accessible PDF files have special requirements for the file structure and on the fonts used, it is relatively unproblematic to create accessible PDF files that also conform to the PDF/A standard.
Most forms of comments (annotations and notes) are permitted. There are however certain requirements: for example, the comments must be visible, and they cannot be of the type sound or attached file. Some types of comments that were introduced after PDF 1.4 (the basis for PDF/A) are not permitted, for example those created with the Polygon tool.
We use a dc:author container bag, but in Acrobat 8's preflight that throws an error: "The Author field in the document's Info dictionary does not match the Author entry in the document's XMP Metadata." Using a single author does not generate the error.
The issue you are running into is due to a clash between the old document info metadata approach in PDF and the more modern XMP-based approach. Both can be used in a PDF, but when they are used together in a PDF/A file they must meet certain requirements of the PDF/A standard.
For the author entry the
Now while the PDF/A standard intended to focus on XMP
How can you do that with a one-dimensional string field on one side and an unsorted list of entries on the other side? You can only achieve something reasonable by downgrading the list to something that can be mapped one-to-one to the simple string: only allow one entry in the list. That's what the PDF/A standard does.
This explains the error you are seeing.
The solution is simple (though admittedly not always easy to achieve): once the author entry is no longer present in the document info (leaving only the entry or entries in dc:creator in the XMP metadata), you are free to use the entry as a bag with one or several author names.
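The rule described in this answer can be sketched in a few lines of Python. The function name `author_metadata_consistent` and the sample names are hypothetical, and the sketch mirrors only the rule as explained here (a single matching entry when the document info has an author, any number of entries when it does not); it is not how Acrobat's preflight or any other validator is implemented.

```python
def author_metadata_consistent(info_author, xmp_creators):
    """Check the author rule described above.

    info_author: the Author string from the document info, or None if absent.
    xmp_creators: the list of names stored in dc:creator in the XMP metadata.
    """
    if info_author is None:
        # No document info author: the XMP list may hold one or several names.
        return True
    # Otherwise the list must map one-to-one onto the single string.
    return len(xmp_creators) == 1 and xmp_creators[0] == info_author


print(author_metadata_consistent("Jane Doe", ["Jane Doe"]))
print(author_metadata_consistent("Jane Doe", ["Jane Doe", "John Roe"]))
print(author_metadata_consistent(None, ["Jane Doe", "John Roe"]))
```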
Acrobat 8/9/X Professional comes with its own OCR software that can be used to convert scanned pages into searchable text.
Adobe Reader 8 (as well as 9 and X) has a special mode for displaying PDF/A files compliantly. There are also a number of 3rd party products that support PDF/A compliant viewing. The user has a choice of products.
No, there are some free PDF viewers available for different operating systems.
Yes, PDF files can be converted to PDF/A. Depending on the original PDF file, it might be that not all features can be transferred to PDF/A. For example, PDF/A is based on PDF 1.4. There are features in newer PDF versions (like transparency and layers) that were not (fully) introduced with PDF 1.4 and are therefore not supported by PDF/A. In such a case, the transparency has to be removed and the layers flattened in order to create a PDF/A-1 document. The next version of PDF/A, PDF/A-2, is based on the PDF specification 1.7 and will allow a lot of the newer features.
Both Acrobat and other software programs can create PDF/A files with batch processing. For example, several files or an entire folder can be processed. There are different solutions available on the market that support high volume, automated processing, suitable for businesses and agencies.
Usually a program that creates PDF/A files will give a warning if a font file cannot be embedded. The problem is not very prevalent with standard western fonts, since most font developers allow their fonts to be embedded in files. With special fonts (for example Japanese fonts) there could be difficulties, since a lot of the fonts are copyrighted and cannot be copied.
There are a number of products that support PDF/A-1a, including Adobe Acrobat 8/9/X Professional as well as products from callas software, Compart AG, PDFlib and PDF Tools AG. The list of applications that can create PDF/A-1a is constantly growing.
No. It is not possible to create PDF/A with Adobe Acrobat 6. The Adobe Acrobat 8 Professional version is the first version of Acrobat that fully supports PDF/A.
PDF is not the same as PDF/A. You can however also create PDF/A with Microsoft Office 2007. Look closely in the properties. PDF/A is recommended over normal PDF files. By the way, the PDF conversion is only possible if you download a separately available plug-in (Save-As-PDF) from the Microsoft website.
There are solutions available on the market that are geared towards converting large volumes of files (from different formats) into PDF/A. In order to estimate the amount of time you'll need, you have to take into account the original file format, homogeneity, how the files are stored, as well as several other factors.
There are a number of possible ways to create PDF/A files, depending on where the original information is coming from:
It is correct that OpenType fonts cannot be embedded in PDF/A files. However, OpenType fonts are often converted to another type (TrueType or Type1) when they are embedded, so creating a PDF/A file should usually not be a problem. If Acrobat 8 recognized the file that you converted to PDF/A as being compliant, then a conversion of the font type probably took place. You can verify the font type of embedded fonts using the Preflight verification report (Check box: Show detailed information about document, and then look under Fonts).
When a PDF/A file is created, the program ensures that the fonts are embedded. If the fonts are not embedded, you don't have a valid PDF/A file. You can verify whether fonts (and which ones) are embedded in any PDF file by checking under Properties in Acrobat and the Adobe Reader. In addition, PDF/A validation tools will inform you if fonts have or have not been embedded, and whether the files are therefore PDF/A conforming or not.
There are a few tools on the market that will do this. In addition to Adobe Acrobat Professional 8, there are the pdfaPilot and pdfInspektor4 from callas software, the 3-Heights PDF Validator from PDF Tools, the LuraDocument PDF Validator from LuraTech, PDF/A Live! from intarsys, the PDF/A Longlife Suite from Seal Systems and the PDF Appraiser from Apago.
Yes. It is permitted to digitally sign PDF/A files. There are a number of tools, strategies and software solutions available on the market for signing PDF/A files electronically. Even Acrobat Professional can be used to digitally sign PDF/A files.
The ISO Technical Committee 171, SC2 has also prepared and published a document containing Frequently Asked Questions (FAQs) about PDF/A.
This FAQ may be freely distributed and/or translated in its entirety. The current authoritative version of this FAQ is maintained at both NPES (www.npes.org) and AIIM (www.aiim.org – see especially the section managed by the AIIM PDF/Archive committee).