“MIME type” is a term most often associated with email, the web, and other forms of electronic communication. MIME is the acronym for Multipurpose Internet Mail Extensions and was originally defined many years ago to support interoperable email that contained more than just plain text.
MIME types, or more correctly MIME media types, describe what a data payload represents and how it is encoded, enabling machines to share data more easily and reliably. A MIME media type comprises a main type and a subtype separated by a slash (/), with common main types including application, audio, video, model, and text, and common subtypes such as xml, html, jpeg, and, of course, pdf. Examples include text/plain and application/pdf.
MIME type registration, standardization and publication are the responsibility of the Internet Assigned Numbers Authority (IANA), which maintains a comprehensive list of all registered MIME media types and related documentation.
MIME type registration requires an in-depth understanding of a format, its security and interoperability concerns, and other technical details well suited to the PDF Association and ISO committees responsible for file formats. The Internet Engineering Steering Group (IESG) authorizes organizations wishing to register MIME media types under the standards tree. In a recent meeting, the IESG approved both the PDF Association and ISO TC 171 SC 2 as appropriate bodies for registering MIME media types.
Generic MIME media types such as text/plain and application/octet-stream allow arbitrary text and binary data to be used; however, their overuse fails to adequately describe the data and can result in poor user experience, lack of interoperability, or incorrect behavior.
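To make the main type / subtype structure concrete, here is a minimal Python sketch that splits a media type string and flags the two generic types named above. The function names split_media_type and is_generic and the GENERIC_TYPES set are hypothetical and exist only for this illustration; the sketch does not attempt full RFC-level validation.

```python
# Illustrative only: names and the GENERIC_TYPES set are assumptions for this sketch.
GENERIC_TYPES = {"text/plain", "application/octet-stream"}


def split_media_type(media_type: str) -> tuple[str, str]:
    """Split 'main/sub' into (main, sub), lower-cased; any ';parameters' are dropped."""
    essence = media_type.split(";", 1)[0].strip().lower()
    main, sep, sub = essence.partition("/")
    if not sep or not main or not sub or "/" in sub:
        raise ValueError(f"not a main/subtype pair: {media_type!r}")
    return main, sub


def is_generic(media_type: str) -> bool:
    """True when the type says nothing more than 'some text' or 'some bytes'."""
    main, sub = split_media_type(media_type)
    return f"{main}/{sub}" in GENERIC_TYPES
```

A processor could use a check like is_generic to decide when it has too little information to pick a handler and should ask the user or inspect the data instead.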
The main MIME media type for PDF is application/pdf. This type is defined by RFC 8118 and is the responsibility of ISO TC 171 SC 2 WG 8. It was last updated in March 2017, prior to the original publication of PDF 2.0, so the PDF Association is proposing a minor refresh and update in the upcoming May 2022 ISO meetings.
The PDF specification identifies two additional file types to facilitate the handling of form and annotation data: FDF and XFDF. FDF is based on the COS syntax used in PDF and is defined by ISO 32000-2:2020 subclause 12.7.8, while XFDF (the XML equivalent of FDF) is defined in its own standard, ISO 19444-1:2019 Document management — XML Forms Data Format — Part 1: Use of ISO 32000-2 (XFDF 3.0). Both standards are the responsibility of ISO TC 171 SC 2 WG 8.
Previous MIME media type registrations for both FDF and XFDF used a subtype prefix “vnd” indicating they were vendor-specific formats. Since both these formats are now defined by open ISO standards, it is more appropriate to have ISO TC 171 SC 2 re-register them as application/fdf (https://www.iana.org/assignments/media-types/application/fdf) and application/xfdf (https://www.iana.org/assignments/media-types/application/xfdf) respectively, completing the set of application MIME media types needed by PDF. The previous registrations for application/vnd.fdf, application/vnd.xfdf and application/vnd.adobe.xfdf are all acknowledged as deprecated aliases in the IANA documentation.
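Software that still encounters the older vendor-tree names can map them to the current registrations. The following Python sketch shows one way to do this using only the aliases listed above; the table name DEPRECATED_ALIASES and the function canonical_media_type are hypothetical helpers for this illustration, not part of any library.

```python
# Deprecated vendor-tree aliases and their current registrations, as listed in the text.
DEPRECATED_ALIASES = {
    "application/vnd.fdf": "application/fdf",
    "application/vnd.xfdf": "application/xfdf",
    "application/vnd.adobe.xfdf": "application/xfdf",
}


def canonical_media_type(media_type: str) -> str:
    """Return the current registration for a deprecated alias, else the input lower-cased."""
    key = media_type.strip().lower()
    return DEPRECATED_ALIASES.get(key, key)
```

Normalizing on input lets the rest of an application compare against application/fdf and application/xfdf only, while still accepting files or HTTP responses labeled with the older names.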
In resolving Errata #176, the PDF Association's PDF Technical Working Group (TWG) also recently approved an update to the wording in subclause 12.7.8 of ISO 32000-2:2020 to reflect this change to application/fdf.
In addition to updating FDF and XFDF, the PDF Association has now registered model/u3d for ECMA-363 Universal 3D File Format, and ISO TC 171 SC 2, on behalf of WG 7, has registered model/prc for ISO 14739-1:2014, Document management – 3D use of Product Representation Compact (PRC) format — Part 1: PRC 10001.
ISO TC 171 SC 2 WG 7 recently finalized a PDF 2.0 extension to add STEP support to RichMedia annotations (see ISO/DTS 24064 Document management – Portable Document Format – 3D data streams conforming to the ISO 10303:242 (STEP AP242) specification). Combined with the existing MIME media type registrations for STEP (model/step, model/step+xml, model/step+zip, model/step-xml+zip) previously registered by ISO TC 184 SC 4, all 3D model formats defined for use within PDF files now have officially registered IANA MIME media types under the main model type.
As part of resolving Errata #156, the PDF TWG also clearly defined how each 3D model format should be unambiguously declared for both 3D annotations and RichMedia annotations. The resolution specifies that RichMedia annotations must use the officially registered IANA MIME media types, while 3D annotations use special first-class PDF names defined in ISO 32000.
IANA’s MIME media types help 3D PDF manufacturing workflows, CAD vendors, and archivists by enabling effective management of 3D PDF model data. Related resources on the PDF Association’s website, Wikipedia (U3D, PRC, STEP), and at the US Library of Congress format description (U3D, PRC, STEP) have also been updated.
PDF defines an Embedded File Stream dictionary in Table 44 of ISO 32000-2. Its Subtype key is required when the embedded file is used as an Associated File or as an asset of a RichMedia annotation (an errata correction), and its value must “... conform to the MIME media type names defined in Internet RFC 2046, with the provision that characters not permitted in names shall use the 2-character hexadecimal code format described in 7.3.5, "Name objects"”.
As an example, here is a PDF 2.0 fragment showing how a RichMedia annotation references an embedded file stream with a specific MIME media Subtype for U3D via a File Specification dictionary (#2F is the hexadecimal code for the “/” separator between the MIME main type and subtype):
6 0 obj % File Specification dictionary (see Table 43 in ISO 32000-2)
<<
  /UF (\357\273\277example.u3d) % UTF-8 text string
  /Desc (\357\273\277An example U3D model) % UTF-8 text string
  /EF << /F 7 0 R >>
  /AFRelationship /Source % Associated File relationship
>>
endobj
7 0 obj % Embedded file stream (see Table 44 in ISO 32000-2)
<<
  /Subtype /model#2Fu3d % must be a valid MIME media type (RFC 2046)
  /Params << >> % Embedded file parameter dictionary (see Table 45 in ISO 32000-2); entries omitted here
>>
stream
%... FLATE-compressed U3D data stream …
endstream
endobj
8 0 obj % RichMedia annotation (see Tables 166 and 333 in ISO 32000-2)
<<
  /AF [ 6 0 R ] % Associated file
  /NM (\357\273\277An example U3D RichMedia annotation) % UTF-8 text string
>>
endobj
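The #2F escaping in the Subtype value can be produced mechanically. The Python sketch below is one possible reading of the name-writing rules in 7.3.5, "Name objects": bytes outside the printable range 0x21 to 0x7E, PDF delimiter characters, and the number sign itself are written as # followed by two hexadecimal digits. The function names mime_to_pdf_name and pdf_name_to_mime are hypothetical, and the decoder assumes an ASCII-only name token such as the encoder produces.

```python
import re

# PDF delimiter characters; they cannot appear literally inside a name token.
_DELIMITERS = b"()<>[]{}/%"


def mime_to_pdf_name(media_type: str) -> str:
    """Write a MIME media type as a PDF name object, e.g. for an embedded file's /Subtype."""
    out = ["/"]
    for byte in media_type.encode("utf-8"):
        regular = 0x21 <= byte <= 0x7E and byte not in _DELIMITERS and byte != ord("#")
        out.append(chr(byte) if regular else f"#{byte:02X}")
    return "".join(out)


def pdf_name_to_mime(name: str) -> str:
    """Reverse the escaping: drop the leading solidus and decode each #XX sequence."""
    raw = name[1:].encode("ascii")
    decoded = re.sub(rb"#([0-9A-Fa-f]{2})", lambda m: bytes([int(m.group(1), 16)]), raw)
    return decoded.decode("utf-8")


print(mime_to_pdf_name("model/u3d"))  # /model#2Fu3d
```

Characters such as + and - are regular characters in PDF names, so a type like model/step+xml needs only its slash escaped.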
Associated Files were originally introduced in PDF/A-3 (ISO 19005-3) and later adopted into PDF 2.0 “... to provide a means to associate content in other formats with objects of a PDF file and to identify the relationship between them.” (ISO 32000-2, 14.13)
The relationship between PDF content and associated embedded data can be specified by an AFRelationship entry in the File Specification dictionary (Table 43, ISO 32000-2:2020). PDF 2.0 defines various values for AFRelationship such as Source, Alternative, FormData and Schema that allow implementers to identify a semantic relationship. PDF 2.0 states that if the MIME type is not known, the generic value application/octet-stream shall be used, effectively declaring the data as arbitrary binary data. Although the AFRelationship key allows for a much richer association between content and embedded data, simply including a descriptive MIME media type helps PDF processors improve the end-user experience for interacting with arbitrary embedded data.
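The fallback rule can be sketched in a few lines of Python. The media types in the table are those discussed in this article; the file extensions used as keys, the table name KNOWN_TYPES, and the function embedded_file_media_type are assumptions made for this illustration, and a real writer would more likely identify the format by inspecting the data than by trusting the file name.

```python
from pathlib import PurePath

# Extension-to-type table: the media types come from the article, the extensions are assumed.
KNOWN_TYPES = {
    ".pdf": "application/pdf",
    ".fdf": "application/fdf",
    ".xfdf": "application/xfdf",
    ".u3d": "model/u3d",
    ".prc": "model/prc",
}
FALLBACK = "application/octet-stream"  # generic value used when the type is not known


def embedded_file_media_type(filename: str) -> str:
    """Pick a descriptive media type for an embedded file, else the generic fallback."""
    return KNOWN_TYPES.get(PurePath(filename).suffix.lower(), FALLBACK)


print(embedded_file_media_type("example.u3d"))  # model/u3d
print(embedded_file_media_type("data.bin"))     # application/octet-stream
```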
The Associated File feature is generalizable and can be used to semantically associate any data file with any PDF object, providing a far richer document experience. The PDF Association’s PDF 2.0 Application Note 002: Associated Files provides further guidance.
The formalities of IANA registration and updates to MIME media type definitions directly benefit those utilizing advanced forms and annotations functionality in PDF, as well as industries such as manufacturing, aerospace and automotive, which rely on 3D PDF for reliable interchange and interoperability. The PDF Association wishes to express its appreciation to IANA for its support in completing this important work.
PDF, PDF-derived, and PDF-adjacent technologies are ubiquitous worldwide, and thus must interoperate with diverse applications and content management systems. The PDF Association’s responsibility therefore goes beyond a single specification; it involves ensuring that the technical ecosystem supporting the file format, including MIME media types, is current and correct. Details such as these enable interoperable workflows in which files are smoothly communicated, shared, and managed between applications developed by diverse vendors. As part of this work, the PDF Association is also proposing a refresh to RFC 8118 at the next ISO meeting in May 2022.
Peter Wyatt is the PDF Association’s CTO and an independent technology consultant with deep file format and parsing expertise. A developer and researcher, he has worked actively on PDF technologies for more than 20 years. He is Project co-Leader of ISO 32000 (the core PDF standard), co-chairs the PDF Association PDF TWG and is the PDF Association’s Principal Scientist leading …