Metadata is data about data. It describes the content, character, provisos, and other characteristics of data. The term metadata was coined in the 1960s by Jack Myers.
Metadata helps us to understand the world around us. Without it, we are lost. Metadata also exists outside the computing world, on plans and maps, for example.
There are many reasons for using metadata. Companies can benefit from using it when they set up and maintain their databases. If time is spent on the intelligent management of document data at the document creation stage, this initial extra effort is sure to pay off in the long term.
For example, the accounting center can access this metadata later on and the transmission of data may be easier. Search processes can also be significantly improved if uniform concepts are used.
Metadata helps to provide an optimum overview of data and ideal data handling in many areas. This includes the following examples:
And many, many more…
Metadata minimizes the work effort required for many tasks in companies and authorities. Employees can find the data records they require more quickly and do not have to look for additional information on certain data elsewhere. With regard to personnel fluctuations, metadata prevents the loss of information that might otherwise be mislaid. Metadata also provides many automation possibilities that could not be realized without it. Further time savings result from the fact that metadata avoids data record duplicates within an organization.
Correct metadata simplifies the entry of transactions at the accounting office. Another plus point is that a greater number of people can utilize the data. This saves time because data record duplicates at different branches can be avoided.
How is metadata organized in PDF? In Adobe Acrobat, you can view and change metadata in the Document Properties dialog box.
The info dictionary has been a part of PDF since PDF version 1.0. This area belongs to the document itself and contains a collection of name/value pairs. Predefined pairs include Title, Author, Subject, Keywords, and others. You can also add your own values.
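As a rough illustration of these name/value pairs, the following sketch serializes a small set of entries in PDF dictionary syntax. It assumes ASCII-only values and keys that are valid PDF names; the helper names and the `Department` entry are hypothetical, chosen only to show a custom value next to the predefined ones.

```python
def pdf_literal_string(text: str) -> str:
    """Wrap ASCII text as a PDF literal string, escaping backslash and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def build_info_dictionary(entries: dict[str, str]) -> str:
    """Serialize name/value pairs as a PDF dictionary (sketch: ASCII values only)."""
    pairs = " ".join(
        f"/{key} {pdf_literal_string(value)}" for key, value in entries.items()
    )
    return f"<< {pairs} >>"


info = build_info_dictionary({
    "Title": "Annual Report (draft)",
    "Author": "J. Doe",
    "Department": "Accounting",  # hypothetical custom entry, not a predefined key
})
print(info)
```

A real PDF writer would also have to handle non-ASCII text and date formats, which this sketch leaves out.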
The PieceInfo dictionary was introduced with PDF 1.3 (Acrobat 4). This dictionary is either document- or page-related. The Application Private Data section is used by Adobe Photoshop and Adobe Illustrator.
XMP is a more recent development. It was introduced with PDF 1.4 (Acrobat 5). XMP is based on RDF (Resource Description Framework), a W3C standard for describing metadata (for more information, see www.w3.org/RDF). XMP can be linked with XObjects on document pages; XObjects are external objects such as images and reusable content. In addition, XMP can be linked with fonts and ICC profiles.
Object data was introduced with Acrobat 7 and is based on PDF 1.6. This area is connected with individual content elements. The name/value pairs can hold character strings, numbers, or Boolean values.
Measurement properties have also been permitted since PDF 1.6. These properties are page-related and supply information on sizes and units of measure. This allows PDF units to be linked with units from the real world (for example, 1cm = 1km).
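The scale example can be worked through in a few lines. This is only a sketch of the arithmetic, not of how measurement properties are stored in a PDF: it assumes the default PDF user space unit of 1/72 inch, and the function name and the `km_per_page_cm` parameter are hypothetical.

```python
POINTS_PER_INCH = 72.0  # default PDF user space: 1 unit = 1/72 inch
CM_PER_INCH = 2.54


def page_points_to_world_km(points: float, km_per_page_cm: float = 1.0) -> float:
    """Convert a distance measured on the page (in points) to real-world kilometres.

    km_per_page_cm is the map scale: how many kilometres one page centimetre represents.
    """
    page_cm = points / POINTS_PER_INCH * CM_PER_INCH
    return page_cm * km_per_page_cm


# A line 144 points long on a page with the scale 1 cm = 1 km
distance_km = page_points_to_world_km(144.0)
print(f"{distance_km:.2f} km")
```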
XMP (Extensible Metadata Platform) enables metadata to be integrated into all relevant Adobe applications (and in other places) using a uniform schema.
|Promotes intelligent media||Fosters re-use, re-purposing, and re-expression across domains. Promotes brand equity, intellectual capital, and other intangible assets.|
|Self-describing||Not limited to a specific schema. Every file can have metadata.|
|Open platform||Enables metadata capture, preservation, and propagation across devices, applications, file formats, and institutions.|
|Accessible||Based on industry standards (W3C). Openly available. Extends metadata beyond the context of a database.|
|Intelligent media based on XMP|
Key Elements of XMP
Framework: XML structure for storing information
Data package: How and where to store and call up information
Specification: Description of the standard and its relationship to other standards
XMP toolkit (SDK): Available free of charge, open source
Modifiable fields: User interface for user interaction with metadata
Platform: Standardized access to metadata throughout Adobe CS
XMP is based on a W3C standard. The Adobe metadata framework constitutes the first wide-scale, comprehensive, and practical use of RDF (Resource Description Framework). The Adobe XMP Platform has the following elements:
XMP framework: RDF framework for rendering metadata from multiple schemas
XMP schema: Schema for the description of properties, contained in namespaces
XMP data package technology: Method for embedding XML fragments in binary code
XMP SDK: Support for third-party solutions (interface and enhancements)
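Because the packet is embedded as plain XML between `<?xpacket begin=...?>` and `<?xpacket end=...?>` markers, a tool can locate it by scanning the bytes without understanding the surrounding file format. The sketch below assumes the packet is stored uncompressed; the sample bytes and the function name are made up for illustration.

```python
import re

# Matches the packet header, the XML payload, and the packet trailer.
PACKET_PATTERN = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packets(data: bytes) -> list[bytes]:
    """Return the XML payload of every XMP packet found in a byte string."""
    return [match.group(1).strip() for match in PACKET_PATTERN.finditer(data)]


# Made-up file content: binary data with one embedded packet.
sample = (
    b"%PDF-1.4\n\x00\x01binary data\n"
    b'<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
    b'<?xpacket end="w"?>\n%%EOF'
)

packets = find_xmp_packets(sample)
print(len(packets), packets[0][:10])
```

Scanning like this will miss packets inside compressed streams, so it is a fallback rather than a substitute for a format-aware parser.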
RDF expresses metadata as statements serialized in XML, each structured by resource, property, and value (or alternatively subject, predicate, and object). RDF schemas define vocabulary. Adobe designed the standard XMP schemas. The XMP framework permits the integration of any schema that is structured in accordance with the specification. Domain-specific schemas (such as IPTC and NewsML) can also be described in XMP data packages.
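To make the subject/predicate/object structure concrete, the following sketch reads a small RDF fragment with Python's standard XML parser and lists the statements it contains. It handles only simple text-valued properties, and the sample document and its `urn:example:doc-1` identifier are invented for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Invented sample: two XMP Basic properties describing one resource.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xmp="http://ns.adobe.com/xap/1.0/">
  <rdf:Description rdf:about="urn:example:doc-1">
    <xmp:CreatorTool>Example Writer 1.0</xmp:CreatorTool>
    <xmp:CreateDate>2019-01-15T09:30:00Z</xmp:CreateDate>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml: str) -> list[tuple[str, str, str]]:
    """Extract (subject, predicate, object) for simple text-valued properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about", "")
        for prop in description:
            # Skip structured values (arrays, nested descriptions) in this sketch.
            if len(prop) == 0 and prop.text:
                triples.append((subject, prop.tag, prop.text.strip()))
    return triples


for triple in simple_triples(SAMPLE):
    print(triple)
```

The predicate appears as `{namespace}name`, which shows that a property is identified by its namespace URI rather than by its prefix.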
An example of an XMP Schema for Video.
|Property||Value type||Category||Description|
|xmp:CreateDate||Date||Internal||The date and time the resource was originally created.|
|xmp:CreatorTool||AgentName||Internal||The name of the first known tool used to create the resource. If history is present in the metadata, this value should be equivalent to that of xmpMM:History's softwareAgent property.|
|xmp:MetadataDate||Date||Internal||The date and time that any metadata for this resource was last changed. It should be the same as or more recent than xmp:ModifyDate.|
|xmp:ModifyDate||Date||Internal||The date and time the resource was last modified. NOTE: The value of this property is not necessarily the same as the file's system modification date because it is set before the file is saved.|
The XMP Basic schema contains properties that provide basic descriptive information.
|Property||Value type||Category||Description|
|pdf:PDFVersion||Text||Internal||The PDF file version (for example: 1.0, 1.3, and so on).|
|pdf:Producer||AgentName||Internal||The name of the tool that created the PDF document.|
The Adobe PDF schema provides a set of properties used with Adobe PDF documents.
The schema namespace URI is http://ns.adobe.com/pdf/1.3/
The preferred schema namespace prefix is pdf.
The Dublin Core metadata element set (also known simply as Dublin Core) is a vocabulary of fifteen properties for describing resources. Dublin Core is part of a larger set of metadata vocabularies and technical specifications maintained by the Dublin Core Metadata Initiative (DCMI).
The complete set of vocabularies, DCMI Metadata Terms [DCMI-TERMS], also contains a set of resource classes, the DCMI Type Vocabulary [DCMI-TYPE].
The terms in the DCMI vocabularies are intended to be used in conjunction with other, compatible vocabularies and in combination with application profiles, on the basis of the DCMI Abstract Model [DCAM].
The name "Dublin" comes from the vocabulary's origin at a 1995 invitational workshop in Dublin, Ohio; "core" because its elements are broad and generic, usable for describing a wide range of resources.
|Property||Value type||Category||Description|
|dc:contributor||bag ProperName||External||Contributors to the resource (other than the authors).|
|dc:coverage||Text||External||The extent or scope of the resource.|
|dc:creator||seq ProperName||External||The authors of the resource (listed in order of precedence, if significant).|
|dc:date||seq Date||External||Date(s) that something interesting happened to the resource.|
|dc:description||Lang Alt||External||A textual description of the content of the resource. Multiple values may be present for different languages.|
|dc:format||MIMEType||Internal||The file format used when saving the resource. Tools and applications should set this property to the save format of the data. It may include appropriate qualifiers.|
|dc:identifier||Text||External||Unique identifier of the resource.|
|dc:language||bag Locale||Internal||An unordered array specifying the languages used in the resource.|
|dc:relation||bag Text||External||Relationships to other documents.|
|dc:rights||Lang Alt||External||Informal rights statement, selected by language.|
|dc:source||Text||External||Unique identifier of the work from which this resource was derived.|
|dc:subject||bag Text||External||An unordered array of descriptive phrases or keywords that specify the topic of the content of the resource.|
|dc:title||Lang Alt||External||The title of the document, or the name given to the resource. Typically, it will be a name by which the resource is formally known.|
|dc:type||bag open Choice||External||A document type; for example, novel, poem, or working paper.|
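The value types in this table map onto RDF containers: Lang Alt becomes an `rdf:Alt` whose items carry an `xml:lang` attribute, seq becomes `rdf:Seq`, and bag becomes `rdf:Bag`. The sketch below builds three such properties with Python's standard XML library. The helper names and sample values are hypothetical, and the output is a bare `rdf:Description` rather than a complete XMP packet.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def add_array(parent, prop, container, values, lang=None):
    """Add dc:<prop> holding an rdf:Alt, rdf:Seq, or rdf:Bag with one rdf:li per value."""
    prop_el = ET.SubElement(parent, f"{{{DC}}}{prop}")
    box = ET.SubElement(prop_el, f"{{{RDF}}}{container}")
    for value in values:
        item = ET.SubElement(box, f"{{{RDF}}}li")
        item.text = value
        if lang:
            item.set(XML_LANG, lang)
    return prop_el


def build_description(title, creators, keywords):
    """Build an rdf:Description with dc:title, dc:creator, and dc:subject."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    desc = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})
    add_array(desc, "title", "Alt", [title], lang="x-default")  # Lang Alt
    add_array(desc, "creator", "Seq", creators)                  # ordered authors
    add_array(desc, "subject", "Bag", keywords)                  # unordered keywords
    return desc


desc = build_description("Annual Report", ["A. Author", "B. Author"], ["finance", "2019"])
print(ET.tostring(desc, encoding="unicode"))
```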
The table shows how entries and properties from the DocInfo and XMP areas relate to each other and can be translated.
|Document information dictionary||XMP|
|Entry||PDF type||Property||XMP type|
|Title||text string||dc:title||Lang Alt|
|Author||text string||dc:creator||seq ProperName|
|Subject||text string||dc:description[x-default]||Lang Alt|
* Supported by Acrobat & PDF
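The translation can be pictured as a lookup table. This sketch covers only the three entries listed above and represents XMP values with plain Python structures. `dc:description` is treated as a Lang Alt, matching its entry in the Dublin Core table. The function and variable names are hypothetical.

```python
# Document information entry -> (XMP property, XMP value type)
INFO_TO_XMP = {
    "Title": ("dc:title", "Lang Alt"),
    "Author": ("dc:creator", "seq ProperName"),
    "Subject": ("dc:description", "Lang Alt"),
}


def translate_info(info: dict[str, str]) -> dict[str, object]:
    """Translate document information entries into XMP-style values (sketch)."""
    xmp: dict[str, object] = {}
    for key, value in info.items():
        if key not in INFO_TO_XMP:
            continue  # entries outside the table above are ignored here
        prop, xmp_type = INFO_TO_XMP[key]
        if xmp_type == "Lang Alt":
            xmp[prop] = {"x-default": value}  # language alternative, default item
        elif xmp_type.startswith("seq"):
            xmp[prop] = [value]  # ordered array with a single author
        else:
            xmp[prop] = value
    return xmp


print(translate_info({"Title": "Annual Report", "Author": "J. Doe", "Subject": "Finance"}))
```

Note that a text string in the document information dictionary becomes a structured value in XMP, so a tool that keeps both areas in sync has to convert in both directions.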
The schemas defined in this document are core schemas that are believed to be applicable to a wide variety of needs. If possible, it is always desirable to use properties from existing schemas. However, XMP was designed to be easily extensible by the addition of custom schemas. If your metadata needs are not already covered by the core schemas, you can define and use your own schemas.
If you are considering creating a new namespace, observe the following:
Avoid including properties that have the same semantics as properties in existing namespaces.
If your properties might be useful to others, try to collaborate in creating a common namespace, to avoid having a multitude of incompatible ones.
To define a new schema, you should write a human-readable schema specification document. The specification document should be made available to any developers who need to write code that understands your metadata.
NOTE: Future versions of XMP might include support for machine-readable schema specifications, but such support will always be in addition to the requirement for human-readable schema specification documents.
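In practice, a custom schema amounts to a namespace URI, a preferred prefix, and a set of property names. The sketch below shows how such properties might be serialized. The `acme` prefix, the namespace URI, and the property names are all hypothetical placeholders for whatever your own schema specification defines.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ACME = "http://ns.example.com/acme/1.0/"  # hypothetical namespace URI


def build_custom_description(properties: dict[str, str]) -> str:
    """Serialize simple text properties from a custom namespace (sketch)."""
    ET.register_namespace("acme", ACME)  # hypothetical preferred prefix
    desc = ET.Element(f"{{{RDF}}}Description", {f"{{{RDF}}}about": ""})
    for name, value in properties.items():
        ET.SubElement(desc, f"{{{ACME}}}{name}").text = value
    return ET.tostring(desc, encoding="unicode")


xml_text = build_custom_description({"InvoiceNumber": "2019-0042", "CostCenter": "4711"})
print(xml_text)
```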
There is a range of solutions that can be used to automatically or manually add metadata to PDF. This overview shows some of the possibilities:
XMP Toolkit 4.0 Labs