PDFlib pCOS (PDF Information Retrieval Tool) provides a simple and elegant facility for retrieving any information from a PDF document that is not part of the page contents. For example, PDF metadata, interactive elements (links, form fields, etc.), and page dimensions can easily be queried with pCOS.
With pCOS you can extract many kinds of items and create output for different purposes. By processing multiple PDF documents with a single call, you can easily create summaries of document info entries, page formats, fonts, or any other property. Combined with tabular output, this makes pCOS a powerful PDF administration tool.
PDFlib pCOS fits many application scenarios within PDF workflows, and you can also use it as a tool for learning or debugging PDF. Here are some typical situations:
- Check incoming documents for predefined criteria
- Identify problem files in a large collection
- Create metadata summaries for document management
- Run quality assurance before publishing documents
- Support document retrieval and repository workflows
- Summarize bookmarks
- Extract components of PDF documents, e.g. ICC profiles
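As a rough illustration of two of these situations (checking documents against predefined criteria and producing a tabular summary), the following Python sketch works on hard-coded sample records rather than on real PDF files. It does not call pCOS: the record fields, the `check` and `tabular_summary` helpers, and the criteria themselves (a non-empty title, an A4 page size approximated as 595 x 842 points) are all assumptions made for this example. In a real workflow the property values would be retrieved with pCOS.

```python
# Hypothetical property records, one per document. These are made-up sample
# values; in practice they would be retrieved from the PDFs with pCOS.
documents = [
    {"filename": "invoice.pdf", "pages": 2, "width": 595.0, "height": 842.0, "title": "Invoice"},
    {"filename": "poster.pdf", "pages": 1, "width": 1684.0, "height": 2384.0, "title": ""},
]


def check(doc):
    """Return the list of (assumed) criteria the document violates."""
    problems = []
    if not doc["title"]:
        problems.append("missing title")
    # A4 approximated as 595 x 842 points for this sketch.
    if (doc["width"], doc["height"]) != (595.0, 842.0):
        problems.append("not A4")
    return problems


def tabular_summary(docs):
    """Format the records as tab-separated lines with a header row."""
    header = ["filename", "pages", "width", "height", "problems"]
    rows = ["\t".join(header)]
    for doc in docs:
        problems = ", ".join(check(doc)) or "ok"
        rows.append("\t".join([
            doc["filename"],
            str(doc["pages"]),
            str(doc["width"]),
            str(doc["height"]),
            problems,
        ]))
    return "\n".join(rows)


print(tabular_summary(documents))
```

Tab-separated output of this kind can be loaded into a spreadsheet or filtered with standard text tools to identify problem files in a larger collection.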
The pCOS retrieval interface is also included in other PDFlib GmbH products: if you use PDFlib+PDI, PDFlib Personalization Server, TET, or PLOP/PLOP DS, you already have access to the pCOS interface. If you need access to text or images on the page, use our product PDFlib TET for PDF content extraction.
The pCOS Cookbook is a collection of programming examples that demonstrate the use of pCOS for various PDF retrieval tasks. The Cookbook includes sample code, input documents, and sample output.
Producer: PDFlib GmbH