3-Heights™ PDF Extract is a component for reading out the contents and properties of PDF documents.
PDF documents are used to store important information relating to products, customer data and corporate knowledge. Metadata such as the document’s creator, creation date or modification date is also an integral part of a PDF document. PDF documents are often used as “containers” for transferring text, images, videos and other data to other processes, independently of the platforms in use.
This component extracts information quickly and efficiently, whether it is document content or document properties. The results can be stored in a database, for instance, or used for evaluations and statistics or to secure internal corporate knowledge.
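As an illustration of the database use case, the following minimal sketch stores already-extracted document properties and text in an SQLite database. It does not use the 3-Heights™ API: the function name `store_document`, the table layout and the sample values are assumptions made for this example, and the extraction step itself is taken as already done.

```python
import sqlite3


def store_document(conn, file_name, properties, text):
    """Store extracted properties and text for one PDF (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "file_name TEXT PRIMARY KEY, creator TEXT, "
        "created TEXT, modified TEXT, content TEXT)"
    )
    # Missing properties are stored as NULL rather than guessed.
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (?, ?, ?, ?, ?)",
        (
            file_name,
            properties.get("creator"),
            properties.get("created"),
            properties.get("modified"),
            text,
        ),
    )
    conn.commit()


if __name__ == "__main__":
    # Sample values stand in for the output of an extraction run.
    conn = sqlite3.connect(":memory:")
    store_document(
        conn,
        "invoice-001.pdf",
        {"creator": "Scanner Software", "created": "2020-01-15"},
        "Invoice No. 1234",
    )
    for row in conn.execute("SELECT file_name, creator, created FROM documents"):
        print(row)
```

Keeping the properties in separate columns makes them available for the evaluations and statistics mentioned above, for example counting documents per creator.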
Areas of Use
Incoming Mail and Document Processing:
Content from PDF files, such as forms or scanned incoming invoices, is extracted and processed for classification or indexing.
PDF documents are restructured in preparation for use by other target groups. The process reads out processing information such as barcodes, address information or page formats that can then be used for controlling printing and packaging lines or sorting processes.
Text, or parts of it, is extracted and stored separately as metadata. This allows document indexing to be extended as required.
Other Areas of Use:
- Convert PDF documents into text documents
- Extract information such as addresses, invoice data and report data from documents for process control purposes
- Extract information for document classification and document indexing
- Process data in forms
- Extract images for further processing (scans, photos, etc.)
- Analyze and evaluate the content of PDF documents in mass processing
Producer: PDF Tools AG