Datalogics
Status: Partner Member
Country: US
Sector: All industries
Contact:
Joined at: Feb 08
Website: http://www.datalogics.com/

Linked User
Maryanne Pavlin
Matt Kuznicki
Nicki Bullock
Vel Genov
Emma Kaschke
Leonard Ho

PDF Alchemist



Recover editable text from PDFs

  • Intelligently display contents of PDFs on tablets and small-screen displays
  • Reconstruct source files
  • Improve searching and indexing of PDFs within document repositories

Easily convert PDFs to HTML

Datalogics PDF Alchemist is a new C/C++ SDK for intelligently extracting text and images from PDFs and exporting them to HTML5 or EPUB. It employs sophisticated techniques to identify and reconstruct “text flows” within the PDF. These text flows are often lost in PDFs, yet they are vital for repurposing the information locked within the PDF.

Features

  • Converts columns and pages back into a single continuous text flow
  • Discards “page artifacts” such as running headers and footers
  • Output in HTML5 or in EPUB format
  • Font size and style detection
  • Text justification and indentation detection
  • Text flow margin detection
  • List detection and conversion into real HTML lists
  • Table detection and conversion into real HTML tables
  • Converts PDF bookmarks into clickable navigation links
  • Detection of internal and external URL links
  • Includes a DLL/shared library and command-line programs for integration into products and server workflows
  • Free, fully-functioning evaluation versions available
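
To make the idea of discarding “page artifacts” and rebuilding a single text flow more concrete, here is a minimal Python sketch. It is not the PDF Alchemist API and not necessarily how the SDK works internally. It assumes each page has already been extracted as a list of text lines, and the function names `strip_running_lines` and `to_single_flow` and the `min_share` parameter are hypothetical.

```python
from collections import Counter


def strip_running_lines(pages, min_share=0.6):
    """Drop first/last lines that repeat on at least min_share of pages.

    `pages` is a list of pages, each page a list of text lines.
    Hypothetical helper for illustration only.
    """
    if len(pages) < 2:
        # With fewer than two pages there is nothing to compare against.
        return [list(p) for p in pages]

    # Count how often each line appears as the first or last line of a page.
    first = Counter(p[0] for p in pages if p)
    last = Counter(p[-1] for p in pages if p)
    threshold = min_share * len(pages)
    headers = {line for line, n in first.items() if n >= threshold}
    footers = {line for line, n in last.items() if n >= threshold}

    cleaned = []
    for p in pages:
        body = list(p)
        if body and body[0] in headers:
            body = body[1:]
        if body and body[-1] in footers:
            body = body[:-1]
        cleaned.append(body)
    return cleaned


def to_single_flow(pages):
    """Join the cleaned pages into one continuous list of lines."""
    return [line for page in strip_running_lines(pages) for line in page]


if __name__ == "__main__":
    sample = [
        ["Annual Report", "a", "b", "Confidential"],
        ["Annual Report", "c", "Confidential"],
        ["Annual Report", "d", "e", "Confidential"],
    ]
    print(to_single_flow(sample))
```

This sketch matches repeated lines exactly, so it would miss footers that change from page to page, such as page numbers. A real converter would also need position and font information from the PDF.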

More information: http://www.datalogics.com/products/pdf/pdfalchemist/

Location
101 N Upper Wacker Dr, Chicago, IL 60606, USA



Related Products
Adobe PDF Library


The Adobe PDF Library SDK is a low-level PDF library that contains a powerful set of native C/C++ APIs with interfaces for .NET and Java. Systems integrators, independent software vendors (ISVs), enterprise IT developers, and others can integrate Adobe PDF functionality within custom applications in a client and/or server environment.

PDF Java Toolkit


Datalogics PDF Java Toolkit is a native Java library that provides high-level APIs for automating PDF workflows like processing PDF forms, verifying digital signatures, and extracting text. It also offers low-level APIs for working directly with the structure of the PDF for those times you need it.

Adobe PDF Converter


Adobe Normalizer is an API that allows developers to quickly and easily convert Encapsulated PostScript (EPS) and PostScript (PS) files to Adobe’s Portable Document Format (PDF). The industry-standard Adobe Distiller and Distiller Server are themselves built upon the PDF Converter SDK, and this API is now available separately to application developers.

Adobe PDF Print Engine


The Adobe PDF Print Engine is a common rendering engine technology, packaged as a software development kit (SDK). It can be the basis for a variety of products for previewing and printing Adobe Portable Document Format (PDF) documents at different stages of the professional print workflow.

PDF2IMG


Datalogics PDF2IMG is a command-line utility that converts PDF files to a variety of image formats including PNG, JPG, TIFF, BMP, and more. It is built upon the Adobe PDF Library and uses Adobe technology for unrivaled color management during the PDF conversion process.
