Several archives from a number of sources had to be consolidated, with PDF/A as the unifying storage format. The individual archives contained different types of data: customer records, invoices, credit and delivery notes, personnel records and executive documents. For the executive documents, it was important to observe special access rights during data conversion. We took on this task together with a partner that specialises in reading diverse archives. LuraTech’s part of the job was the mass conversion of all source formats read from the archives into fully text-searchable PDF/A files. As part of introducing the solution, new entries were also to be captured and converted in colour.
Our partner provided the files, as read from the archives, in a file system structure. The index data was provided as CSV or XML files, depending on the source archive. A separate conversion job was set up for each archive type, and only selected employees were given the special access rights needed to handle the executive archive.
The customer was particularly interested in logging the entire conversion process and the accompanying completeness checks. This meant generating log data that could be compared against the source archive’s database entries, to ensure that the data migration was completed in full.
The individual conversion jobs were handled using our own product, DocYard. For each source archive, the documents to be processed were read in along with the accompanying CSV or XML index files and converted to searchable PDF/A files. Using the XSL transformation module, the index data for each document was then prepared for automatic import to the destination archive, and exported along with the documents themselves. DocYard recorded each step of the conversion, and the automatically generated reports allowed our partner to easily compare the results against the source archive for completeness checking.
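The completeness check described above can be illustrated with a minimal sketch. It does not show DocYard's actual report format, which is not described here. It assumes, purely for illustration, that both the source archive's index and the conversion log are available as CSV exports sharing a hypothetical `doc_id` column, and it compares the two sets of identifiers.

```python
import csv
import io


def read_ids(csv_text, id_column="doc_id"):
    """Return the set of document IDs found in a CSV export.

    The column name `doc_id` is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column] for row in reader}


def completeness_report(source_ids, converted_ids):
    """Compare source index entries with logged conversions."""
    return {
        # In the source archive, but no conversion was logged
        "missing": sorted(source_ids - converted_ids),
        # Logged as converted, but unknown to the source index
        "unexpected": sorted(converted_ids - source_ids),
        "complete": source_ids == converted_ids,
    }


# Hypothetical sample data standing in for the real exports
source_csv = "doc_id,type\nA1,invoice\nA2,invoice\nA3,delivery_note\n"
log_csv = "doc_id,pdfa_file\nA1,A1.pdf\nA3,A3.pdf\n"

report = completeness_report(read_ids(source_csv), read_ids(log_csv))
print(report)
```

In this sample, document `A2` appears in the source index but not in the conversion log, so it is reported as missing and the migration is flagged as incomplete.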
After a final random-sample quality check and release to the end customer, the temporary data was deleted. The executive archive’s extra security requirements were met using DocYard’s rights management and client separation tools.
After a two-week preparatory phase, the implemented solution was handed over to the partner. The complete solution was then put through a week-long stress test and, following minor adjustments, conversion began a week later. The jobs carried out by DocYard proved to be both high-performance and very stable. Support for the application was provided through remote access.
The entire project was finished within three months, including the switch of the capture lines to colour processing. Today, the customer saves all incoming documents in colour as fully text-searchable PDF/A files. The storage space required for the old data was reduced from around 2 TB to around 1.4 TB, a saving of roughly 30%. The new entries, now in full colour and fully text-searchable, are no larger than the old black-and-white TIFFs, thanks to LuraTech’s award-winning mixed raster content (MRC) compression technology.
Contact us for more information about our services!