The Knowledge Map: How to Take Advantage of XMP Metadata

By Manuel Brunner, Head of Projects and Services, IntraFind AG

Adobe Systems Inc. uses the slogan “Adding intelligence to media” for its XMP standard. This article addresses the question: “How can the metadata in XMP be reused easily, and how can a corporation take advantage of this information?”

The first and most obvious way to use XMP data is in an information management system. This was also the Voith Group's initial idea when it began a project to replace its host systems by converting their content to PDF/A and saving the PDF/A files in a document management system.

Great emphasis was placed on promptly decommissioning the old host systems, which were spread over various sites of the Voith Group.

In the course of the decommissioning, all data and information stored on those host systems was to be archived safely and cost-efficiently on new systems. Furthermore, Voith employees were to be able to access the information intuitively afterwards through a single new search solution. The new solution was also intended to reduce the high running costs of maintaining the host systems.

After evaluating various document management system providers, Voith IT Solutions GmbH & Co. KG in St. Pölten (Austria), the Competence Center responsible for archiving standards within the Voith Group, chose a completely different approach to presenting the host system content to its engineers: an enterprise search product, the iFinder from IntraFind AG.

But why use search technology instead of a DMS or ECM system for this task?

The answer is quite simple. The main goal of the host system replacement project was to make the data accessible and to let users find information easily and intuitively, not to manage the information. After the responsible managers of the IT Competence Center for archiving in St. Pölten attended an international SharePoint convention where IntraFind presented the “Knowledge Map” idea, their thinking began to change and the deliverables of the whole project were reconsidered.

A simple way to get manageable content out of the host systems: use XMP metadata

In the initial project, Voith and IntraFind indexed existing archive data that was stored on an old host system in St. Pölten and exported the required documents to PDF/A format. In addition to the full text, the metadata of the documents, which is very important for the work of the Voith engineers, was written into XMP metadata by a PDF/A conversion tool. The full text and the XMP metadata of the PDF documents (e.g. order ID, customer ID, machine information) were then added to the search index of the iFinder.
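The step from XMP packet to index record can be illustrated with a minimal sketch. It is not the iFinder's or the conversion tool's actual implementation, which the article does not describe. The sample packet, the `ex` namespace, its field names and values, and the helper names `xmp_simple_properties` and `build_index_record` are all assumptions made for illustration; only the `x:xmpmeta`, `rdf:RDF` and `xmp:CreateDate` elements follow the XMP standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical XMP packet. The "ex" namespace, its fields and all values
# are invented for illustration and do not come from the Voith project.
XMP_PACKET = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:xmp="http://ns.adobe.com/xap/1.0/"
        xmlns:ex="http://example.com/ns/archive/1.0/">
      <xmp:CreateDate>1994-03-17T09:30:00Z</xmp:CreateDate>
      <ex:OrderID>A-1001</ex:OrderID>
      <ex:CustomerID>C-42</ex:CustomerID>
      <ex:Machine>Machine M-7</ex:Machine>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def xmp_simple_properties(packet: str) -> dict:
    """Collect simple (text-valued) properties from every rdf:Description."""
    root = ET.fromstring(packet)
    fields = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in description:
            # ElementTree tags look like "{namespace-uri}LocalName";
            # keep only the local name as the index field name.
            local_name = prop.tag.rsplit("}", 1)[-1]
            text = (prop.text or "").strip()
            if text:
                fields[local_name] = text
    return fields


def build_index_record(doc_id: str, full_text: str, packet: str) -> dict:
    """Combine full text and XMP metadata into one record for a search index."""
    return {"id": doc_id, "text": full_text, "meta": xmp_simple_properties(packet)}


record = build_index_record("doc-1", "Order confirmation for machine M-7", XMP_PACKET)
print(record["meta"])
```

Keeping the metadata as separate named fields next to the full text is what later makes filtering by order, customer or year possible. A real packet would also contain structured values (lists, alternatives), which this sketch deliberately skips.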

In cooperation with Voith, IntraFind developed a user interface which significantly improves the search process. At a glance, the user has access to all relevant information in the form of metadata (e.g. author, file format, creation date of a document, project-specific metadata fields). The result list can be quickly filtered by mouse click and visualised as a so-called “Knowledge Map”. This gives each user a 360-degree view of “their enterprise knowledge”, based on their individual access rights, which can be used intuitively without any training.

Search like in an online shop

For the visualisation of the Knowledge Map, IntraFind designed a user interface that users already know from the Internet. It can be compared to the navigation of leading modern online shops. In practice, this means that every Voith employee can narrow the enormous amount of enterprise data by setting filters with mouse clicks until a manageable number of results remains. The search is then run and a list is created containing only the results that match the filter criteria.

The following example demonstrates how the Knowledge Map works in practice:

A Voith engineer searches for information about a certain customer from the year 1994 and therefore clicks the customer’s name in the Knowledge Map. With a second mouse click he limits the data set to information from the year 1994. He then gets a result list containing all information about relevant projects involving this customer, about items used in the course of the customer projects and even about the coat of paint on the individual components of the machines produced. With another mouse click he can limit the data set to fewer than ten hits and get the result list either in the classical view familiar from common Internet search engines or in a list view with the option to transfer the data into an Excel sheet for further processing.
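The click sequence in this example corresponds to what is commonly called faceted filtering: each click adds a field/value restriction, and the remaining values of every other field are counted so they can be offered as the next click. The following is only a minimal in-memory sketch of that idea, not the iFinder's implementation; the documents, field names and the helpers `apply_filters` and `facet_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical index records; all names and values are invented.
DOCS = [
    {"id": 1, "customer": "Customer A", "year": 1994, "type": "project"},
    {"id": 2, "customer": "Customer A", "year": 1994, "type": "parts list"},
    {"id": 3, "customer": "Customer A", "year": 1995, "type": "project"},
    {"id": 4, "customer": "Customer B", "year": 1994, "type": "project"},
]


def apply_filters(docs, filters):
    """Keep only documents whose metadata matches every selected filter."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]


def facet_counts(docs, field):
    """Count how many of the given documents carry each value of a field."""
    return Counter(d[field] for d in docs)


# First click: the customer's name.
after_customer = apply_filters(DOCS, {"customer": "Customer A"})
# The year facet now shows which years are left to choose from.
year_facet = facet_counts(after_customer, "year")
# Second click: the year 1994.
after_year = apply_filters(DOCS, {"customer": "Customer A", "year": 1994})

print(dict(year_facet))
print([d["id"] for d in after_year])
```

Because the user only ever clicks values that actually occur in the current result set, a filter can never produce an empty list, and nothing has to be typed.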

To accomplish this, it is important that the classical search entry field is combined with the functionality and possibilities of visualisation provided by the Knowledge Map.

The basic concept of the Knowledge Map: Find without search

As an alternative to selecting filters only, the Voith employee can also enter a term in the search entry field of the full-text search and combine this search with the filter elements of the Knowledge Map. Later on, he can remove any selected metadata filter from his search (e.g. the restriction to a certain period of time or author) and thereby enlarge the hit set again.
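Combining a typed term with selected filters, and widening the hit set again by dropping one filter, can be sketched as follows. This is a simplified illustration under stated assumptions: the `search` function, the documents and the field names are hypothetical, and plain substring matching stands in for a real full-text engine.

```python
# Hypothetical index records; texts, authors and years are invented.
DOCS = [
    {"id": 1, "text": "Paint specification for rollers",
     "meta": {"author": "Author A", "year": "1994"}},
    {"id": 2, "text": "Paint order",
     "meta": {"author": "Author B", "year": "1994"}},
    {"id": 3, "text": "Roller maintenance log",
     "meta": {"author": "Author A", "year": "1995"}},
    {"id": 4, "text": "Paint specification revised",
     "meta": {"author": "Author A", "year": "1995"}},
]


def search(docs, term="", filters=None):
    """Return ids of documents that contain the term and match every filter.

    An empty term matches every document, so filters can be used alone.
    """
    filters = filters or {}
    term = term.lower()
    return [
        d["id"]
        for d in docs
        if term in d["text"].lower()
        and all(d["meta"].get(k) == v for k, v in filters.items())
    ]


# Full-text term combined with two metadata filters.
narrow = search(DOCS, "paint", {"author": "Author A", "year": "1994"})
# Removing the year filter enlarges the hit set again.
wider = search(DOCS, "paint", {"author": "Author A"})

print(narrow, wider)
```

Treating the term and each filter as independent conditions is what allows any single restriction to be removed without restarting the search.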

The difference between this solution and a classical database application is clear: because users set filters instead of typing, the error rate of the search process (e.g. typos when entering the search term) is greatly reduced. Furthermore, the Knowledge Map can be implemented quickly, is easy to manage and performs very well.

The concept behind the Knowledge Map is twofold: quick retrieval of already known information, and a quick overview of all existing enterprise information about projects or products via a single aggregated access point. This supports users browsing through enterprise data.

All companies that already use the Knowledge Map had essentially the same requirements for their new search solution:

A single search across the entire enterprise knowledge, as well as search windows within their applications or over a dedicated set of documents.

An intuitive handling via guided search which does not require any previous knowledge about how to use complex Boolean operators. Furthermore, the different search technologies in databases, intranet or document management systems should be harmonised.

“Without the need to permanently reinvent the wheel”: new colleagues should also be able to get quick and easy access to existing (old) information so that earlier experiences, which were often very painful for the company, are not repeated in their current projects.

Positive balance of the search project

The vision of Dipl. Ing. Erich Seher, managing director of Voith IT Solutions GmbH & Co. KG in St. Pölten, to introduce an enterprise-wide solution drew very positive feedback from Voith employees. Consequently, the initial project has now been extended to the indexing of further host systems (extracted metadata plus PDF documents), SharePoint data and file server data. As a next step, it is planned to index mailboxes and to connect SAP data and documents.

The benefit of the new solution is enormous, especially because the existing structural information is always integrated into the search.

During a search in the file system, for example, the folder in which the hit documents are stored is always displayed so that huge result sets can be filtered quickly. By browsing, as in Windows Explorer, the user can quickly and easily find the relevant hit document and open it. This functionality matches the search behaviour of users, who often remember that they saved the required document to a certain folder or that it was created by a certain author, but are unable to find it again without the help of an intelligent search.

In addition to its positive reception among Voith employees, the innovative search via intelligent navigation is also a great financial success: the return on the investment in the new overall system is expected to be achieved less than six months after the shutdown of the old host systems.

This use case shows how essential information in a corporation, the XMP metadata, can feed a retrieval engine with very high usability, and how customer, financial and legal requirements can best be met by using PDF/A documents.

