Workbenches, Lenses and Decisions, Oh My! A Data Quality Software Assessment

Introduction

In a recent survey conducted by The Information Difference, the top three domains requiring data quality initiatives were the product domain, the financial domain, and the name/address domain. This was surprising, since most data quality vendors offer name and address matching features, yet few offer product-specific features and even fewer offer finance-specific features. The survey comprised twenty-seven questions, ranging from a ranking of organizational data quality estimates to data quality implementation specifics, and its thorough analysis spans the data quality paradigm. One of the more telling questions concerned the vendor or tool selected by organizations implementing data quality solutions.

After reading the response summary, it was clear that there was not a predominant choice.  As the survey points out, this could be a consequence of the rather large number of data quality tools available on the market. With so many data quality options, could it be that the data quality market has become so saturated that the difference between offerings has become obscured?

With this in mind, I have put together an assessment of how the features of two leading vendor offerings, Informatica and Oracle, address data quality issues in the enterprise. The specific products involved are Informatica’s Data Quality Workbench and Oracle’s DataLens®. While this assessment is limited in scope, it does correlate with two of the most popular data domains: product and name/address.

The Informatica Data Quality Workbench

The Informatica data quality product offering includes two products, Data Explorer and Data Quality Workbench; however, for the purposes of this assessment, only the Data Quality Workbench will be reviewed.  

The reason for this is that Data Explorer is primarily a profiling tool that provides insight into which data requires attention, whereas Data Quality Workbench is the tool that performs many of the quality enhancements.

The Data Quality Workbench contains many features that enable the data quality analyst to enrich data; chief among these are the address validation and matching components.

The address validation component utilizes a service provided by AddressDoctor®, a leader in global address validation.  This service validates addresses fed into the component in multiple ways such as street level, geocoding, and delivery point validation via their reference database which currently covers 240 countries and territories.  As a result, non-deliverable addresses are verified or corrected, increasing the success of operational initiatives such as sales, marketing, and customer service.  

In addition to the address components, there are also match components designed to compare various types of strings, such as numeric, character, and variable-character strings.

The tool generates a score representing the degree to which the two strings are similar. The higher the match score, the greater the likelihood that the two strings are a match. Potential matches are grouped, enabling manual or automated evaluation in the nomination of a master transaction.
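
To make the idea of a match score concrete, here is a minimal sketch in Python. It is not Informatica's algorithm: it assumes the standard library's difflib.SequenceMatcher as a stand-in similarity measure, and the function names and the 0.85 threshold are hypothetical choices for illustration only.

```python
from difflib import SequenceMatcher


def match_score(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two strings."""
    # Normalize case and surrounding whitespace so trivial differences
    # do not lower the score.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def group_matches(records: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedily group records that score at or above the (assumed) threshold
    against the first member of an existing group."""
    groups: list[list[str]] = []
    for record in records:
        for group in groups:
            if match_score(record, group[0]) >= threshold:
                group.append(record)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([record])
    return groups


if __name__ == "__main__":
    for group in group_matches(["John Smith", "Jon Smith", "Mary Jones"]):
        print(group)
```

Each resulting group is a set of potential matches that an analyst, or an automated rule, could review to nominate a master record. A production tool would likely use domain-tuned comparison strategies rather than a single generic measure.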

Oracle DataLens®

Formerly a Silver Creek Systems product, DataLens® is a data quality engine built specifically for product integration and master data management. Using semantic technology, DataLens® is able to identify and correct errant product descriptions regardless of how the information is presented. This distinguishes it from most data cleansing products.

Based on specific contexts, such as manufacturing or pharmaceutical, DataLens® can recognize the meaning of values regardless of word order, spelling deviations or punctuation. DataLens® also enables on-the-fly classification, such as Federal Supply Class, and language conversion from any language to any other.
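
As a rough illustration of recognizing a product description regardless of word order or punctuation, consider the Python sketch below. It is only a token-set approximation, not semantic technology and not how DataLens® works internally; the SYNONYMS table and both function names are hypothetical, and spelling deviations are not handled.

```python
import re

# Hypothetical abbreviation table; a real deployment would rely on
# context-specific reference data (manufacturing, pharmaceutical, etc.).
SYNONYMS = {
    "ss": "stainless steel",
    "hex": "hexagonal",
    "blt": "bolt",
}


def canonical_tokens(description: str) -> frozenset[str]:
    """Reduce a description to an order-independent set of normalized tokens."""
    # Lowercase and keep only alphanumeric runs, which discards punctuation.
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    expanded: list[str] = []
    for token in tokens:
        # Expand known abbreviations; unknown tokens pass through unchanged.
        expanded.extend(SYNONYMS.get(token, token).split())
    return frozenset(expanded)


def same_product(a: str, b: str) -> bool:
    """Treat two descriptions as equivalent when their token sets are equal."""
    return canonical_tokens(a) == canonical_tokens(b)


if __name__ == "__main__":
    print(same_product("Bolt, Hex, SS", "stainless steel hexagonal bolt"))
    print(same_product("Bolt, Hex, SS", "brass hex nut"))
```

Comparing sets rather than sequences is what makes the check insensitive to word order; the trade-off is that repeated words and positional meaning are lost.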

Oracle’s long-term vision for DataLens® is seamless integration with Oracle’s Product Management Hub, which will allow organizations to centralize the management of product information from various sources. This collaborative relationship will allow organizations to evaluate and, if necessary, standardize product descriptions as part of an enterprise data management and migration effort.

The Assessment

Now that we’ve covered the basics of these products, what conclusions can we draw?  Considering the native technologies built into each of these products, it is reasonable to conclude that there is little overlap between the two.  While both these products are excellent data quality tools, they are meant to address two distinct data quality domains.

With its address validation technology, Data Quality Workbench is primed for customer data integration (CDI), while DataLens’ imminent integration with Oracle’s Product Management Hub makes it a compelling choice for product information management (PIM).

Customer Data Integration (CDI)

CDI benefits organizations both large and small by enabling a “single view of the customer” and typically relies on name and address coupling to identify potential duplicate customer data. CDI is often associated with direct marketing campaigns, but also provides benefits in billing and customer service operations.

Informatica’s Data Quality Workbench is an appropriate selection for an organization looking to achieve any of the following objectives:

  1. Eliminate direct marketing mailings to undeliverable addresses
  2. Eliminate multiple direct marketing mailings to the same customer
  3. Eliminate multiple direct marketing mailings to the same household
  4. Eliminate erroneous billing activities due to customer/client duplication
  5. Eliminate erroneous billing activities due to undeliverable addresses
  6. Increase customer satisfaction by eliminating confusion caused by duplicate customer data
  7. Decrease resolution time for customer service incidents by eliminating duplicate customer data
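
The household-level objective in the list above can be sketched as a simple grouping step. The Python example below is a minimal sketch that assumes addresses have already been validated and standardized; the field names ("name", "address", "postal_code") and both functions are hypothetical and do not reflect any vendor's data model.

```python
from collections import defaultdict


def household_key(address: str, postal_code: str) -> tuple[str, str]:
    """Build a grouping key from an address and postal code."""
    # Collapse case and repeated whitespace so formatting differences
    # do not split one household into several.
    normalized = " ".join(address.lower().split())
    return normalized, postal_code.strip()


def one_mailing_per_household(customers: list[dict]) -> list[dict]:
    """Keep the first customer seen at each address as the mailing recipient."""
    households: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for customer in customers:
        key = household_key(customer["address"], customer["postal_code"])
        households[key].append(customer)
    return [members[0] for members in households.values()]


if __name__ == "__main__":
    customers = [
        {"name": "A. Smith", "address": "12 Oak Street", "postal_code": "02139"},
        {"name": "B. Smith", "address": "12  oak street", "postal_code": "02139 "},
        {"name": "C. Jones", "address": "7 Elm Road", "postal_code": "02140"},
    ]
    for recipient in one_mailing_per_household(customers):
        print(recipient["name"])
```

Choosing the first member of each household is an arbitrary rule used here for brevity; in practice the recipient would be selected by a business rule.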

Product Information Management (PIM)

PIM initiatives benefit organizations with multiple product lines and distributed order fulfillment operations. They are frequently associated with supply chain operations in an effort to reduce product data variability and streamline product order fulfillment. PIM projects are rooted in data governance and rely on external reference data and business process rigor to implement.

Oracle’s DataLens is an appropriate selection for an organization looking to achieve any of the following objectives:

  1. Eliminate erroneous order fulfillment activities caused by stale or variant product information
  2. Eliminate incorrect billing due to discrepancies in product data
  3. Eliminate underutilization of warehouse inventory due to confusion about product availability
  4. Eliminate confusion and delays at customs due to discrepancies in product weights and descriptions
  5. Eliminate reconciliation exercises associated with the remediation of product data
  6. Increase cross-sell for customers via aligned data on product usage
  7. Decrease errors resulting from poor data entry accuracy

Just as no two data quality projects are the same, neither are data quality software products. So while Oracle’s DataLens and Informatica’s Data Quality Workbench are both classified under the data quality software umbrella, they are so different in design and implementation that they cannot be thought of as interchangeable. Each tool enables the execution of information quality in data domains so distinct that it is important to understand this context prior to the investment of purchasing such a tool.

This further supports the need to perform an assessment of tool features aligned to the business need in the project planning phase in order to ensure full capitalization of the investment in the data quality initiative.
