Data Quality Management: Measure what you can manage
Data quality management initiatives focus on ensuring that an organization’s data assets are of sufficient quality to meet its needs and objectives. It is well known that what you can measure, you can manage. Data quality is no different: first you must define what good data is, then measure, analyze, improve and control it, so that you know where you are on the data asset improvement journey at any time.
Data has no value unless it can be used to make sound corporate decisions. Decisions based on bad data are bad decisions made with a high degree of certainty! Invalid information, so-called "dirty data," increasingly populates databases and operational history files. Relying on unrecognized, often erroneous, data points to make business decisions compromises the integrity of those decisions. Data quality often determines the success or failure of CRM implementations. Gartner recommends that companies engage in data cleansing projects in conjunction with business intelligence adoption (see Gartner: Insurers Must Invest in Tech to Meet Coming Trends, by Pat Speer, February 6, 2008).
Recognizing the importance of data quality and the need to evolve standards for codification across the entire supply chain, the International Organization for Standardization (ISO) has published two standards: ISO 22745, which covers the tools for encoding data, and ISO 8000, which covers information quality in terms of encoding, completeness, origination and accuracy. Through a memorandum of agreement signed with the Electronic Commerce Code Management Association (ECCMA) in October 2004, the NATO Allied Committee 135 (AC/135) has promoted the NATO Codification System as an international standard. The ECCMA Open Technical Dictionary (eOTD) is an industrial version of the military NATO Codification System (NCS). Using the eOTD along with the associated XML interchange formats, a vendor can build master data that meets ISO 8000 data quality standards. These codes can be used throughout data life cycle management, from design to disposal, and allow for seamless data exchange between producers, distributors, customers and service personnel, as shown in the diagram below.
Source: Defense Logistics Information Service, Battle Creek, MI
Data standards problems can occur in many areas, ranging from invalid mailing addresses to improperly formatted data, such as a manufacturer part number that omits a crucial prefix or suffix. As shown in the diagram below, data anomalies can arise for various reasons.

The above anomalies can be grouped into three main areas of interest with respect to the quality of the data:
- Completeness – Does the organization have data assets that are incomplete or missing? For example, do all customers have addresses?
- Accuracy – Are the organization’s data assets sufficiently accurate to meet internal (business processes, decision making) and/or external (regulatory, third parties) requirements?
- Integrity – Are the organization’s data assets consistent across the enterprise? For example, does the list of suppliers in a company’s ERP system match the list in the finance application? Do the relationships between different data assets make sense? Are duplicates removed from the system?
Methodology
The general methodology for moving to a data quality standard across the enterprise involves the following stages:
- Data Profiling – the process of examining the data available in an existing data source (e.g. a database or a file) and collecting statistics and information about that data. It includes column profiling, dependency profiling and redundancy profiling. "In data quality profiling, you identify what your defects are, and how your data compares against your business rules," says Frank Dravis, vice president of information quality at FirstLogic.
- Metadata Analysis – understand the data; extract and organize it from any source within the organization
- Outlier Detection – detect data values requiring further investigation
- Data Validation – define data types and constraints on data
- Pattern Analysis – check that values follow the correct data formats
- Relationship Discovery – discover relationships, e.g. primary key to foreign key constraints
- Statistical Analysis – compute summary statistics, such as minimum and maximum values
- Business Rule Validation – perform domain checking, range checking, lookup validations, etc.
- Data Quality / Enrichment – cleanse, standardize and categorize
- Data Integration – integrate data from disparate sources within an enterprise. This may involve data transformations from various sources into the target application
- Data Monitoring – review data periodically to take corrective action
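
As a rough illustration of the profiling and validation stages listed above, the following Python sketch runs a few such checks on in-memory records: a completeness ratio, a min-max summary, a pattern check, an outlier check and a foreign key lookup. The sample data, the field names, the part number format and the helper functions are all hypothetical. The median absolute deviation rule is only one possible choice for outlier detection. A real project would more likely run equivalent checks inside a profiling tool or directly against the source database.

```python
import re
import statistics

# Hypothetical sample data; every field name and value is illustrative only.
customers = [
    {"id": 1, "address": "1 Main St"},
    {"id": 2, "address": ""},
    {"id": 3, "address": "9 Elm Rd"},
    {"id": 4, "address": None},
]
orders = [
    {"order_id": 10, "customer_id": 1, "part_no": "AB-1001-X", "amount": 100.0},
    {"order_id": 11, "customer_id": 2, "part_no": "1002-X", "amount": 110.0},
    {"order_id": 12, "customer_id": 3, "part_no": "AB-1003-Y", "amount": 95.0},
    {"order_id": 13, "customer_id": 9, "part_no": "AB-1004", "amount": 105.0},
    {"order_id": 14, "customer_id": 1, "part_no": "AB-1005-Z", "amount": 9000.0},
    {"order_id": 15, "customer_id": 3, "part_no": "AB-1006-X", "amount": 102.0},
]

def completeness(records, field):
    """Column profiling: share of records with a non-blank value in `field`."""
    if not records:
        return 0.0
    filled = 0
    for record in records:
        value = record.get(field)
        if value is not None and str(value).strip() != "":
            filled += 1
    return filled / len(records)

def min_max(records, field):
    """Statistical analysis: smallest and largest non-missing value."""
    values = [r[field] for r in records if r.get(field) is not None]
    return (min(values), max(values)) if values else None

def pattern_violations(records, field, pattern):
    """Pattern analysis: values that do not fully match the expected format."""
    regex = re.compile(pattern)
    return [r.get(field) for r in records
            if not regex.fullmatch(str(r.get(field) or ""))]

def mad_outliers(records, field, threshold=5.0):
    """Outlier detection: values far from the median, measured in units of
    the median absolute deviation (MAD). The threshold is an assumption."""
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return []
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []
    return [v for v in values if abs(v - median) / mad > threshold]

def orphan_keys(children, foreign_key, parents, primary_key):
    """Relationship check: foreign key values with no matching primary key."""
    known = {p[primary_key] for p in parents}
    return sorted({c[foreign_key] for c in children} - known)

# Assumed part number format: two letters, four digits, one letter suffix.
PART_NO_PATTERN = r"[A-Z]{2}-\d{4}-[A-Z]"

report = {
    "address_completeness": completeness(customers, "address"),
    "amount_range": min_max(orders, "amount"),
    "part_no_violations": pattern_violations(orders, "part_no", PART_NO_PATTERN),
    "amount_outliers": mad_outliers(orders, "amount"),
    "orphan_customer_ids": orphan_keys(orders, "customer_id", customers, "id"),
}

for check, result in report.items():
    print(f"{check}: {result}")
```

Each helper returns plain values rather than printing, so the same checks could be rerun on a schedule and compared over time, which is the idea behind the data monitoring stage.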

