Evaluating Remediation Costs for Insufficient Data Quality in an Information Governance Environment

IP.com Disclosure Number: IPCOM000243007D
Publication Date: 2015-Sep-08

Publishing Venue

The IP.com Prior Art Database

Abstract

With the substantial growth in data volume, velocity, and variety comes a corresponding need to govern and manage the risk, quality, and cost of that data and to provide higher confidence for its use. This article introduces an approach for evaluating the costs of remediating data with insufficient data quality in an information governance environment, where costs are categorized by governance enablers such as information governance policies and rules, or by disciplines such as data classification.

Business Policies reflect the standards and requirements that an organization must adhere to, the functions and practices that apply those standards, and the responsibility to comply with them. They contain and reference the Business Rules which implement the requirements and standards defined by the policy. A hierarchy of Business Policies may be created to express the diverse multitude of standards and requirements, where each policy imparts additional identification information and a set of contained rules. For example, a Sensitive Data Protection policy contains several sub-policies such as Contract, Customer, Employee, and Product Data Protection, where, e.g., Employee Data Protection is implemented by a "Mask Employee Personally Identifiable Information" governance rule. An information governance rule is in turn implemented by one or more data rules. Data rules are rules defined within a data quality analysis tool which check a certain criterion, e.g. "Cannot create an account for a person under 18."
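As an illustration, the following Python sketch models such a policy hierarchy; the class names and the attachment of the article's example data rule criterion are illustrative assumptions, not taken from a specific governance tool.

    from dataclasses import dataclass, field

    @dataclass
    class DataRule:
        name: str
        criterion: str  # the concrete check a data quality tool evaluates

    @dataclass
    class GovernanceRule:
        name: str
        data_rules: list[DataRule] = field(default_factory=list)  # one or more

    @dataclass
    class Policy:
        name: str
        sub_policies: list["Policy"] = field(default_factory=list)
        rules: list[GovernanceRule] = field(default_factory=list)

    # The hierarchy from the article's example; the data rule criterion is
    # the article's generic example, attached here only for illustration.
    mask_pii = GovernanceRule(
        "Mask Employee Personally Identifiable Information",
        [DataRule("Account_Holder_Age_Check",
                  "Cannot create an account for a person under 18.")],
    )
    sensitive_data_protection = Policy(
        "Sensitive Data Protection",
        sub_policies=[
            Policy("Contract Data Protection"),
            Policy("Customer Data Protection"),
            Policy("Employee Data Protection", rules=[mask_pii]),
            Policy("Product Data Protection"),
        ],
    )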

Data classification is used to support the process of either automatically or manually assigning every column (data field) to meaningful categories (e.g. a credit card category with an American Express credit card sub-category), which can then be used to organize the inventory of columns.
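A minimal sketch of such a classification inventory, assuming a simple column-to-categories mapping (the column names and categories are illustrative):

    # Hypothetical column inventory organized by classification categories.
    classification = {
        "BANK.CARDS.CARD_NO": ["Credit Card", "American Express Credit Card"],
        "BANK.ACCOUNT_HOLDERS.ADDRESS": ["Address"],
    }

    def columns_in_category(category: str) -> list[str]:
        """Organize the inventory of columns by classification category."""
        return [col for col, cats in classification.items() if category in cats]

    print(columns_in_category("Credit Card"))  # -> ['BANK.CARDS.CARD_NO']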


1. Evaluating Remediation Costs for Insufficient Data Quality by Information Governance Policies, Rules, and Data Rules:

Data quality exceptions are remediated at the data rule level, which means all exception occurrences (records) for each data rule binding output (database column, data field) have to be remediated. A remediation tool typically provides one or a set of affected data rule binding outputs for every exception record assigned to a specific data rule (e.g. Completeness_Check_ACCOUNT_HOLDERS.ADDRESS -> check that the address is not null or empty in BANK.ACCOUNT_HOLDERS). A data rule can investigate data quality on data rule bindings for one or several database tables.
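The following sketch illustrates such a mapping from exception records to affected data rule binding outputs; the structure is an assumption for illustration and does not reflect a specific remediation tool's API.

    # Each exception record is assigned to a data rule and carries the
    # affected data rule binding output (database column / data field).
    exception_to_binding = {
        "EXC-0001": ("Completeness_Check_ACCOUNT_HOLDERS.ADDRESS",
                     "BANK.ACCOUNT_HOLDERS.ADDRESS"),
        "EXC-0002": ("Completeness_Check_ACCOUNT_HOLDERS.ADDRESS",
                     "BANK.ACCOUNT_HOLDERS.ADDRESS"),
    }

    def binding_outputs_for(data_rule: str) -> set[str]:
        """All binding outputs affected by exception records of a data rule."""
        return {out for rule, out in exception_to_binding.values()
                if rule == data_rule}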

Steps to calculate cost:

a. Provide cost information for either automatically or manually remediating a specific data rule binding output (column, field), e.g. store that information in a repository associated with the corresponding data rule and column/field (costs can differ: correcting a ZIP code, for example, may cost differently than correcting an age or an address). A combined sketch of steps a and b follows after step b.

b. Summarize cost for all exceptions at the data rule level:

Only the latest data analysis run for each data rule has to be taken into account; thus the execution records are filtered to the latest run of each data rule. Then all implemented data rule bindings for this execution record of the data rule are looked up, and all exception records for these data rule bindings are fetched. All distinct exception record IDs, joined with their corresponding data rule bindings and multiplied by the defined cost in the repository, prov...
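Since the extracted text breaks off here, the following Python sketch reconstructs steps a and b from the description above; the record structures, identifier names, and cost values are assumptions for illustration, not the article's actual implementation.

    from collections import defaultdict

    # Step a: cost repository keyed by (data rule, binding output); values
    # are per-record remediation costs, which may differ per column.
    remediation_cost = {
        ("Completeness_Check_ACCOUNT_HOLDERS.ADDRESS",
         "BANK.ACCOUNT_HOLDERS.ADDRESS"): 4.50,   # e.g. fixing an address
        ("Range_Check_ACCOUNT_HOLDERS.AGE",
         "BANK.ACCOUNT_HOLDERS.AGE"): 1.25,       # e.g. fixing an age
    }

    def cost_at_data_rule_level(runs, exceptions):
        """Step b: summarize remediation cost per data rule.

        runs:       iterable of (data_rule, run_id, timestamp)
        exceptions: iterable of (record_id, data_rule, run_id, binding_output)
        """
        # 1. Filter execution records to the latest run of each data rule.
        latest = {}
        for data_rule, run_id, ts in runs:
            if data_rule not in latest or ts > latest[data_rule][1]:
                latest[data_rule] = (run_id, ts)

        # 2. Fetch exception records belonging to that latest run and keep
        #    distinct exception record IDs per data rule binding output.
        distinct = defaultdict(set)
        for record_id, data_rule, run_id, binding in exceptions:
            if latest.get(data_rule, (None, None))[0] == run_id:
                distinct[(data_rule, binding)].add(record_id)

        # 3. Multiply distinct exception counts by the per-record cost from
        #    the repository and sum per data rule.
        totals = defaultdict(float)
        for (data_rule, binding), ids in distinct.items():
            totals[data_rule] += len(ids) * remediation_cost[(data_rule, binding)]
        return dict(totals)

Under this scheme, the per-data-rule totals could then be rolled up to the information governance rules and policies that contain them, matching the categorization described in the abstract.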