Data Reconciliation on Demand
Disclosure Number: IPCOM000240630D
Publication Date: 2015-Feb-13
Document File: 6 page(s) / 64K

Publishing Venue

The Prior Art Database


Disclosed is a system and associated methods for data reconciliation on demand, as the user retrieves pages of a paginated query of reconciled resources. The novel method and system optimize the retrieval of reconciled data by providing buffering and cache methods for queries.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 28% of the total text.

Data Reconciliation on Demand

Data reconciliation consists of constructing a reconciled view of the same entity, or resource, based on attributes collected from different sources. Inventory and management tools automatically discover the physical and application infrastructure in an enterprise. Each tool collects information about the attributes that are relevant to its own domain. For example, for a given computer, one tool may be interested in attributes related to the network topology, such as the Internet Protocol (IP) address, subnet, and Ethernet address, while another tool might be interested in software-related information, such as the Operating System (OS) and installed software. The attributes discovered by the different tools might overlap. For example, both tools might discover the IP address of the computer. A reporting tool can then use the overlapping attributes to reconcile the information and build, for example, a reconciled report of all the data collected by the different tools.
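As a minimal sketch of the idea above, the following Python fragment merges records from two discovery tools on an overlapping attribute. The record layouts, field names, sample values, and the helper reconcile_by are hypothetical and chosen only for illustration; the disclosure does not prescribe a data model.

```python
from collections import defaultdict

# Hypothetical output of a network-topology discovery tool.
network_tool = [
    {"ip": "10.0.0.5", "subnet": "10.0.0.0/24", "mac": "00:1a:2b:3c:4d:5e"},
    {"ip": "10.0.0.9", "subnet": "10.0.0.0/24", "mac": "00:1a:2b:3c:4d:5f"},
]

# Hypothetical output of a software-inventory tool.
software_tool = [
    {"ip": "10.0.0.5", "os": "Linux", "installed": ["httpd", "openssl"]},
]


def reconcile_by(key, *sources):
    """Merge all records that share the same value for `key` into one view."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            # Later sources overwrite earlier ones on conflicting attributes.
            merged[record[key]].update(record)
    return dict(merged)


# Both tools discovered the IP address, so it serves as the overlapping attribute.
report = reconcile_by("ip", network_tool, software_tool)
```

Here the entry for 10.0.0.5 combines the network and software attributes, while 10.0.0.9 keeps only what the network tool found. A real reconciliation would also need a policy for conflicting values, which this sketch resolves by letting the last source win.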

These tools may discover thousands or millions of records that need to be reconciled. The whole process of reconciliation may take hours or days for such a huge number of records. At the same time, some client applications may not be able to wait that length of time to start retrieving reconciled data, and need a way to determine the reliability or freshness of the retrieved data.

The novel contribution is a system and associated methods for data reconciliation on demand. The approach is to reconcile data as the user retrieves pages of a paginated query of reconciled resources.

Reconciliation Closure
One of the most expensive steps during data reconciliation is to compute the group of records that have identifying attributes in common, which is referred to as "closure" hereafter.

Assuming the following set of records (Example 1):

RECORD1 = {S1}{M1M1S1}
RECORD2 = {M1M1S1}
RECORD3 = {M2M2S2}
RECORD4 = {S2}{M2M2S2}
RECORD5 = {M1M1S1}{MAC1}


In this scenario, there are two different closures:


Closure 1 = {RECORD1, RECORD2, RECORD5, RECORD6}
Closure 2 = {RECORD3, RECORD4}

Note that RECORD1 and RECORD6 belong to the same closure even though they have no identifying attributes in common: they are indirectly linked through RECORD5 (MAC address = MAC1).
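To make the grouping in Example 1 concrete, the sketch below partitions the records into closures with a union-find (disjoint-set) structure. This is a standard technique swapped in for illustration, not the method described in this disclosure. Each brace group is treated as an opaque identifying-attribute token. RECORD6 is not listed in the extracted text, so its value {MAC1} is an assumption inferred from the note that it is linked through MAC1.

```python
# Identifying attribute sets per record, written as opaque tokens (Example 1).
records = {
    "RECORD1": ["S1", "M1M1S1"],
    "RECORD2": ["M1M1S1"],
    "RECORD3": ["M2M2S2"],
    "RECORD4": ["S2", "M2M2S2"],
    "RECORD5": ["M1M1S1", "MAC1"],
    "RECORD6": ["MAC1"],  # ASSUMPTION: not listed in the source text
}

# Union-find: every record starts as the root of its own group.
parent = {name: name for name in records}


def find(x):
    """Return the root of x's group, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Records that share any identifying token end up in the same group.
first_seen = {}
for name, tokens in records.items():
    for token in tokens:
        if token in first_seen:
            union(first_seen[token], name)
        else:
            first_seen[token] = name

# Collect the groups (closures) keyed by their root record.
closures = {}
for name in records:
    closures.setdefault(find(name), set()).add(name)

closure_sets = sorted(sorted(group) for group in closures.values())
```

Under the stated assumption for RECORD6, closure_sets holds the two closures given above: RECORD1, RECORD2, RECORD5, and RECORD6 in one group, and RECORD3 and RECORD4 in the other.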

The simplest method for computing the closure for a specific record R is as follows:

Closure Set: C
Working Set: W
Initial Registration Record: R

1. C ← {R}
2. W ← {R}
3. Repeat while W != {}:
   3.1 W' ← {}
   3.2 For each record r in W:
         For each identifying set of attributes (ai) in r:
           Add to W' all records r' that have the set of identifying attributes ai and are not already in C.
   3.3 W ← W'
   3.4 C ← C U W'

This method is expensive because it is quadratic in the number of records and can require multiple select queries to compute the closure of a single registration record.
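The working-set method above can be sketched in Python as follows. The in-memory index stands in for the database selects, and the lookups counter is a hypothetical addition that shows how many such selects a single closure would cost. Two assumptions are made here: RECORD6 = {MAC1}, which is not listed in the extracted text, and the exclusion of records already in the closure set C when building W', so that the loop is guaranteed to terminate.

```python
from collections import defaultdict

# Identifying attribute sets per record, written as opaque tokens (Example 1).
records = {
    "RECORD1": ["S1", "M1M1S1"],
    "RECORD2": ["M1M1S1"],
    "RECORD3": ["M2M2S2"],
    "RECORD4": ["S2", "M2M2S2"],
    "RECORD5": ["M1M1S1", "MAC1"],
    "RECORD6": ["MAC1"],  # ASSUMPTION: not listed in the source text
}

# Stand-in for the database: identifying token -> records that carry it.
index = defaultdict(set)
for name, tokens in records.items():
    for token in tokens:
        index[token].add(name)


def closure(start):
    """Return (closure set C, number of lookups) for the record `start`."""
    closed = {start}    # C
    working = {start}   # W
    lookups = 0         # each lookup models one select against the store
    while working:
        next_working = set()  # W'
        for name in working:
            for token in records[name]:
                lookups += 1
                # Keep only records that are not yet part of the closure.
                next_working |= index[token] - closed
        working = next_working
        closed |= next_working
    return closed, lookups


closure_1, lookups_1 = closure("RECORD1")
closure_2, lookups_2 = closure("RECORD3")
```

Starting from RECORD1, the loop reaches RECORD6 only in its second pass, through RECORD5, which mirrors the indirect link noted in Example 1. Every identifying token of every record in the closure triggers its own lookup, which is the repeated-select cost the paragraph above describes.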

Reconciliation Cache

The reconciliation cac...