Using A Change Stream That Is Not Completely Ordered In A Replication System That Depends On Receiving Changes In Chronological Order.
Publication Date: 2014-Sep-09
The IP.com Prior Art Database
Disclosed is a system applied to replication systems to provide an ordered change stream by using a log order time that represents a continuously increasing log sequence.
Within a change log, changes for a single transaction are logged in the order in which the source application makes them. Changes across transactions affecting the same key are properly ordered by the source application(s) as a result of key locking, by either a transaction manager or database manager. Changes and transactions that occur in parallel (i.e., affecting distinct keys), however, may be logged in an unpredictable order by multiple source applications.
Consider, for example, source application A creating transaction UORa consisting of changes to keys 1, 3, 5, 7, and 9, while another source application B creates transaction UORb consisting of changes to keys 2, 4, 6, and 8. Both transactions are in-flight (i.e. uncommitted pieces of work) at the same time and, because each affects different key sets, the transactions are autonomous with respect to each other. UORa may commit first in the transaction manager, but there could be a slight delay writing the commit log record that allows UORb to commit and write its commit log record first. The log then shows that UORb is followed by UORa, although the timestamps for the commits indicate UORa committed first.
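The UORa/UORb scenario above can be sketched in a few lines of Python. This is an illustrative model only: the CommitRecord type, its field names, and the numeric times are hypothetical and are not taken from any real log format or transaction manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitRecord:
    """Hypothetical commit log record; all field names are assumptions."""
    uor: str            # unit-of-recovery name, e.g. "UORa"
    commit_time: float  # when the transaction manager committed (made-up units)
    write_time: float   # when the commit record reached the log (made-up units)


def log_order(records):
    """Return records in the order they physically land in the log."""
    return sorted(records, key=lambda r: r.write_time)


# UORa commits first, but its commit record is written slightly late,
# so UORb's commit record reaches the log ahead of it.
uor_a = CommitRecord("UORa", commit_time=100.0, write_time=100.4)
uor_b = CommitRecord("UORb", commit_time=100.1, write_time=100.2)

log = log_order([uor_a, uor_b])
names_in_log = [r.uor for r in log]
commit_times_in_log = [r.commit_time for r in log]

print(names_in_log)         # log position: UORb first, then UORa
print(commit_times_in_log)  # commit timestamps regress along the log
```

Reading the log front to back, a consumer sees UORb before UORa even though UORa carries the earlier commit timestamp.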
Typical replication systems require change data to be received in a specific, predictable order to support multiple types of processing. Examples of this processing include (but are not limited to) accurate replication to a target data store, dependency analysis, restart, and cache pruning. A regression of time in the change log (e.g., processing UORb and then UORa) may cause the replication system to make incorrect decisions because the replication system expects changes to be read from the log in order.
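A regression of time, as described above, can be detected by comparing each commit timestamp with the highest timestamp already read from the log. The following Python sketch illustrates that check only; the function name and the sample timestamps are hypothetical, and it does not represent how any particular replication system is implemented.

```python
def find_time_regressions(commit_times):
    """Return the positions whose commit time is earlier than the
    highest commit time already seen while reading the log in order."""
    regressions = []
    high_water = None  # highest commit time read so far
    for position, commit_time in enumerate(commit_times):
        if high_water is not None and commit_time < high_water:
            regressions.append(position)
        else:
            high_water = commit_time
    return regressions


# Hypothetical commit timestamps in log order: the second record
# (e.g., UORa read after UORb) goes backwards in time.
stream = [100.1, 100.0, 100.3]
regressed = find_time_regressions(stream)
print(regressed)
```

A consumer that assumes timestamps never decrease would mishandle the record at each reported position, for example by treating it as already processed during restart.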
In reality, UORa and UORb occur nearly simultaneously, and because there is no coordination between the source applications, there is little significance to the actual times in the log records. Attempting to create chronological order for these records becomes increasingly complex as concurrent access to the data increases in data sharing environments where multiple source applications are updating the same data sources. Without expensive source application coordination with respect to creating and logging change records, it is expected that changes and transactions that do not affect the same keys may be written out of order. Coordination of source applications creating log records may not be possible, or is at least untenable with respect to source application throughput.
Given that there is little significance to the actual times in the log records, serializing log records or sorting the stream ultimately gives little benefit other than allowing the replication system to process the data. Using a log reader to completely sort a stream, creating a log order based on physical position in the log, and acquiring log locks via the source application is difficult...