
Recovering from a batch processing failure when processing data which has been priority scored

IP.com Disclosure Number: IPCOM000242000D
Publication Date: 2015-Jun-12
Document File: 2 page(s) / 40K

Publishing Venue

The IP.com Prior Art Database

Abstract

When processing repeating individual messages in a batch, each message can be scored and the scores used to set a priority order for processing the messages in that batch. Each batch needs to be processed under a transaction sync-point. If processing is interrupted (for example, by a server crash), conventional systems would need to reprocess the entire batch. The batch could be large, so this reprocessing could be expensive in time. This article describes a mechanism to overcome this problem.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.



Messages may be prioritised for reasons such as fraud detection, or to handle high-value transactions first. In the case of insurance quoting, the time at which the priority was set is also important: priorities, scoring algorithms and models could change over time, so the algorithm that was in place at the original scoring time needs to be used.

There are other similar solutions using statistics, but these do not handle the batch message processing case:

Time series analysis: http://en.wikipedia.org/wiki/Time_series

Trend analysis (such as seasonality): http://www.statcan.gc.ca/pub/12-539-x/2009001/seasonal-saisonnal-eng.htm

Invoking specific versions of a service: http://help.adobe.com/en_US/livecycle/9.0/programLC/help/index.htm?content=0013 79.html

If the position of the element currently being processed (e.g. the 105th out of 200) is stored and the exact priority order can be reproduced, then when the system comes back up the batch can be re-scored as of a specific time and the first 104 messages discarded, so that the batch process continues from the 105th message onwards.
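
The recovery step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names Checkpoint, priority_order and resume are hypothetical, the score function is assumed to return the same value whenever it is given the same message and timestamp, and ties are assumed to be broken by original batch position so that the order is reproducible.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Checkpoint:
    """The only state kept locally (hypothetical record layout)."""
    scored_at: float  # timestamp of the original scoring call for the batch
    next_index: int   # 0-based position, in priority order, of the next message


def priority_order(
    messages: Sequence[object],
    score: Callable[[object, float], float],
    scored_at: float,
) -> List[object]:
    """Rebuild the priority order as it was at time `scored_at`.

    Highest score first; ties fall back to the original batch position
    so that the same inputs always give the same order.
    """
    indexed = list(enumerate(messages))
    indexed.sort(key=lambda pair: (-score(pair[1], scored_at), pair[0]))
    return [message for _, message in indexed]


def resume(
    messages: Sequence[object],
    score: Callable[[object, float], float],
    checkpoint: Checkpoint,
) -> List[object]:
    """Re-score the batch and drop the messages that were already processed."""
    ordered = priority_order(messages, score, checkpoint.scored_at)
    return ordered[checkpoint.next_index:]
```

With a batch of 200 messages and a checkpoint whose next_index is 104 (the 105th message), resume returns the remaining 96 messages in their original priority order.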

For this system to work, the scoring needs to be reproducible as of a known time. The timestamp at which the scoring call was made for the batch is stored, so that the analytics software can use the scoring mechanism/state which was in place at that exact moment in time. A lookup table between state and timestamp could be used to link the two, so the only addition to a score call would be a timestamp. Initially this would be the current timestamp, but if the batch had to be recreated it would be the timestamp stored when the scoring was originally requested.
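
One way such a lookup table might work is sketched below. It is an assumption-laden illustration: ScoringStateLookup, register, state_at and score_call are hypothetical names, and each scoring state is modelled simply as a function from a message to a score, effective from a given timestamp until the next registered state replaces it.

```python
import bisect
import time
from typing import Callable, List, Optional, Tuple

ScoringState = Callable[[object], float]  # assumed shape of a scoring model


class ScoringStateLookup:
    """Maps a timestamp to the scoring state that was in effect at that time."""

    def __init__(self) -> None:
        self._effective_from: List[float] = []  # kept sorted ascending
        self._states: List[ScoringState] = []

    def register(self, effective_from: float, state: ScoringState) -> None:
        """Record that `state` applies from `effective_from` onwards."""
        i = bisect.bisect_right(self._effective_from, effective_from)
        self._effective_from.insert(i, effective_from)
        self._states.insert(i, state)

    def state_at(self, timestamp: float) -> ScoringState:
        """Return the most recent state registered at or before `timestamp`."""
        i = bisect.bisect_right(self._effective_from, timestamp)
        if i == 0:
            raise LookupError("no scoring state was in effect at that time")
        return self._states[i - 1]


def score_call(
    message: object,
    lookup: ScoringStateLookup,
    timestamp: Optional[float] = None,
) -> Tuple[float, float]:
    """Score a message; return the score and the timestamp that was used.

    On the first call the timestamp defaults to the current time and should
    be stored by the caller. On recovery the stored timestamp is passed in,
    so the same scoring state is selected again.
    """
    if timestamp is None:
        timestamp = time.time()
    return lookup.state_at(timestamp)(message), timestamp
```

The caller stores the returned timestamp alongside the batch. Passing it back after a crash selects the scoring state that produced the original priority order, even if a newer state has been registered since.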

Advantages:

Only a small amount of data needs to be stored locally (the index of the element currently being processed and the timestamp of the initial scoring), and the back-end system will be tuned to handle the scoring. With traditional syncpointing the entire state of the transaction would need to be stored (the ordering of the messages and which messages had been processed, for example), which is expensive and would only ever be used in the case of a system crash, so it is rarely required.

No need to duplicate the batch message locally.

G...