
Method For Determining Where to Stop Reading In A Semi-sorted Stream of Data With Multiple Writers Writing At Different Intervals

IP.com Disclosure Number: IPCOM000191253D
Original Publication Date: 2009-Dec-23
Included in the Prior Art Database: 2009-Dec-23
Document File: 3 page(s) / 44K

Publishing Venue

IBM

Abstract

In a contiguous stream of data made up of blocks that contain individual time-stamped records, a reliable end point needs to be found to avoid reading past the data that is useful for a given query. The blocks in this stream may come from multiple sources, each of which buffers its records for some amount of time. This can leave the blocks out of order with respect to the records they hold. Hitting a block that contains records from after the user-defined end point is therefore not necessarily the point where reading should stop, because blocks from other systems further along the stream may still hold earlier records. Without further information, the stream must be read to its end to be sure that all data the query requires has been captured. This is problematic when a significant amount of data follows the user-specified end point, because much time is wasted reading it.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 73% of the total text.


     Each system that writes blocks of data to the stream must write out its buffered data at some regular interval x, where x is a number of minutes. This is the greatest amount of time a block can remain locally buffered before being written; blocks can be written at a rate faster than 1 block per x minutes. To write stagnant blocks out to the stream, the system collects records into a block and, every x/2 minutes, checks the first block in the chain and marks it with an indicator. When a block is filled, it is written to the stream and taken off the chain, and the indicator is reset. If the next x/2 check finds the same block still first in the chain (indicator set), then it is forcibly written to the stream. This leads to a stream which looks like the following:

Block #1
+----------------------+
| Image Name           |
| Write Time           |
|                      |
| Rec #1               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
| Rec #2               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
|                      |
| ...                  |
| ...                  |
|                      |
| Rec #i               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
|                      |
+----------------------+

....
....

Block #n
+----------------------+
| Image Name           |
| Write Time           |
|                      |
| Rec #1               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
| Rec #2               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
|                      |
| ...                  |
| ...                  |
|                      |
| Rec #k               |
| +- - - - - - - - - + |
| [ Write Time       ] |
| [                  ] |
| +- - - - - - - - - + |
|                      |
+----------------------+

     Each block carries a time stamp for when it was written to the stream, and each record in the block carries a time stamp for when it was logically written (buffered).
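     The block layout and the x/2 check on the writing side can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the names Record, Block, Writer, capacity and periodic_check are hypothetical, the chain is simplified to a single open block per image, a plain list stands in for the shared stream, and an empty block is assumed never to be marked or written.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    write_time: float          # when the record was logically written (buffered)
    payload: bytes = b""


@dataclass
class Block:
    image_name: str
    write_time: Optional[float] = None   # set when the block reaches the stream
    records: List[Record] = field(default_factory=list)
    marked: bool = False                 # indicator set by the x/2 check


class Writer:
    """One image writing blocks to a shared stream (hypothetical sketch)."""

    def __init__(self, image_name: str, stream: list, capacity: int):
        self.image_name = image_name
        self.stream = stream             # assumption: a list stands in for the stream
        self.capacity = capacity         # assumption: a block is "filled" at this many records
        self.current = Block(image_name)

    def _flush(self, now: float) -> None:
        self.current.write_time = now
        self.stream.append(self.current)
        self.current = Block(self.image_name)   # new block, indicator reset

    def add_record(self, now: float, payload: bytes = b"") -> None:
        self.current.records.append(Record(now, payload))
        if len(self.current.records) >= self.capacity:
            self._flush(now)             # a filled block is written immediately

    def periodic_check(self, now: float) -> None:
        """Intended to be called every x/2 minutes."""
        if not self.current.records:
            return                       # assumption: nothing buffered, nothing to do
        if self.current.marked:
            self._flush(now)             # same block seen twice: force it out
        else:
            self.current.marked = True   # first sighting: set the indicator
```

     Under these assumptions a record waits at most two checks before its block is forced out, so no record stays buffered for longer than x minutes, which matches the bound stated for the interval x.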

     When the stream is read, each block contains records from a single image participating in the stream. An end time is specified, after which records that are read should no longer be matched. As blocks are processed, the image from which each originated is added to a table. This table contains the image indicator and a flag showing whether a block past the requested end time has been reached for that image:

Image Indicator Done Reading

+-----------------------+--------------...
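
     The table kept while reading can be sketched in Python as follows. This is only an illustration under stated assumptions, since the remainder of the disclosure is not available here: the function name read_until_done and the tuple layout of a block are hypothetical, a block counts as "past the end point" when it holds a record time stamped after the end time, and reading is assumed to stop once every image seen so far is flagged. That stopping rule is an assumption, not a rule confirmed by the text above; in particular it would miss an image whose first block appears only after all other images have been flagged.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: a block is modelled as (image_name, block_write_time, record_times).
BlockT = Tuple[str, float, List[float]]


def read_until_done(blocks: Iterable[BlockT], end_time: float):
    """Read blocks in stream order, keeping an image -> done-reading table."""
    done: Dict[str, bool] = {}               # image indicator -> done reading
    matched: List[Tuple[str, float]] = []    # records at or before end_time
    blocks_read = 0

    for image, _block_write_time, record_times in blocks:
        blocks_read += 1
        done.setdefault(image, False)        # add the image to the table
        for t in record_times:
            if t <= end_time:
                matched.append((image, t))
            else:
                done[image] = True           # this image has passed the end point
        if all(done.values()):               # assumed stopping rule
            break

    return matched, done, blocks_read


stream = [
    ("A", 10.0, [1.0, 5.0, 9.0]),
    ("B", 12.0, [2.0, 11.0]),    # B passes end_time=10 here; A has not yet
    ("A", 20.0, [15.0, 19.0]),   # A passes the end point; both images are done
    ("B", 30.0, [25.0]),         # never read under the assumed rule
]
matched, done, blocks_read = read_until_done(stream, end_time=10.0)
print(blocks_read, done)
```

     In this example the second block already contains a record after the end time, but reading continues because image A is not yet flagged; it stops after the third block, without reading the rest of the stream.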