
Use of single placeholder for multiple messages in large file transfer
Disclosure Number: IPCOM000011833D
Original Publication Date: 2003-Mar-19
Included in the Prior Art Database: 2003-Mar-19
Document File: 2 page(s) / 47K

Publishing Venue



Sending large volumes of data that is already held persistently over a reliable transport involves two problems. (1) For efficiency, the messages must not be re-stored persistently by the transport (already handled by MQ 'reference messages'). (2) The data must be segmented, to prevent huge resend costs on failure (already handled by MQ message segmentation). However, message segmentation must often be defined in advance by the application code; and even where reference segments are used, control data about each segment must be saved persistently. This can be expensive, and can cause slow startup times, where there are many segments. It is a particular problem when the user quickly realizes the operation was in error and wishes to abort. A lazy segmentation technique is described, in which the segment control data is not realized all at once, but on demand. The system stores a single block of control information that implies a set of reference control blocks; these segment reference blocks, and thus the complete message segments, are lazily realized as transmission occurs.




Current status:

It is sometimes required to move large volumes of data (eg large files) reliably and recoverably over a transmission protocol such as HTTPR or MQ channels. These protocols rely on persistent state at each end to assure once-and-once-only delivery, even in the case of transmission or endpoint failures.

     Typically, the data is also staged in the persistent store used by the protocol. This can be avoided using 'reference' messages, where the protocol store holds status information about messages (eg header information) and references to the data, but NOT the data itself. The data is held persistently elsewhere in some external persistent store, for example in the sending file store and in the receiving file store. The full message is assembled by the protocol from the protocol store and the external store as the message is transmitted.
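The reference-message idea above can be sketched as follows. This is a minimal illustration, not the MQ implementation; the class and field names are hypothetical, chosen only to show that the protocol store keeps header and reference information while the payload stays in the external (file) store until send time.

```python
from dataclasses import dataclass

@dataclass
class ReferenceMessage:
    """Hypothetical sketch of a 'reference' message: status/header data
    plus a reference to the payload, but NOT the payload itself."""
    header: dict        # status information held in the protocol store
    file_path: str      # where the data lives in the external store
    offset: int         # start of the referenced data within the file
    length: int         # number of bytes referenced

    def assemble(self) -> bytes:
        """Assemble the full message from the external store at send time."""
        with open(self.file_path, "rb") as f:
            f.seek(self.offset)
            return f.read(self.length)
```

Only the small `ReferenceMessage` record is persisted by the protocol; `assemble` pulls the actual bytes from the sender's file store just as the message goes onto the wire.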


     Where a very large file is to be transmitted, it must be sent as a series of messages. Creating and storing even the reference messages (each reference message containing a reference to a subpart of the file) can be very expensive in time and space.


     This disclosure does not create the set of reference messages all at one time. Instead, it creates a single token message that stands in for a set of reference messages, and stores this token message in the protocol store. When the token is accessed, it creates a small subset of the implied reference messages and saves them back into the store, followed by a new token message that stands for the remainder of the reference messages.
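The token-expansion step described above can be sketched as follows. This is an illustrative sketch under assumed names (`TokenMessage`, `expand_token`, the batch size of 3 are all hypothetical); it shows only the core idea that accessing the token realizes a small batch of reference messages plus a new token standing for the remainder.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TokenMessage:
    """Hypothetical token standing in for a set of reference messages."""
    file_path: str
    next_offset: int    # first byte not yet covered by a reference message
    file_size: int
    segment_size: int

def expand_token(token: TokenMessage, batch: int = 3):
    """Realize up to `batch` reference messages, plus a new token
    standing for the remaining (still-implied) reference messages."""
    refs = []
    offset = token.next_offset
    while len(refs) < batch and offset < token.file_size:
        length = min(token.segment_size, token.file_size - offset)
        refs.append((token.file_path, offset, length))  # one reference message
        offset += length
    remainder: Optional[TokenMessage] = None
    if offset < token.file_size:
        remainder = TokenMessage(token.file_path, offset,
                                 token.file_size, token.segment_size)
    return refs, remainder
```

Each call writes only a few reference messages back into the protocol store; when `remainder` is `None`, the whole file has been covered.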


     With this solution, the store never holds (on behalf of a single large file) more than a small number of reference messages, plus one token message. This reduces the store size.

     Similarly, it is not necessary to write all the reference messages at a single time; instead they are written at different times during the overall transmission process. This does not save overall CPU time, but (a) it spreads the CPU/IO cost more evenly, (b) it makes it easier to parallelise the writing of reference messages with the transmission, and (c) if a transmission is aborted (eg by the user), it reduces the wasted CPU time.


     Suppose one wishes to send a file of 1,000 bytes, to be sent as ten 100-byte segment messages (more realistic...
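Using the numbers from this example (a 1,000-byte file sent as ten 100-byte segments), a small self-contained simulation can illustrate the claimed store behaviour. The batch size of 3 and the queue discipline here are assumptions for illustration only; the point is that the store never holds more than a small batch of reference messages plus one token, regardless of file size.

```python
FILE_SIZE, SEGMENT, BATCH = 1000, 100, 3   # numbers from the text; BATCH is assumed

def simulate():
    """Simulate the protocol store, returning (segments sent, peak store size)."""
    store = [("token", 0)]      # initially: one token covering the whole file
    peak = sent = 0
    while store:
        kind, offset = store.pop(0)
        if kind == "token":
            # realize a small batch of reference messages on demand
            n = 0
            while offset < FILE_SIZE and n < BATCH:
                store.append(("ref", offset))
                offset += SEGMENT
                n += 1
            if offset < FILE_SIZE:
                store.append(("token", offset))   # token for the remainder
        else:
            sent += 1           # transmit this reference segment
        peak = max(peak, len(store))
    return sent, peak
```

Running this, all 10 segments are sent while the store peaks at 4 entries (3 reference messages plus 1 token), rather than holding all 10 reference messages at once.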