Lossy message queueing
Publication Date: 2016-May-12
The IP.com Prior Art Database
A means of distributing messages efficiently to messaging endpoints using fuzzy hashcode caching
Messaging systems are typically used to transfer data across heterogeneous networks and operating systems, guaranteeing that data reaches its destination intact and possibly converted into a format which the destination system understands.
Many techniques have been invented to improve the throughput and reduce the latency of messaging systems, but in general these techniques assume that it is imperative for data to reach its destination completely or not at all.
Some of these techniques combine the communications handshaking process with the message data itself, so that the sending application does not have to wait for a fully set-up communications channel before data transfer can commence. Other techniques compress the data at the sending side of the application before transmitting the compressed data. In every case, however, the result is that exactly the same data reaches the destination as was transmitted by the sender.
A lossy message queue (or topic) is described that allows for the caching and transmission of fuzzy hash codes instead of the messages themselves, to increase message throughput.
Two techniques are combined to implement the invention: a fuzzy hashing algorithm employed at both ends of the messaging system (i.e. at the sender and the receiver), and a hash code and message cache at both ends of the system.
The invention works in the following way:
- When it has a new message to send, the sender decides whether the message is similar to (i.e. has the same fuzzy hash code as) a message that has previously been sent.
- In the case that the cache contains a fuzzy-hash hit, the sender transmits only the hash code to the destination, which retrieves its cached copy of the previous message with the same hash code.
- If the hash code was sent rather than the full message, the sender now discards the full message.
- If the full message had to be sent because of a cache miss, the message and its fuzzy hash code are added to the cache.
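The steps above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the names `fuzzy_hash`, `Sender`, `Receiver` and the `bucket` parameter are hypothetical, and the toy hash (which snaps every number in a text message to a coarse bucket before hashing) merely stands in for whatever fuzzy hashing algorithm a real system would use. The network is replaced by direct method calls.

```python
import hashlib
import re


def fuzzy_hash(message: str, bucket: float = 1.0) -> str:
    """Toy fuzzy hash (an assumption for illustration only): every number in
    the message is replaced by its bucket index before hashing, so messages
    whose numbers fall in the same buckets share a hash code."""
    def snap(match: re.Match) -> str:
        return str(int(float(match.group()) // bucket))

    coarse = re.sub(r"-?\d+(?:\.\d+)?", snap, message)
    return hashlib.sha1(coarse.encode("utf-8")).hexdigest()


class Receiver:
    def __init__(self) -> None:
        self.cache: dict[str, str] = {}  # hash code -> cached message

    def on_full(self, code: str, message: str) -> str:
        # Cache miss at the sender: store the full message under its code.
        self.cache[code] = message
        return message

    def on_hash(self, code: str) -> str:
        # Cache hit at the sender: only the code arrived, so deliver the
        # previously cached message that has the same code.
        return self.cache[code]


class Sender:
    def __init__(self, receiver: Receiver, bucket: float = 1.0) -> None:
        self.receiver = receiver
        self.bucket = bucket
        self.cache: dict[str, str] = {}  # hash code -> message already sent
        self.hash_only_sends = 0

    def send(self, message: str) -> str:
        """Send a message and return what the receiver delivers."""
        code = fuzzy_hash(message, self.bucket)
        if code in self.cache:
            # Hit: transmit only the hash code; the full message is discarded.
            self.hash_only_sends += 1
            return self.receiver.on_hash(code)
        # Miss: transmit the full message and cache it with its hash code.
        self.cache[code] = message
        return self.receiver.on_full(code, message)


sender = Sender(Receiver(), bucket=1.0)
for text in ("temp=21.3", "temp=21.7", "temp=22.1"):
    print(text, "->", sender.send(text))
```

In this sketch the second message shares a hash code with the first, so the receiver delivers the cached first message in its place; that substitution is the "loss" in the lossy queue.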
The resulting system offers a trade-off between message throughput and message accuracy. If the fuzzy hash code is "very fuzzy", such that it creates the same hash code for wildly different data, much of the data in transmitted messages will be lost. If the fuzzy hash code is not "fuzzy enough", hits in the cache at the sender will rarely or never happen, and the full message will be sent in the majority of cases, yielding no benefit in throughput.
The required balance between these two extremes can be chosen based on configuration of the hashing algorithm.
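The effect of that configuration can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that messages are single numeric readings and that the fuzzy hash code is simply the index of the bucket a reading falls into; the function `simulate`, the `bucket` width and the sample readings are all hypothetical. It reports the fraction of messages sent as a hash code only (the throughput gain) and the mean absolute difference between what was sent and what was delivered (the accuracy cost).

```python
def simulate(readings: list[float], bucket: float) -> tuple[float, float]:
    """Return (hit_rate, mean_abs_error) for a given bucket width.

    The fuzzy hash code of a reading is its bucket index; a wider bucket
    means a "fuzzier" hash. This is a toy model, not a real fuzzy hash.
    """
    cache: dict[int, float] = {}  # hash code -> first reading sent in full
    hits = 0
    total_error = 0.0
    for value in readings:
        code = int(value // bucket)
        if code in cache:
            hits += 1            # only the hash code would be transmitted
        else:
            cache[code] = value  # full message transmitted and cached
        delivered = cache[code]  # what the receiver hands to the application
        total_error += abs(delivered - value)
    return hits / len(readings), total_error / len(readings)


readings = [20.0, 20.4, 21.1, 21.9, 25.0, 25.2, 30.7, 20.1]
for bucket in (0.05, 1.0, 100.0):
    hit_rate, error = simulate(readings, bucket)
    print(f"bucket={bucket}: hit rate {hit_rate:.2f}, mean error {error:.2f}")
```

With a very narrow bucket no reading ever matches a cached one, so every message is sent in full and nothing is gained; with a very wide bucket almost every message is replaced by the first one sent, so throughput is high but most of the data is lost.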
For a lot of data this technique is not useful. It causes data to effectively be lost in the messaging system, and is therefore not appropriate for anything that requires data to be t...