Method and System for Autonomic Utilization of Optimal Methods for Message Filtering by Expressed Interest in Publish-Subscribe Multicast Environments
Original Publication Date: 2004-Jan-23
Included in the Prior Art Database: 2004-Jan-23
In multicast messaging systems, subscribers receive many messages to which they have not subscribed. There are various methods by which subscribers can quickly filter out these unwanted messages, including encoding either lossless or lossy information about the target subscribers of a particular message in the message header. The best choice of method depends on the circumstances. This disclosure describes an automatic method for using the most appropriate quick filtering technique.
In publish-subscribe messaging models where multicast is the transport of choice, clients receive messages from every multicast group that they join. Generally, only a subset of the total audience has an interest in a given message. Clients must therefore be able to determine quickly which messages interest them and which messages they should discard.
Rapid filtering is critical. Each client could potentially replicate the matching work already performed by the broker, so clients with minimal processing resources could easily become backlogged under heavy traffic. Moreover, the time a client needs to receive a message and deliver it to a consumer includes the time it needs to determine whether the message is relevant to it. Slow filtering can therefore result in message loss on the client if its buffers are full when messages arrive.
Two existing methods improve the matching process by breaking it down into two steps. One of these methods is lossless: it does not allow false positives in the matching process. The other is fuzzy (lossy): some false positives are allowed.
In the lossless case, the list of subscribers that should receive a message is encoded in the header of the message, for example as a list, as a bitmap, or as a compressed version of either. Where several subscribers exist on a single machine, a first pass scans the header to see whether there is any local interest in the message, and a second pass delivers the message to the appropriate local subscribers.
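The two-pass lossless check can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that subscribers are identified by small integer IDs and that the header carries an uncompressed bitmap; the function names encode_bitmap and deliver_lossless are hypothetical and do not come from the original methods.

```python
from typing import Iterable, List


def encode_bitmap(subscriber_ids: Iterable[int]) -> int:
    """Pack integer subscriber IDs into a bitmap (bit i set = subscriber i)."""
    bitmap = 0
    for sid in subscriber_ids:
        bitmap |= 1 << sid
    return bitmap


def deliver_lossless(header_bitmap: int, local_ids: List[int]) -> List[int]:
    """Return the local subscribers that should receive the message."""
    # First pass: a single AND against the mask of all local subscribers
    # answers "is there ANY local interest?" without per-subscriber work.
    if (header_bitmap & encode_bitmap(local_ids)) == 0:
        return []
    # Second pass: select exactly the targeted local subscribers.
    # The bitmap is exact, so no false positives are possible here.
    return [sid for sid in local_ids if (header_bitmap >> sid) & 1]


if __name__ == "__main__":
    header = encode_bitmap([2, 5, 9])          # broker side: targets of one message
    print(deliver_lossless(header, [1, 5, 9])) # machine hosting subscribers 1, 5, 9
    print(deliver_lossless(header, [0, 1, 3])) # machine with no interested subscriber
```

In this sketch the mask of local subscribers is rebuilt on every call for brevity; a real client would presumably compute it once and reuse it for each incoming message.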
In the lossy case (as described in UK patent application 0329188.7), a lossy compression of the target subscriber list is used that may permit false positives. This reduces header size but requires extra filtering at the recipient to eliminate the false positives.
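The trade-off in the lossy case can be sketched as follows. The encoding used in the cited patent application is not described here, so this example substitutes a Bloom-filter-style header as a stand-in: it is an assumption, not the patented method. The header size, the number of hash positions, and all function names are likewise hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Set

HEADER_BITS = 64   # assumed fixed header size, independent of audience size
NUM_HASHES = 3     # assumed number of bit positions set per subscriber


def _positions(subscriber_id: str) -> List[int]:
    """Derive NUM_HASHES deterministic bit positions from a subscriber ID."""
    digest = hashlib.sha256(subscriber_id.encode("utf-8")).digest()
    return [
        int.from_bytes(digest[4 * i:4 * i + 4], "big") % HEADER_BITS
        for i in range(NUM_HASHES)
    ]


def encode_lossy(targets: Iterable[str]) -> int:
    """Broker side: fold all target subscribers into one fixed-size header."""
    header = 0
    for sid in targets:
        for pos in _positions(sid):
            header |= 1 << pos
    return header


def might_be_target(header: int, subscriber_id: str) -> bool:
    """Quick check: never misses a real target, but may report a false positive."""
    return all((header >> pos) & 1 for pos in _positions(subscriber_id))


def deliver_lossy(header: int, topic: str,
                  local_subscriptions: Dict[str, Set[str]]) -> List[str]:
    """Recipient side: quick lossy check, then exact filtering."""
    candidates = [s for s in local_subscriptions if might_be_target(header, s)]
    # Extra filtering step: match against the real subscriptions so that
    # false positives from the lossy header are eliminated.
    return [s for s in candidates if topic in local_subscriptions[s]]
```

The header stays at HEADER_BITS bits however many subscribers are targeted, which is the size saving; the cost is the exact match in deliver_lossy, which runs only for subscribers that pass the quick check.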
For environments with many unique subscriptions (or any other type of record of expressed interest), the lossless method is not as effective as the fuzzy method. In smaller environments, however, the lossless method prevents any false positives. Thus, our novel solution is to switch adaptively between the two based on the number of unique records of expressed interest.
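The adaptive switch can be sketched in a few lines of Python. The disclosure gives no concrete cut-over point, so the threshold value below is a placeholder assumption, and the names FilterMethod and choose_method are hypothetical.

```python
from enum import Enum
from typing import Hashable, Iterable


class FilterMethod(Enum):
    LOSSLESS = "lossless"
    LOSSY = "lossy"


# Placeholder value: the source does not specify where the switch occurs.
UNIQUE_INTEREST_THRESHOLD = 256


def choose_method(interest_records: Iterable[Hashable],
                  threshold: int = UNIQUE_INTEREST_THRESHOLD) -> FilterMethod:
    """Pick a filtering method from the number of unique interest records."""
    unique_count = len(set(interest_records))
    # Small environments: the exact encoding avoids any false positives.
    if unique_count <= threshold:
        return FilterMethod.LOSSLESS
    # Large environments: fall back to the lossy (fuzzy) encoding.
    return FilterMethod.LOSSY
```

A broker could call choose_method whenever its subscription table changes, or per publication, and encode message headers accordingly; which of these is appropriate is a design choice this sketch leaves open.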
Our solution decides at publication time how best to deliver a message, instead of always following the same routine. This decision is based on (1) whether the broker knows the interests of its audience members (e.g., whether or not the broker is maintaining current subscriptions requested by the audience), (2) how large the interested set of members is, and (3) the granularity of the topic hierarchy. We discuss this in terms of a topic hierarchy, but it applies to any form of organized information aggregate. The granularity can be determined by sparse or dense topic hierarchies; i...