Collaborative spam classification with selective and controlled in-band notification
Original Publication Date: 2005-Sep-02
Included in the Prior Art Database: 2005-Sep-02
Spam e-mail is considered one of the biggest obstacles to productivity. It is widely expected that the volume of spam e-mail will soon surpass the volume of legitimate e-mail in circulation. A number of systems are in place to control spam. Some of these systems are server-based and filter spam using blacklists while allowing through the senders listed on global whitelists. Other approaches use different filtering mechanisms. Very few existing solutions work on the user's side; many of these allow the user to specify what is spam and what is not. Our solution to address these problems relies on a peer-to-peer selective controlled notification mechanism that monitors individual user classification (either done manually or through a local spam control mechanism) and selectively notifies other peers based on a trust level.
Collaborative spam classification with selective and controlled in-band notification
While the existing approaches have proven effective to a certain degree, they rely on centralized data collection and analysis mechanisms to classify spam and often lead to user frustration. Sometimes the productivity hit is even greater, because the user spends more time declassifying messages that were wrongly classified as spam. The so-called blacklists are global in nature and may end up causing more harm than good: what is spam to someone in the development division, for example, may not be spam to someone in marketing. While it is possible to maintain separate blacklists for different groups of users, maintaining all these lists is not viable, and the issue of scalability arises. If a hierarchical model for the blacklists is considered, the top-level filters slowly lose their effectiveness as more filters are added underneath them.
Our solution to address these problems relies on a peer-to-peer selective controlled notification mechanism that monitors individual user classification (either done manually or through a local spam control mechanism) and selectively notifies other peers based on a trust level that is derived through various heuristics, including:
- the number of legitimate e-mails exchanged between the peers,
- the similarities in the patterns of messages considered legitimate by the peers,
- the combined presence of the peers in legitimate in-house mailing lists and carbon copy (cc) lists, and
- the number of overrides made by a peer on actions taken by that peer's monitor due to the recommendations of the other peer.
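As a rough illustration of how the heuristics listed above could be combined into a single trust level, consider the following sketch. The disclosure does not specify a formula; the weighted sum, the weights, the saturation constants, and all names (PeerStats, trust_level, recommendations) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    """Observations one monitor keeps about one peer (all fields hypothetical)."""
    legit_exchanged: int       # legitimate e-mails exchanged with the peer
    pattern_similarity: float  # 0..1 similarity of messages both consider legitimate
    shared_lists: int          # in-house mailing lists and cc lists both peers appear on
    overrides: int             # user overrides of actions taken on this peer's advice
    recommendations: int       # recommendations received from this peer (assumed counter)


def trust_level(stats, weights=(0.3, 0.3, 0.2, 0.2),
                exchange_cap=50, lists_cap=5):
    """Combine the four heuristics into a score in [0, 1].

    The weights and caps are illustrative assumptions, not values from the source.
    """
    # Counts saturate at a cap so that one heavy correspondent cannot dominate.
    exchange = min(stats.legit_exchanged / exchange_cap, 1.0)
    lists = min(stats.shared_lists / lists_cap, 1.0)
    # Frequent overrides mean the peer's recommendations were unhelpful.
    if stats.recommendations:
        override_rate = min(stats.overrides / stats.recommendations, 1.0)
    else:
        override_rate = 0.0  # assumption: no history means no penalty
    w_exchange, w_similarity, w_lists, w_overrides = weights
    score = (w_exchange * exchange
             + w_similarity * stats.pattern_similarity
             + w_lists * lists
             + w_overrides * (1.0 - override_rate))
    return max(0.0, min(1.0, score))
```

Under these assumptions, a peer with many legitimate exchanges and few overrides scores close to 1, while a peer whose recommendations are always overridden loses the whole override term of the score.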
The notifications are performed using an in-band mechanism based on control e-mail messages, which avoids the need for a separate protocol between the monitors that watch the individual user mailboxes. The use of control e-mails for notifications alleviates a number of issues: since the sending users are all part of the organization (or, more generally, trusted users), these control e-mails will not be blocked by any filtering mechanism; there are no firewall issues to consider; there is no need for additional messaging infrastructure; and, finally, there are no reliability concerns. Control e-mails are consumed by the monitors as soon as they arrive and are, in almost all cases, invisible to the user: a monitor will read the control e-mail, process it (spam notification or confidence level update) and delete it immediately.
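The in-band mechanism described above might look roughly like the following sketch, which marks control e-mails with a custom header and has the monitor consume them before the user sees the inbox. The header name X-Spam-Monitor-Control, the JSON body, and the function names are hypothetical; the disclosure does not define a message format. Only the spam notification case is shown, not the confidence level update.

```python
import json
from email.message import EmailMessage

# Hypothetical header that marks a message as a monitor-to-monitor control e-mail.
CONTROL_HEADER = "X-Spam-Monitor-Control"


def make_spam_notification(sender, recipient, spam_message_id):
    """Build a control e-mail telling a peer's monitor that a message is spam."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "spam monitor control message"
    msg[CONTROL_HEADER] = "spam-notification"
    # Assumed payload format: a small JSON document in the message body.
    msg.set_content(json.dumps({"spam_message_id": spam_message_id}))
    return msg


def consume_control_messages(inbox, handle_notification):
    """Process and drop control e-mails; return only user-visible messages."""
    visible = []
    for msg in inbox:
        if str(msg.get(CONTROL_HEADER, "")) == "spam-notification":
            payload = json.loads(msg.get_content())
            handle_notification(str(msg["From"]), payload["spam_message_id"])
            # The control e-mail is not kept, so the user never sees it.
        else:
            visible.append(msg)
    return visible
```

A monitor would call consume_control_messages on newly arrived mail and pass a callback that applies the notification, for example only when the sender's trust level is high enough.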
[Figure: spam notification message, labeled Tsend = 10]
Intelligent propagation of spam notifications
For propagating a spam notification we use a methodology that aims to avoid a flood of spam notifications, which would not only choke the e-mail system but also destroy the effectiveness of the mechanism: if every notification ultimately reaches everyone, everything will end up as spam for everyone. Our mechanism also aims to keep the groupings distinct - a single user A who is part of a group might receive notificati...