Cooperative e-mail classification based on selective in-band notification within pattern based contextized groups
Original Publication Date: 2005-Oct-06
Included in the Prior Art Database: 2005-Oct-06
We have made two major additions to the original idea to address its implementation and to enhance its practical value. In essence, this disclosure describes a system for collaborative e-mail classification; spam classification is contained within the generic e-mail ranking function of this system.
Spam e-mail is considered one of the biggest hurdles to productivity. As is well known, the amount of spam e-mail will soon surpass the amount of legitimate e-mail in circulation. According to a Ferris Research white paper, such unsolicited commercial e-mail makes up 30% of all e-mail exchanged today. The same paper points out that the mobile messaging market is becoming a breeding ground for spam, and notes the amount of time, effort and money corporations spend on maintaining their spam defenses.
A number of systems are in place to control spam. Some of these systems are server-based and filter spam using blacklists while allowing mail from sources listed on global whitelists. Other approaches use different filtering mechanisms. Very few existing solutions work on the user's side; many of these allow the user to specify what is spam and what is not.
Here are some issues with existing mechanisms to control and remove spam. Existing approaches are centralized and classify spam at a global level without regard to individual users' perceptions and analyses. When user perceptions are taken into account, they take effect globally. A centralized system (particularly one that takes user perceptions into account) poses scalability problems, and there is a need to automate spam classification by allowing the system of users to perform the process implicitly and collaboratively. The problem is worse in mobile messaging markets, where intermittent connectivity means that spam updates are not guaranteed to reach all users and users may not be able to update the centralized blacklists in time. In existing centralized approaches, the time to update the blacklists or filtering knowledge exceeds the network latency alone, because time is also required to determine the information that will then be pushed to the centralized classifiers.
Centralized approaches that allow for user feedback have a built-in latency while they wait for more indications of the same problem, along with an inherent problem of starvation: a single user notification may never be useful until more arrive, a situation that worsens with the size of the organization because most thresholds are relative. High effectiveness leads to higher false positive rates. Corporate organizations will accept only a false positive rate between 0.001% and 0.01% (Ferris Research White Paper). False positives may become more of a problem if there is no agreement among e-mail users in an enterprise on what is spam and what is not. Finally, existing spam mechanisms use different approaches (genetic signatures, word-based filtering, rule-based filtering, etc.) and do not interoperate, requiring deployment of one or more mechanisms on an enterprise-wide basis, restricting user choice and reducing the effectiveness t...