
Method of building a dynamic, learning email spam filter through heuristics
Disclosure Number: IPCOM000014171D
Original Publication Date: 2002-Mar-28
Included in the Prior Art Database: 2003-Jun-19

Publishing Venue



Disclosed is a method of building a dynamic, learning email spam filter through heuristics. The proposed filter learns the user's behavior and uses this information heuristically. It deduces what the user considers spam so that such email can be deleted automatically in the future.

One of the annoying tasks for email users is wading through the daily accumulation of spam in their inbox. Spam is defined as any sort of bothersome or unsolicited email. Some solutions currently exist to help mitigate the situation, but these partial solutions are typically hardcoded email filters: when a message arrives from a certain sender or with a certain subject, and portions of it meet predetermined criteria, it is deleted automatically. The problem is that the people sending spam know that filters exist to kill their mail. Many spammers therefore move from account to account, hide the source of the message, or change the subject heading to fool the filters.

The filter proposed in this disclosure learns what the user would consider spam and uses this information heuristically to deduce which mail to delete in the future. When a piece of spam is received, the user can use a special delete called "SPAM DELETE". When this occurs, not only is the mail thrown away (typically unopened), but the filter also notes the following: the address of the sender, keywords in the subject, and keywords in the body of the note. In effect, the learning filter reads the email for the user. As the filter receives more and more email to be learned and then deleted, it builds a heuristic database of mail considered to be spam. At a time determined by the user or the program, the filter becomes active.
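The learning step triggered by "SPAM DELETE" might be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not say how keywords are extracted or how the heuristic database is organized, so the `SpamDatabase` class, the `keywords` helper, the stop-word list, and the count-summing `score` method are all hypothetical choices, not the disclosed method.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the disclosure does not say how keywords are chosen.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "for", "you", "your", "is"}


def keywords(text):
    """Lowercase the text and return its distinct words, minus stop words."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOP_WORDS}


class SpamDatabase:
    """Heuristic database built up from messages removed with SPAM DELETE."""

    def __init__(self):
        self.senders = Counter()
        self.subject_words = Counter()
        self.body_words = Counter()

    def spam_delete(self, sender, subject, body):
        """Record the three items the disclosure lists: sender, subject keywords, body keywords."""
        self.senders[sender.lower()] += 1
        self.subject_words.update(keywords(subject))
        self.body_words.update(keywords(body))

    def score(self, sender, subject, body):
        """Assumed heuristic: sum how often each feature was seen in deleted spam."""
        total = self.senders[sender.lower()]
        total += sum(self.subject_words[w] for w in keywords(subject))
        total += sum(self.body_words[w] for w in keywords(body))
        return total


db = SpamDatabase()
db.spam_delete("promo@example.com", "Win a free cruise", "Claim your free cruise today")
db.spam_delete("deals@example.net", "Free prize inside", "Claim the prize now")

print(db.score("promo@example.com", "Free cruise offer", "Claim it now"))                # 7
print(db.score("friend@example.org", "Lunch on Friday?", "Are we still on for noon?"))  # 0
```

Counting each keyword once per deleted message keeps a single repetitive message from dominating the database; that is a design choice of this sketch, and a real filter could weight features quite differently.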
At that point, it starts automatically intercepting email that could be spam and either deletes it or puts it into a spam container (an email folder). At the end of the week, the user goes through the container and "grades" the filter. The user tells the filter about any pieces of legitimate email; the filter resends them to the user rather than filtering them out and, most importantly, adjusts its heuristic database to learn NOT to delete mail with that type of information.

The specifics of the heuristics are not spelled out in this disclosure, but they could be a weighting of phrases or keywords such that a certain total weight tips the scales towards interception and deletion, while another level of weight causes the email to be shunted to an intermediate folder. The key to this patent is not the heuristic itself, but the use of a learning system in email filtering.

This invention frees the user from having to use hardcoded filters that often delete valid email. Spammers love to use free email services, but so do friends. Thus, the current price of accepting email from friends is also accepting spam from any free email service. With a learning spam filter, the rigid use of a hardcoded filter is removed. Additionally, there is no final deletion, as the filter has the time and flexibility to let the user retrieve incorrectly classified mail from the