Method and System for Restricting User Input in User Generated Content (UGC) Environment

IP.com Disclosure Number: IPCOM000201723D
Publication Date: 2010-Nov-19
Document File: 3 page(s) / 57K

Publishing Venue

The IP.com Prior Art Database

Related People

Dharmendra Adsule: INVENTOR [+2]

Abstract

A method and system are provided to restrict user input in a user generated content (UGC) environment to a predefined set of whitelisted words. The predefined set of whitelisted words is continually evolved using stemming, spell checkers, and a machine learning algorithm.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Description

Disclosed is a method and system for restricting user input in a user generated content (UGC) environment to a predefined set of whitelisted words.

In one instance, the user input is received and a lookup is performed by first stemming each word in the input with a stemming algorithm, for example the Porter stemmer.  Stemming reduces the number of entries in the dictionary of stemmed words and improves matching accuracy: words such as "playing" and "played" are both stemmed to "play", which simplifies lookup and improves efficiency.  The stemmed words are then compared against the predefined set of whitelisted words, and the stemmed words that appear in the set may be identified as clean words.
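
The stem-then-lookup step described above can be illustrated with a minimal sketch. It makes several assumptions that are not part of the disclosure: the whitelist contents, the names `WHITELIST`, `simple_stem`, and `classify`, the tokenization rule, and the decision to collect non-matching words in a separate list are all hypothetical. The toy suffix stripper is only a stand-in for a real Porter stemmer and does not implement that algorithm; a real stemmer could be passed in through the `stem` parameter.

```python
import re

# Hypothetical whitelist of already-stemmed words (illustrative only).
WHITELIST = {"play", "game", "good", "the", "was"}


def simple_stem(word: str) -> str:
    """Toy suffix stripper standing in for a real Porter stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Strip only when at least three characters of stem remain,
        # so short words such as "was" are left unchanged.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def classify(text: str, whitelist=WHITELIST, stem=simple_stem):
    """Split the input words into (clean, rejected) lists.

    A word is clean when its stemmed form appears in the whitelist.
    What to do with rejected words is an assumption of this sketch.
    """
    clean, rejected = [], []
    for token in re.findall(r"[a-z']+", text.lower()):
        if stem(token) in whitelist:
            clean.append(token)
        else:
            rejected.append(token)
    return clean, rejected


if __name__ == "__main__":
    print(classify("Playing games was good"))
    print(classify("played xyzzy"))
```

Because the whitelist stores stems rather than every inflected form, "playing" and "played" both resolve to the single entry "play", which is the dictionary-size reduction the text describes.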

In other instances, machine learning algorithms and spell checkers use the words in the user input to enrich the dictionary of stemmed words.  The machine learning algorithm is used to continually evolve the predefined set of whitelisted words.  Also, a spell checker, for example GNU Aspell, may provide a list of "suggested words".  The stemmed w...