Browse Prior Art Database

Method and System for Detecting Redundant Suggestions in a Search Result Webpage

IP.com Disclosure Number: IPCOM000201722D
Publication Date: 2010-Nov-19
Document File: 4 page(s) / 41K

Publishing Venue

The IP.com Prior Art Database

Related People

Umut Ozertem: INVENTOR [+2]

Abstract

A method and system is provided to detect redundant suggestions in a search result webpage. The detected redundant suggestions are then eliminated using a greedy algorithm.

This text was extracted from a Microsoft Word document.
This is the abbreviated version, containing approximately 52% of the total text.

Method and System for Detecting Redundant Suggestions in a Search Result Webpage

Abstract

A method and system is provided to detect redundant suggestions in a search result webpage.  The detected redundant suggestions are then eliminated using a greedy algorithm.

Description

Disclosed is a method and system for detecting redundant suggestions in a search result webpage.  A greedy algorithm is designed for eliminating the redundant suggestions.

The method and system may be implemented in pre-submit mode, post-submit mode, and for suggesting related concepts.  In the pre-submit mode and the post-submit mode, suggestions are ranked based on query frequency.  In order to suggest related concepts, query reformulation frequency is used to rank suggestions.  For example, if a user types "netf", then the first suggestion may be "netflix".  On suggesting the query "netflix", there is less significance in suggesting "netflix rentals", "netflix movies" etc., which may be replaced with more useful suggestions.

The method and system builds a query-URL click graph using click data in search logs for a predefined period of time.  For example, the query-URL click graph may be built using click data in search logs for the last 6 months.  The edges in the query-URL click graph are the clickthrough values corresponding to queries.  In an instance, for a particular URL "u" and a query "q" we have,

In the above equation, c(u) is a binary random variable, denoting whether the URL "u" is clicked or not, and are the number of times "u" is displayed and clicked for query "q", respectively.

Similarly, a query-URL view graph is built using view data in search logs for a predefined period of time.  For example, the query-URL view graph may be built using search logs for the last 6 months.

The method and system uses the Discounted Cumulative Gain (DCG) as the information retrieval metric, given by

In the above equation, "r" represents the rank of the results, and the typical values for "N" are 1,3 and 5.  Note that the "DCG" formula can be considered as the inner product of two vectors, the gain (relevance) vector "rel" and the discount vector

where the relevance vector represents the gain, and the discount vector represents the examination probabilities.  In the query-URL view graph, the edges are discount values that "DCG" uses.  However, since for a given period of time, a URL might have been returned in more than a single position, we use the average rank discount given by

where "K" is the number of times the URL "u" is returned for the query "q" in the session logs.

The query-URL click graph and the query-URL view graph are then used to define a utility model for each suggestion so that the added value of each suggestion depends on the suggestions that are presented earlier.  The utility model is used for detecting redundant suggestions in the s...