Use of taxonomy in improving search results

IP.com Disclosure Number: IPCOM000015324D
Original Publication Date: 2001-Nov-10
Included in the Prior Art Database: 2003-Jun-20
Document File: 3 page(s) / 59K

Publishing Venue

IBM

Abstract

Taxonomies have been used in many applications for browsing or searching a collection. During browsing, the user is presented with a set of categories from the taxonomy (or taxonomies), selects one or more of them, and continues exploring the data space in that fashion. This approach lends itself to exploring both flat and hierarchical taxonomies. Similarly, taxonomies can be exploited in a search scenario. The user specifies a search, the user or the system specifies a set of categories within which the search is to be performed, and then the search is executed. The scenario just described has a few issues. In general, users do not want to have to specify a category, partly due to inertia and partly because they do not know what to specify. Many users want to use search to explore the data, but at the same time are frustrated when the results do not satisfy their needs.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.


Taxonomies have been used in many applications for browsing or searching a collection. During browsing, the user is presented with a set of categories from the taxonomy (or taxonomies), selects one or more of them, and continues exploring the data space in that fashion. This approach lends itself to exploring both flat and hierarchical taxonomies.

Similarly, taxonomies can be exploited in a search scenario. The user specifies a search, the user or the system specifies a set of categories within which the search is to be performed, and then the search is executed.

The scenario just described has a few issues. In general, users do not want to have to specify a category, partly due to inertia and partly because they do not know what to specify. Many users want to use search to explore the data, but at the same time are frustrated when the results do not satisfy their needs.

Before presenting our new approach, we describe a well-known technique called relevance feedback. Roughly speaking, a search is performed based on a user query and a hit list is returned. The system then examines the hit list, determines its most important words (or phrases), adds these words to the original query, and reissues the query. Many studies have been done in this area, and different approaches use different algorithms to pick the "important words" from the hit list. In general, this approach seems to return more relevant results than a single-pass approach.
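The two-pass flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the stopword list, the term-count scoring, and the choice of "most frequent words in the hit list" as the rule for picking important words. Real systems use a variety of term-selection algorithms.

```python
from collections import Counter

# Hypothetical toy corpus; ids and texts are illustrative only.
DOCS = {
    "d1": "jaguar speed top speed of the jaguar cat in the wild",
    "d2": "jaguar cat habitat rainforest wild cat",
    "d3": "jaguar car dealer price list",
    "d4": "python snake habitat rainforest",
}

STOPWORDS = {"the", "of", "in", "a", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def search(query_terms, docs, top_k=2):
    """Rank documents by total occurrences of the query terms (toy scoring)."""
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def expand_query(query_terms, hit_ids, docs, n_terms=2):
    """Append the n_terms most frequent hit-list words not already in the query."""
    counts = Counter()
    for doc_id in hit_ids:
        counts.update(t for t in tokenize(docs[doc_id]) if t not in query_terms)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:n_terms]]


def two_pass_search(query_terms, docs):
    """Pass 1: search. Then expand the query from the hit list. Pass 2: search again."""
    first_hits = search(query_terms, docs)
    expanded = expand_query(query_terms, first_hits, docs)
    return expanded, search(expanded, docs)


if __name__ == "__main__":
    expanded, hits = two_pass_search(["jaguar", "wild"], DOCS)
    print(expanded, hits)
```

Note that the query is executed twice, which is the cost that a two-pass approach pays for the improved relevance.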

We propose a new approach that uses taxonomies to obtain more relevant search results in less time than conventional two-pass approaches.

First, there are some basic observations:

1) Given a user query, if it is augmented with an appropriate set of words or phrases, the search results are improved.

2) A search restricted to a particular category generally gives more relevant results, provided that this category can be either automatically deduced or specified by the user.
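Both observations can be seen on a small example. The sketch below is only illustrative: the documents, their category labels, and the scoring rule (number of distinct query terms matched) are all assumptions, not part of the disclosure.

```python
# Hypothetical documents, each already labelled with one taxonomy category.
DOCS = [
    {"id": "d1", "category": "animals", "text": "jaguar habitat in the rainforest"},
    {"id": "d2", "category": "cars", "text": "jaguar dealer price list"},
    {"id": "d3", "category": "animals", "text": "jaguar hunting behaviour"},
    {"id": "d4", "category": "cars", "text": "sedan dealer price list"},
]


def search(query_terms, docs, category=None):
    """Rank documents by the number of distinct query terms they contain.

    If category is given, only documents in that category are considered.
    """
    terms = set(query_terms)
    scored = []
    for doc in docs:
        if category is not None and doc["category"] != category:
            continue
        matched = len(terms & set(doc["text"].lower().split()))
        if matched:
            # Negated count so that a plain sort puts the best matches first.
            scored.append((-matched, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored)]


if __name__ == "__main__":
    # Plain query: animal and car documents are mixed together.
    print(search(["jaguar"], DOCS))
    # Observation 1: extra words push the intended documents to the top.
    print(search(["jaguar", "rainforest", "hunting"], DOCS))
    # Observation 2: restricting to a category removes the unrelated documents.
    print(search(["jaguar"], DOCS, category="animals"))
```

In this toy setting the augmented query reorders the hit list, while the category restriction filters it. Either way, the documents about the animal are favoured over those about the car.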

Our novel observation is to combine these two approaches by finding the appropriate additional words to augment the query using a taxonomy.

Given a dataset and a taxonomy, a category can be assigned to each document. (Note that an artificial "NO_CATEGORY" category can be added to assign a category...