Scoring terms in a question

IP.com Disclosure Number: IPCOM000013933D
Original Publication Date: 2000-Mar-01
Included in the Prior Art Database: 2003-Jun-19
Document File: 1 page(s) / 40K

Publishing Venue

IBM

Abstract

In many cases, a question posed in English contains a significant word which implies what type of answer is desired.

This is the abbreviated version, containing approximately 54% of the total text.

In many cases, a question posed in English contains a significant word which implies what type of answer is desired.

For instance, the question "Where is the restaurant Lutece located?" implies that the desired answer is an address or location of the restaurant. For some questions, however, the leading word admits an open-ended list of possible answer types. An example of such a question is "What store carries the blue dress?". The problem is to determine which terms in the question carry more significance than others. Given such a measure, a bag-of-words query against a corpus, weighted by the significance of each term, would return as its highest-ranking results the documents that contain the important terms, which improves the likelihood that they contain the correct answer. In this disclosure we suggest a significance weighting for the terms of a question. In particular, we assert that the first term (not including stop words) after the query word (i.e., WHAT) has more significance than the others.
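The weighting rule asserted above can be illustrated with a minimal sketch. The stop-word list, the set of query words, the function name weight_question_terms, and the numeric weights (2.0 for the boosted term, 1.0 for the rest) are all assumptions made for illustration; the disclosure states only that the first non-stop term after the query word is more significant, not by how much.

```python
import re

# Hypothetical word lists, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "to", "do", "does"}
QUERY_WORDS = {"what", "which", "where", "who", "when", "why", "how"}


def weight_question_terms(question, boost=2.0, base=1.0):
    """Return {term: weight} for the non-stop terms of a question.

    The first non-stop term after the query word gets `boost`;
    every other non-stop term gets `base`. Both values are assumed.
    """
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    weights = {}
    seen_query_word = False
    boosted = False
    for tok in tokens:
        if tok in QUERY_WORDS and not seen_query_word:
            seen_query_word = True  # the query word itself carries no weight
            continue
        if tok in STOP_WORDS:
            continue
        if seen_query_word and not boosted:
            weights[tok] = boost  # first significant term after the query word
            boosted = True
        else:
            weights.setdefault(tok, base)
    return weights


if __name__ == "__main__":
    print(weight_question_terms("What store carries the blue dress?"))
```

For the example question, this sketch gives "store" the boosted weight and leaves "carries", "blue", and "dress" at the base weight.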

The traditional approach to weighting terms in an information retrieval (IR) search is to use a tf*idf function or a variant thereof. The tf factor measures the number of occurrences of a query term in a document, and the idf factor measures how few documents contain a mention of the term. These factors influence how relevant the document is to the query, but do not address at all how intrinsically important the various query terms are to the meaning of (and hence answer to) the q...
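As a rough sketch of the tf*idf scoring described in the paragraph above, the following assumes one common variant, idf = log(N / df), where N is the number of documents and df is the number of documents containing the term. The function name tf_idf_score and the optional term_weights argument are hypothetical; multiplying each term's contribution by a per-term significance weight is only one possible way to combine the two ideas and is not taken from the disclosure.

```python
import math
from collections import Counter


def tf_idf_score(query_terms, doc_tokens, corpus, term_weights=None):
    """Score one document against a bag-of-words query.

    corpus is a list of token lists. term_weights is an optional, assumed
    {term: significance} mapping; terms without an entry default to 1.0.
    """
    term_weights = term_weights or {}
    tf = Counter(doc_tokens)  # occurrences of each term in this document
    n_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        df = sum(1 for doc in corpus if term in doc)  # documents containing the term
        if df == 0:
            continue  # term appears nowhere in the corpus
        idf = math.log(n_docs / df)  # rarer terms get a larger idf
        score += term_weights.get(term, 1.0) * tf[term] * idf
    return score


if __name__ == "__main__":
    corpus = [
        ["blue", "dress", "store"],
        ["blue", "sky"],
        ["store", "hours", "store"],
    ]
    query = ["store", "blue"]
    for doc in corpus:
        plain = tf_idf_score(query, doc, corpus)
        weighted = tf_idf_score(query, doc, corpus, {"store": 2.0})
        print(doc, round(plain, 3), round(weighted, 3))
```

Without term_weights, every query term enters the score on equal footing, so the first and third toy documents tie. Supplying a larger weight for "store" breaks the tie in favor of the document that mentions it more often.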