Answer Scoring based on Optimal Window Size

IP.com Disclosure Number: IPCOM000013185D
Original Publication Date: 2000-Mar-01
Included in the Prior Art Database: 2003-Jun-17
Document File: 1 page(s) / 40K

Publishing Venue

IBM

Abstract

Disclosed is a system and method for determining optimally sized scoring windows in a question answering system. In question answering systems that score windows of sentences in a document collection by their likelihood of containing the answer to the question, a fixed window size determined a priori produces sub-optimal results. Instead, the window size should be determined dynamically during query processing, on a window-by-window basis, in order to identify the best window of sentences for answering the question. The present invention is a technique for answer scoring based on optimal window size, which overcomes the limitations of predefined fixed window sizes and produces better answer results.

This is the abbreviated version, containing approximately 52% of the total text.

    The present invention solves the optimal window size problem by using a window scoring procedure that factors in window size and searches the document collection in such a way that optimally sized windows are quickly identified and scored. In any given document with n sentences, there are n(n+1)/2 different windows, where a window is a contiguous sequence of 1 to n sentences. Thus, to find the optimal window in the document exhaustively, all of the windows could be scored in O(n^2) time, with the best-scoring window then selected in constant time (for example, by tracking the running maximum during scoring).
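
The exhaustive baseline described above can be sketched as follows. This is a minimal illustration, not the disclosed procedure: the names `enumerate_windows` and `best_window_brute_force` are hypothetical, and the scoring function is passed in as a parameter because the text has not yet defined it at this point.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Window = Tuple[int, int]  # (start, end) sentence indices, end exclusive


def enumerate_windows(n: int) -> List[Window]:
    """Return every contiguous window over n sentences: n(n+1)/2 in total."""
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def best_window_brute_force(
    sentences: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> Tuple[Optional[Window], float]:
    """Score every window and keep a running maximum (hypothetical helper)."""
    best: Optional[Window] = None
    best_score = float("-inf")
    for start, end in enumerate_windows(len(sentences)):
        s = score(sentences[start:end])
        if s > best_score:  # ties keep the earliest window found
            best, best_score = (start, end), s
    return best, best_score
```

The quadratic figure counts the windows themselves. This sketch rescores each window from scratch, so its total cost also depends on how expensive the supplied `score` function is.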

    The window scoring procedure proposed in the current invention offers a better average-case solution. This solution is made possible by the particular window scoring function used in the current invention. The window scoring function is of the form S(w) = T(w) - L(w), where T(w) is a function of the weighted query terms that appear in window w, and L(w) is a function of the length of window w. T(w) increases the window score as more distinct query terms appear in the window. It is a weighted binary (or combination match) function: the number of times a term occurs within the window is irrelevant; the important consideration is whether or not the term occurs at all. This binary score may be modified by a w...
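
A scoring function of the form S(w) = T(w) - L(w) might look like the sketch below. Several details are assumptions made only for illustration: the text says L(w) is a function of window length but does not give its form, so a linear per-sentence penalty (`per_sentence`) is used here; the function names and the representation of each sentence as a set of tokens are hypothetical; and the modification of the binary score mentioned at the end of the abbreviated text is not modeled.

```python
from typing import Dict, Sequence, Set


def term_score(window: Sequence[Set[str]], weights: Dict[str, float]) -> float:
    """T(w): sum the weights of distinct query terms present in the window.

    Each term counts once no matter how often it occurs (binary match).
    """
    present: Set[str] = set().union(*window)
    return sum(weight for term, weight in weights.items() if term in present)


def length_penalty(num_sentences: int, per_sentence: float = 0.5) -> float:
    """L(w): ASSUMED linear in window length; the source does not specify it."""
    return per_sentence * num_sentences


def window_score(
    window: Sequence[Set[str]],
    weights: Dict[str, float],
    per_sentence: float = 0.5,
) -> float:
    """S(w) = T(w) - L(w) for a window given as one token set per sentence."""
    return term_score(window, weights) - length_penalty(len(window), per_sentence)
```

Under these assumptions, widening a window raises its score only when the newly covered query terms outweigh the added length penalty, which is the trade-off that makes a dynamically chosen window size meaningful.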