Extensible Method for Criteria-Driven Answer Scoring in a Deep Question Answering System Disclosure Number: IPCOM000239037D
Publication Date: 2014-Oct-02

Publishing Venue

The Prior Art Database


Disclosed is an extensible method for criteria-driven answer scoring in a deep question answering system.



In deep question answering systems (deep QA), such as IBM Watson, various analysis programs are run against both the question text and candidate answers (i.e., text passages extracted from documents in a corpus) in order to deduce a probable correct answer.

    In IBM Watson terminology, a "pipeline" represents the execution of these various analysis programs. A typical IBM Watson pipeline comprises the following main steps:

a. Question analysis - this involves analyzing and annotating the question to identify key attributes to search for.

b. Primary search - this involves searching for documents in the corpus using key attributes from the question analysis phase.

c. Candidate answers generation - this involves identifying key matching passages from the search results.

d. Supporting evidence retrieval - this involves retrieving additional supporting evidence for top candidates.

e. Scoring - this phase involves scoring the various candidates and finally selecting the correct answer.
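The five stages above can be sketched as a simple chain of functions. This is an illustrative toy, not IBM Watson's actual architecture: every function name, the keyword-based scoring, and the sample corpus are invented for demonstration.

```python
# Toy deep QA pipeline following steps a-e above.
# All names and heuristics are hypothetical illustrations.

def question_analysis(question):
    # a. Identify key attributes (here: naive keyword extraction).
    return [w.strip("?.,").lower() for w in question.split() if len(w) > 3]

def primary_search(corpus, attributes):
    # b. Return documents containing any key attribute.
    return [doc for doc in corpus if any(a in doc.lower() for a in attributes)]

def generate_candidates(docs, attributes):
    # c. Treat each matching sentence as a candidate answer passage.
    return [s.strip() for doc in docs for s in doc.split(".")
            if s.strip() and any(a in s.lower() for a in attributes)]

def score_candidates(candidates, attributes):
    # e. Score = fraction of key attributes present in the passage
    #    (supporting evidence retrieval, step d, is omitted in this toy).
    return sorted(((sum(a in c.lower() for a in attributes) / len(attributes), c)
                   for c in candidates), reverse=True)

corpus = ["Aspirin reduces fever. Take with food.",
          "Ibuprofen reduces fever and inflammation."]
attrs = question_analysis("What reduces fever and inflammation?")
docs = primary_search(corpus, attrs)
ranked = score_candidates(generate_candidates(docs, attrs), attrs)
print(ranked[0][1])  # the passage covering the most question attributes
```

In a real system each stage would be a pluggable analysis program, which is what makes the pipeline model extensible.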

    The scoring phase of an IBM Watson pipeline calls various scoring algorithms to help deduce the correct answer. A scoring algorithm can generate one or more feature scores to indicate how confident it is in its answer. IBM Watson was designed to use a training phase to learn which features, or combinations of features, best predict the right answers for different types of questions. Once the system has been properly trained, subsequent questions flowing through a pipeline use the machine-learned model to find the most likely correct answer. In essence, multiple features work in concert: for example, a particular feature may have a high weight relative to other features for one type of question but a low weight for another type of question.
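The per-question-type weighting idea can be sketched as a weighted sum of feature scores. All weights, feature names, and question types below are invented for illustration; in Watson, such weights are learned during the training phase rather than hand-coded.

```python
# Hypothetical learned weights: the same features carry different
# weights for different question types (numbers are invented).
WEIGHTS = {
    "factoid":  {"passage_match": 0.7, "date_match": 0.1, "type_match": 0.2},
    "temporal": {"passage_match": 0.2, "date_match": 0.6, "type_match": 0.2},
}

def combined_score(question_type, feature_scores):
    # Weighted sum of the feature scores under the model for this
    # question type; missing features contribute zero.
    w = WEIGHTS[question_type]
    return sum(w[f] * feature_scores.get(f, 0.0) for f in w)

features = {"passage_match": 0.9, "date_match": 0.4, "type_match": 0.5}
print(combined_score("factoid", features))   # passage match dominates
print(combined_score("temporal", features))  # date match dominates
```

The same candidate thus ranks differently depending on the question type, which is the "features work in concert" behavior described above.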

    Criteria-driven answer scoring is a method used in the answer scoring phase to address important use cases in which the corpus consists primarily of policy and guideline documents containing criteria-based passages. These documents outline specific criteria that must be met for a passage to be considered a high-confidence candidate answer. For example, if the guideline documents describe potential treatments to choose from for a given patient, then each individual criterion of a treatment must be accurately evaluated and scored.
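A minimal sketch of evaluating a treatment's criteria against patient attributes follows. The criterion representation, the eligibility rules, and the patient record are all hypothetical; the disclosure does not specify this data model.

```python
# Hypothetical criteria-driven scorer: each treatment guideline lists
# criteria as (attribute, comparator, threshold); the score is the
# fraction of criteria the patient actually meets.
import operator

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}

def criteria_score(criteria, patient):
    # A criterion whose attribute is missing from the record counts
    # as unmet rather than raising an error.
    met = sum(OPS[op](patient[attr], value)
              for attr, op, value in criteria
              if attr in patient)
    return met / len(criteria)

# Invented example: eligibility criteria for one treatment guideline.
treatment_criteria = [
    ("age", ">=", 18),
    ("creatinine", "<=", 1.5),
    ("diabetic", "==", True),
]
patient = {"age": 54, "creatinine": 1.2, "diabetic": False}
print(criteria_score(treatment_criteria, patient))  # 2 of 3 criteria met
```

Scoring each criterion explicitly, rather than relying only on passage-level text similarity, is what distinguishes this approach from purely statistical NLP scoring.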

    Because much of the content in these types of criteria-driven documents is specific to a particular policy (and often requires specific calculations and comparisons), traditional Natural Language Processing (NLP) scoring techniques may, for some criteria, produce false positive results. Instead, a hybrid approach is needed that combines traditional NLP techniques with the identification of specific criteria that must be met. Furthermore, while various logical relationship evaluators provide a general solution for scoring conditional expressions, there are complex cases where they fall short...