
System and method for increasing accuracy based on speaker expertise and question context
Disclosure Number: IPCOM000235815D
Publication Date: 2014-Mar-25
Document File: 2 page(s) / 113K

Publishing Venue

The Prior Art Database


This article discloses how the accuracy of a deep question answering system may be improved by taking into account the speaker or creator of information and their relative expertise given the question context. This subject-matter-specific expertise is then used to weight the information provided.

This is the abbreviated version, containing approximately 51% of the total text.



When a corpus contains news or other information from print, audio, or video, the information could come from a variety of sources. These could be the opinions of news reporters, people interviewed off the street, or people directly involved in the topic of the article, such as world leaders, corporation CEOs, subject matter experts, etc. Information that comes directly from the people involved is likely to be more accurate than information from those simply stating their opinion.

If the provider of the information and their relevance to the topic can be determined, the weighting of passages can be adjusted to enhance the accuracy of the answer.
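One way to picture this adjustment is the minimal sketch below. It assumes each passage already carries a base score from some retrieval step and a speaker-relevance value between 0 and 1. The `Passage` type, the `boost` parameter, and the linear scaling are illustrative assumptions and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Passage:
    text: str
    base_score: float         # score from the underlying retrieval step (assumed to exist)
    speaker_relevance: float  # 0.0 = no known tie to the topic, 1.0 = directly involved


def adjusted_score(passage: Passage, boost: float = 0.5) -> float:
    """Scale the base score up by at most `boost` for highly relevant speakers."""
    relevance = min(max(passage.speaker_relevance, 0.0), 1.0)  # clamp to [0, 1]
    return passage.base_score * (1.0 + boost * relevance)


def rank(passages: List[Passage], boost: float = 0.5) -> List[Passage]:
    """Return passages ordered from highest to lowest adjusted score."""
    return sorted(passages, key=lambda p: adjusted_score(p, boost), reverse=True)
```

With these assumptions, a passage from a directly involved speaker can outrank a passage with a slightly higher base score from someone with no known tie to the topic. How strongly expertise should count (the `boost` value here) is a tuning choice the disclosure does not specify.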

The following enabling art is used to associate statements with the names of the people who made them:

Voice and video recognition to identify who is speaking and associate a name with the person(s).

Document parsing to identify names in documents. This can be done by identifying quotes in text and determining who spoke the quoted text, or by determining the author of the document, book, paper, etc.
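The quote-parsing step could be approximated as in the sketch below. It relies on two simple regular-expression patterns and a short, incomplete list of speech verbs, all of which are assumptions made for illustration. A production system would more likely use named-entity recognition and coreference resolution, and the disclosure does not prescribe any particular technique.

```python
import re
from typing import List, Tuple

# Illustrative, incomplete list of speech verbs (an assumption of this sketch).
_VERBS = r"(?:said|says|stated|added|explained)"
# Naive name pattern: two or more capitalized words in a row.
_NAME = r"([A-Z][a-z]+(?: [A-Z][a-z]+)+)"

# Matches: "quote," said First Last
_QUOTE_THEN_NAME = re.compile(r'"([^"]+)"\s*,?\s*' + _VERBS + r"\s+" + _NAME)
# Matches: First Last said, "quote"
_NAME_THEN_QUOTE = re.compile(_NAME + r"\s+" + _VERBS + r'\s*,?\s*"([^"]+)"')


def attribute_quotes(text: str) -> List[Tuple[str, str]]:
    """Return (speaker, quote) pairs found by the two heuristic patterns."""
    pairs = []
    for m in _QUOTE_THEN_NAME.finditer(text):
        pairs.append((m.group(2), m.group(1).rstrip(",. ")))
    for m in _NAME_THEN_QUOTE.finditer(text):
        pairs.append((m.group(1), m.group(2).rstrip(",. ")))
    return pairs
```

The name pattern is deliberately naive: it will miss single-word names and may absorb a capitalized word that happens to precede a name, so the output should be treated as candidate attributions rather than confirmed ones.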

As part of the initial ingestion process, IBM Watson* identifies people and their associations. As an example, Watson may ingest an interview with Mike Shanahan, head coach for the Washington Redskins, talking about an upcoming game with the NY Giants. In the interview he talks about the injured reserve and how key players are going to be playing on Sunday. Either through configuration or through inference from the corpus data, Mike Shanahan would be identified as the coach of an NFL** team, knowledgeable about football, and head coach for the Washington Redskins. He would be assigned a high weighting with respect to future questions related to football and the Washington Redskins. Additionally, other people are identified during corpus ingestion as potential experts on a subject matter. As an example, Jordan Sharp, a Las Vegas football analyst and odds maker, would also be identified from his content as knowledgeable on football, sports, the NFL, etc.
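The per-person, per-topic weighting described above might be stored in a structure like the following sketch. The `ExpertiseIndex` class, its method names, the numeric weights, and the low default for unknown speakers are all hypothetical choices made for illustration; the disclosure does not state how Watson represents these associations.

```python
from collections import defaultdict
from typing import Dict, Iterable


class ExpertiseIndex:
    """Maps each person to the topics they are considered knowledgeable about."""

    def __init__(self) -> None:
        # person -> {lowercased topic: weight}
        self._weights: Dict[str, Dict[str, float]] = defaultdict(dict)

    def assign(self, person: str, topic: str, weight: float) -> None:
        """Record a weight, set either by configuration or by inference."""
        self._weights[person][topic.lower()] = weight

    def weight_for(self, person: str, question_topics: Iterable[str],
                   default: float = 0.1) -> float:
        """Return the person's highest weight among the question's topics."""
        topics = self._weights.get(person, {})
        matches = [topics[t.lower()] for t in question_topics if t.lower() in topics]
        return max(matches, default=default)


# Example weights are made up for illustration only.
index = ExpertiseIndex()
index.assign("Mike Shanahan", "football", 0.9)
index.assign("Mike Shanahan", "Washington Redskins", 1.0)
index.assign("Jordan Sharp", "football", 0.8)
index.assign("Jordan Sharp", "NFL", 0.8)
```

For a question whose topics include "Washington Redskins" and "football", `index.weight_for("Mike Shanahan", ...)` would return the higher of his two matching weights, while a speaker with no recorded association falls back to the default.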

Consider the following example: "Will the Washington Redskins beat the NY Giants for the NFC East Championship?"

This example question has no right or wrong answer, but rather draws upon the stated opinions of those considered well-versed experts in football and knowledgeable about the two teams.

This invention identifies people who are associated with a subject, in this case the Washington Redskins and NY Giants. This is done by identifying and ranking people's associat...