The following article describes a technique in which a cognitive system assembles several of its own candidate answers into a single well-formed response.

Using a cognitive system to assemble multiple answers into a single response

Question answering (Q/A) systems such as IBM Watson* rely on a corpus of data to answer user questions. Typically the Q/A system finds passages within the corpus and uses those passages to answer the question asked by the user. The corpus often contains several passages that carry similar but not identical information, and in many cases these passages hold complementary data. Current Q/A systems return the passage they consider to be the best match. This invention proposes a system that allows Q/A systems such as Watson to compile multiple responses into a single, complete answer. This will be particularly useful for procedural answers where multiple steps are involved.

The basic idea behind this invention starts from the fact that Q/A systems like Watson typically produce multiple responses (with varying confidence) for any given question. The system does not currently examine those responses for information that appears only in a lower confidence response, even though that information may be useful if added to the higher confidence answer. The invention calls for the reuse or creation of algorithms capable of comparing documents and passages for similarity. Algorithms such as those used to detect plagiarism could be used to detect similar passages. Once similar passages are detected in two responses, further analysis is performed to determine whether the additional information in the lower confidence passage will "fit" in the higher confidence answer. Once it is determined that the combined text will form a sensible response, the additional information is added to the original response, and the answer is returned to the user.
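
As an illustration of the similarity-detection step, the following Python sketch compares the highest confidence response against the lower ranked ones using word-shingle overlap (Jaccard similarity), a technique commonly used for near-duplicate and plagiarism detection. The function names, the (text, confidence) input format, the shingle size of 3, and the 0.2 threshold are all assumptions made for this example; they are not part of the invention as described and are not taken from any Watson API.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) found in text.

    Texts shorter than n words yield an empty set.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity of the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def similar_candidates(responses, threshold=0.2, n=3):
    """Pick lower ranked responses that resemble the top response.

    responses: list of (text, confidence) pairs (hypothetical format).
    Returns (text, confidence, similarity) tuples for every response
    below the top one whose similarity meets the assumed threshold.
    """
    ranked = sorted(responses, key=lambda r: r[1], reverse=True)
    if not ranked:
        return []
    top_text = ranked[0][0]
    candidates = []
    for text, confidence in ranked[1:]:
        score = jaccard(top_text, text, n)
        if score >= threshold:
            candidates.append((text, confidence, score))
    return candidates
```

Responses that pass this filter would then go on to the "further analysis" stage; unrelated responses are dropped before any merging is attempted.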

The advantage of such a system is that answers being presented to the user will be more complete, and will potentially include information from multiple documents as opposed to one document.

As described above, this invention calls for the use of existing algorithms (such as a plagiarism detection algorithm) to compare responses for similarities. Once similarities are identified, a further algorithm would be needed to ensure that the parts of the document/passage unique to the lower confidence response 'make sense' when inserted into the higher confidence answer. If part of the lower confidence response already exists in the higher confidence answer, that part is omitted from the combined answer and the higher confidence version is kept. This system will be of particular benefit in procedural responses, where adding complementary steps to the procedure could greatly benefit the end user.
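
The combining step for a procedural answer can be sketched as follows. This is a minimal Python example under stated assumptions: each response has already been split into a list of step strings, two steps count as "the same" when their difflib.SequenceMatcher ratio reaches an assumed 0.8 threshold, and the 'make sense' check is approximated purely by position (a step unique to the lower confidence response is placed just before the next step that both responses share). The names normalize, same_step, and merge_steps are hypothetical, and a real system would need a stronger coherence check than this.

```python
from difflib import SequenceMatcher


def normalize(step):
    """Lowercase a step and collapse runs of whitespace."""
    return " ".join(step.lower().split())


def same_step(a, b, threshold=0.8):
    """Treat two steps as duplicates when they are textually close."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def merge_steps(high, low, threshold=0.8):
    """Merge steps unique to `low` into `high`, keeping high's wording.

    high, low: lists of step strings from the higher and lower
    confidence responses. Shared steps act as anchors; steps found
    only in `low` are inserted just before the next shared step.
    """
    merged = []
    i = 0          # next unconsumed position in `high`
    pending = []   # steps seen only in `low` since the last anchor
    for step in low:
        match = next(
            (k for k in range(i, len(high))
             if same_step(high[k], step, threshold)),
            None,
        )
        if match is None:
            pending.append(step)
        else:
            merged.extend(high[i:match])  # steps only in `high`
            merged.extend(pending)        # steps only in `low`
            merged.append(high[match])    # duplicate: keep high's version
            pending = []
            i = match + 1
    merged.extend(high[i:])
    merged.extend(pending)
    return merged


high = ["Unplug the router.", "Wait 30 seconds.", "Reconnect the power cable."]
low = ["unplug the  router.", "Hold the reset button.", "Reconnect the power cable."]
for number, step in enumerate(merge_steps(high, low), start=1):
    print(number, step)
```

In this example the merged procedure keeps the three higher confidence steps in their original wording and gains "Hold the reset button." between the wait step and the reconnect step, while the duplicated unplug and reconnect steps from the lower confidence response are omitted.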

One potential implementation of this would revolve around finding procedures within a clustered set of documents. The first step would be to categorize the corpus itself. So as part of the ingestion process, documents would be clu...