Method of composing appropriate result text for a language response system based on a complexity metric

IP.com Disclosure Number: IPCOM000235643D
Publication Date: 2014-Mar-17
Document File: 3 page(s) / 65K

Publishing Venue

The IP.com Prior Art Database

Abstract

Today's Big Data technologies provide access to large amounts of information through search and analytics. An interesting challenge lies not just in returning results but in making them digestible. The result text for the highest-scoring search result is the same regardless of the background knowledge of the user making the query. This article describes a method of developing a complexity metric used to provide a custom, simplified answer text.

This is the abbreviated version, containing approximately 52% of the total text.

A natural-language question-answering system such as IBM Watson* currently returns results sorted by confidence score. The result text (for the highest score, for example) is the same regardless of the background knowledge of the user making the query. The user would get much more value out of the answer (and the system) if it were presented at a level they could understand. This could result in the same answer being presented in different ways to different users. For example, the description of appendicitis for a nurse or doctor would be different from the description for a non-medical professional.

The complexity metric value of a document or passage is based on the structure and language of the text. The metric is computed from a variety of features of the text, including word choice, word rarity, composition, and parts of speech. Given a target metric value, a piece of text with a higher score could be reworded to match the target score.
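
As a rough illustration of how such a feature-based score might be computed, the following Python sketch combines three features: average sentence length, average word length, and the fraction of words outside a small list of common words. The feature set, the weights, the `COMMON_WORDS` list, and the function name `complexity_score` are all assumptions made for illustration; they are not the formula used in this disclosure.

```python
import re

# Hypothetical stand-in for a word-frequency resource; a real system would
# more likely consult corpus frequencies than a hand-written set.
COMMON_WORDS = {
    "the", "a", "is", "of", "and", "to", "in", "it", "that", "can", "be",
    "pain", "your",
}


def complexity_score(text: str) -> float:
    """Toy complexity score; a higher value means harder text.

    The three features and their weights are illustrative assumptions.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)          # composition
    avg_word_len = sum(len(w) for w in words) / len(words)  # word choice
    rare_fraction = sum(w not in COMMON_WORDS for w in words) / len(words)  # rarity
    return 0.5 * avg_sentence_len + 1.0 * avg_word_len + 10.0 * rare_fraction


simple = "It is a pain in the belly."
clinical = "Appendicitis is inflammation of the vermiform appendix."
print(complexity_score(simple) < complexity_score(clinical))
```

Under these assumed weights the clinical sentence scores higher than the plain one, so it would be the candidate for rewording when the target score is low.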

Applying the metric to the resulting answer text yields a complexity score that can be matched to a target complexity level. The target complexity level could be a value given previously by the user, deduced or calculated from previous or similar queries by this user, or taken from the user's personal complexity score in a stored profile. If no target complexity level is given, passages of text could be highlighted with their corresponding scores, allowing the user to selectively simplify them.
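
The selection of a target level described above could be sketched as follows. The `UserContext` fields, the function names, the use of a mean over past query scores, and the priority order (explicit value, then query history, then stored profile) are assumptions for illustration; the text lists the three sources only as alternatives.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List, Optional, Tuple


@dataclass
class UserContext:
    """Hypothetical container for what the system knows about a user."""
    explicit_target: Optional[float] = None               # value given by the user
    past_query_scores: List[float] = field(default_factory=list)
    profile_score: Optional[float] = None                 # from a stored profile


def resolve_target(user: UserContext) -> Optional[float]:
    """Return a target complexity level, or None if none can be determined.

    The priority order here is an assumption, not taken from the disclosure.
    """
    if user.explicit_target is not None:
        return user.explicit_target
    if user.past_query_scores:
        return mean(user.past_query_scores)
    return user.profile_score


def annotate_passages(
    passages: List[str], score: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Fallback when no target exists: pair each passage with its score
    so a user interface could highlight it and offer simplification."""
    return [(p, score(p)) for p in passages]


user = UserContext(past_query_scores=[8.0, 12.0])
print(resolve_target(user))
```

Returning `None` rather than a default value keeps the "no target given" case explicit, so the caller can fall back to `annotate_passages` with whatever scoring function the system uses.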

Rewording text to a target complexity metric would involve using simpler synonyms, or using more sentences to convey difficult adjectives or other modifiers. A baseline implementation could simply provide definition links for all non-simple words (e.g., words that score above the target complexity level). Rewording text always introduces a potential loss of accuracy, so the original text is not replaced; rather, the simpler text is made available as well.
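
A minimal sketch of the baseline approach, in Python, might look like the following. The function name `add_definition_links`, the `word_score` callback, the bracketed link format, and the `https://example.com/define/` URL are hypothetical placeholders. Word length is used as a stand-in per-word score in the usage lines only to keep the example self-contained.

```python
import re
from typing import Callable, Dict


def add_definition_links(
    text: str,
    word_score: Callable[[str], float],
    threshold: float,
    base_url: str = "https://example.com/define/",  # placeholder URL
) -> Dict[str, str]:
    """Attach a definition link to every word scoring above `threshold`.

    The original text is returned unchanged next to the annotated version,
    so the annotated text never replaces it.
    """
    def link(match: "re.Match[str]") -> str:
        word = match.group(0)
        if word_score(word.lower()) > threshold:
            return f"[{word}]({base_url}{word.lower()})"
        return word

    annotated = re.sub(r"[A-Za-z]+", link, text)
    return {"original": text, "annotated": annotated}


# Stand-in scorer: treat words longer than 8 letters as non-simple.
result = add_definition_links("Appendicitis causes pain.", lambda w: len(w), 8)
print(result["annotated"])
```

Returning both versions reflects the point made above that simplification may lose accuracy, so the unmodified answer stays available to the user.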

Alternatively, the resulting text could be kept as is, with further explanation or instructions added to the response to clarify phrases or wording that have a high complexity score. The results could be displayed sorted by accuracy and then by complexity, with words and phrases highlighted if they are above a predetermined complexity score.
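
The display ordering and highlighting could be sketched in Python as shown here. Several details are assumptions: each result is modeled as a `(text, confidence, complexity)` tuple, "accuracy" is taken to mean the confidence score, ties are broken by placing the less complex text first, and highlighted words are wrapped in double square brackets.

```python
from typing import Callable, List, Tuple

Result = Tuple[str, float, float]  # (text, confidence, complexity), assumed shape


def rank_results(results: List[Result]) -> List[Result]:
    """Sort by confidence (highest first), then by complexity (lowest first).

    The sort directions are an assumption; the text says only
    "sorted by accuracy and then complexity".
    """
    return sorted(results, key=lambda r: (-r[1], r[2]))


def highlight(text: str, word_score: Callable[[str], float], threshold: float) -> str:
    """Wrap each word whose score exceeds `threshold` in [[...]] markers."""
    out = []
    for token in text.split():
        core = token.strip(".,;:!?")
        if core and word_score(core.lower()) > threshold:
            out.append(token.replace(core, f"[[{core}]]", 1))
        else:
            out.append(token)
    return " ".join(out)


results = [
    ("clinical wording", 0.9, 12.0),
    ("plain wording", 0.9, 7.0),
    ("other answer", 0.5, 3.0),
]
for text, confidence, complexity in rank_results(results):
    print(confidence, complexity, text)

# Stand-in scorer: word length above 8 counts as complex.
print(highlight("Acute appendicitis needs surgery.", lambda w: len(w), 8))
```

Sorting on a tuple key keeps confidence as the primary criterion, so a simpler but less confident answer never outranks a more confident one.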

An example of a formula that can be used to calculat...