
Using Training Question-Answer Pairs for Generalized Soft Duck Typing

IP.com Disclosure Number: IPCOM000244348D
Publication Date: 2015-Dec-03
Document File: 2 page(s) / 29K

Publishing Venue

The IP.com Prior Art Database

Abstract

Described is a method of using training question-answer pairs for generalized soft duck typing.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.


Hand-written ontologies for databases or other structured resources are difficult to create and maintain, limiting their use for question-answering (QA) systems. By calculating similarity between entities of known and unknown types based on their properties, provisional type scores can stand in for a formal ontology.

    Type information is of great importance to QA systems. For example, in the question "What aircraft has a top speed of Mach 3?", one knows that the correct answer must be a type of aircraft. Unfortunately, type information is difficult to obtain. It can be written by hand or scraped from unstructured resources, but both approaches leave gaps in the knowledge base, so one needs a way to induce type information automatically. One can solve this problem without the use of unstructured text, which can be noisy, by using only training questions and structured data.

    Similarly, structured entities can be used to induce types. If two entities have similar database graphs surrounding them, then they are considered to be of the same type. All entities that have similar database structures are grouped together, so that the known type of one entity can be shared among the other entities in the group with some degree of confidence.

Basic Algorithm:
1. Start with an entity C of known type T. This may come from an existing ontology or from a training question. Call this entity the TYPE CENTROID.

2. List all the predicates that correspond to the type centroid. In an RDF-triple store, this is all the predicates that originate from the type centroid C. Call this group of predicates the TYPE CENTROID RELATION SET.

3. For some cutoff n, where 0 <= n <= 100, identify all entities that have at least n% of the relations in the type centroid relation set.

4. For each entity E that meets these criteria, assign that entity the same type as the type centroid.
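
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the triple store is modeled as an in-memory list of (subject, predicate, object) tuples, and the entity names, predicate names, and the function names predicates_by_entity and induce_types are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy triple store: (subject, predicate, object) tuples.
# All entity and predicate names are invented for illustration.
TRIPLES = [
    ("plane_a", "top_speed", "v1"),
    ("plane_a", "manufacturer", "m1"),
    ("plane_a", "first_flight", "d1"),
    ("plane_a", "wingspan", "w1"),
    ("plane_b", "top_speed", "v2"),
    ("plane_b", "manufacturer", "m2"),
    ("plane_b", "wingspan", "w2"),
    ("plane_c", "top_speed", "v3"),
    ("plane_c", "manufacturer", "m3"),
    ("river_a", "length", "l1"),
    ("river_a", "mouth", "s1"),
]


def predicates_by_entity(triples):
    """Map each subject to the set of predicates that originate from it."""
    preds = defaultdict(set)
    for subject, predicate, _obj in triples:
        preds[subject].add(predicate)
    return preds


def induce_types(triples, centroid, centroid_type, cutoff_percent):
    """Assign centroid_type to every entity that has at least
    cutoff_percent of the centroid's predicates (steps 1 to 4)."""
    if not 0 <= cutoff_percent <= 100:
        raise ValueError("cutoff_percent must be between 0 and 100")

    preds = predicates_by_entity(triples)
    # Step 2: the type centroid relation set.
    relation_set = preds.get(centroid, set())
    if not relation_set:
        return {}

    induced = {}
    for entity, entity_preds in preds.items():
        if entity == centroid:
            continue
        shared = len(relation_set & entity_preds)
        # Step 3: compare the shared fraction with n%, using integer
        # arithmetic to avoid floating-point rounding at the boundary.
        if 100 * shared >= cutoff_percent * len(relation_set):
            # Step 4: the entity inherits the centroid's type.
            induced[entity] = centroid_type
    return induced


print(induce_types(TRIPLES, "plane_a", "aircraft", 75))
# prints {'plane_b': 'aircraft'}
```

With a cutoff of 75, only plane_b (3 of the 4 centroid predicates) is typed as an aircraft; lowering the cutoff to 50 also admits plane_c. In this sketch a cutoff of 0 types every entity that appears as a subject, so in practice the cutoff would need to be chosen well above zero.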

    If there are multiple entities of a known type T, then the entity E is considered to be of type T if it meets the above criteria for any C of type T. If one wishes to be more aggressive about the types that are induced, one can use the advanced algorithm, which assigns soft types:

Advanced Algorithm:
1. Start with an entity C of known type T. This may come from an existing ontology or from a training question. Call this entity the TYPE CENTROID.

2. List all the predicates that correspond to the type centroid. In an RDF-triple store, this is all the predicates that originate from the type centroid C. Call this group of predicates the TYPE CENTROID R...