Determining Entity Relationships with Weighted Parse Tree Fragmentation

IP.com Disclosure Number: IPCOM000248923D
Publication Date: 2017-Jan-22
Document File: 5 page(s) / 533K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method to improve the understanding of relationships between conceptual entities in written language. Weighted parse tree fragmentation divides sentences into smaller fragments with the help of weighted conjunctions and punctuation marks, which enables the identification of relationships between entities.

This is the abbreviated version, containing approximately 52% of the total text.

For complex domains such as medicine, finding relationships between entities is important because it yields a higher-level picture of the concepts involved and supports accurate predictions. For instance, in an electronic medical record, connecting sizes with locations helps determine what type of measurement is being described. Linking entities in this way builds more complete concepts that carry more information than the individual small entities do separately; the more connected the entities are, the more complete the resulting concepts become.

The novel contribution is an approach to sentence fragmentation that divides sentences into smaller fragments at conjunctions and punctuation marks. As a result, each fragment is a smaller, more coherent container for the entities in it. Two entities are more likely to be related when they appear in the same fragment. Each conjunction and punctuation mark has an assigned weight that either increases or decreases the overall score representing the likelihood of a relationship between two entities.
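The fragmentation step might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `DELIMITERS` set and the `fragment` function are hypothetical names, the disclosure does not list which conjunctions and punctuation marks are used, and the sketch assumes the sentence is already tokenized with punctuation as separate tokens.

```python
# Hypothetical delimiter set; the disclosure does not enumerate specific tokens.
DELIMITERS = {",", ";", ":", "and", "or", "but"}


def fragment(tokens):
    """Split a token list into fragments at each conjunction or punctuation token.

    Each delimiter closes the current fragment and opens a new one, so a
    sentence with n delimiters yields n + 1 fragments (some may be empty
    when two delimiters are adjacent).
    """
    fragments, current = [], []
    for tok in tokens:
        if tok.lower() in DELIMITERS:
            fragments.append(current)
            current = []
        else:
            current.append(tok)
    fragments.append(current)
    return fragments


tokens = "a 2 cm nodule in the left lung and a cyst in the liver".split()
fragments = fragment(tokens)
for frag in fragments:
    print(" ".join(frag))
```

In this example the size "2 cm" and the location "left lung" fall in the same fragment, while "liver" falls in a different one, which matches the intuition that the measurement belongs to the lung finding.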

With this approach, every conjunction or punctuation mark generates two fragments. If the weight is negative and the entities are in separate fragments, the score decreases; if the weight is positive, the score increases. Weighting and scoring the fragments and entity relationships indicates the strength of each entity relationship and, depending on the scenario, which relationships should be used.

For implementation, the system considers the conjunctions and punctuation marks and divides sentences into smaller fragments, assuming that nodes that share a fragment are more likely to belong together than nodes in different fragments. The fragment scoring is based on the weight of each conjunction or punctuation mark encountered; each individual weight can either decrease or increase the score.
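One way to read the scoring rule is sketched below. The weight values, the `WEIGHTS` table, and the `relationship_score` function are hypothetical, and adding the weights to an initial score is an assumption; the disclosure states only that individual weights raise or lower the score.

```python
# Hypothetical weights: negative values lower the score, positive values raise it.
WEIGHTS = {",": -0.1, ";": -0.3, "and": 0.2, "but": -0.2}


def relationship_score(tokens, i, j, initial_score, weights=WEIGHTS):
    """Adjust an initial relationship score for the entities at positions i and j.

    Every weighted conjunction or punctuation token lying between the two
    positions contributes its weight (additive combination is an assumption).
    Entities in the same fragment have no delimiter between them, so their
    score is left unchanged.
    """
    lo, hi = sorted((i, j))
    score = initial_score
    for tok in tokens[lo + 1:hi]:
        score += weights.get(tok.lower(), 0.0)
    return score


tokens = ["tumor", "in", "the", "lung", ";", "cyst", "in", "the", "liver"]
same_fragment = relationship_score(tokens, 0, 3, initial_score=0.5)   # tumor / lung
across_fragments = relationship_score(tokens, 0, 8, initial_score=0.5)  # tumor / liver
print(same_fragment, across_fragments)
```

Here the semicolon between "tumor" and "liver" carries a negative weight, so that pair ends up with a lower score than "tumor" and "lung", which share a fragment.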

The weight computed for a conjunction or punctuation mark reflects an analysis of how applicable that conjunction or punctuation mark is to the concept or domain (Figure 1).

Figure 1: Algorithm

The initial score is determined by the parsing algorithm that determines the likelihood of two entities bei...