System and Method for Deriving Meaning of Novel Words

IP.com Disclosure Number: IPCOM000247056D
Publication Date: 2016-Jul-31
Document File: 2 page(s) / 94K

Publishing Venue

The IP.com Prior Art Database

Abstract

A system and method for deriving the meaning of novel words in a natural language processing (NLP) system is disclosed.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 53% of the total text.

Disclosed is a system and method for deriving the meaning of novel words in a natural language processing (NLP) system.

News sources and social media constantly coin novel words that carry significant contextual information. An NLP system does not find these words in its lexicon and would normally tag each one as a noun and flag it as not found. This can cause the parser to treat valid sentences as incomplete, and leaving the word tagged as a not-found noun can hinder the performance of later stages in the processing pipeline.

Instead, when a potential novel word W (for purposes of this disclosure, a word not in the parser lexicon) is identified, a mapping M of previously identified novel words and their potential meanings is searched, and if a match is found, it is used. If there is no match in the mapping M, the algorithm searches the corpus for sentences or passages that are most similar to P_w, the sentence containing the novel word. This search considers both the sentence structure and the known entity types of P_w. The matching sentences are ranked and the best N are kept. These N sentences are analyzed to identify tokens T that occupy a position within their sentence similar to the position of W within P_w. The set of tokens T is used to identify the most common syntactic purpose, which is then added to the mapping M and used as the likely purpose of the word W. This procedure could be used both as a post-ingestion step and on every query.
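The procedure described above can be illustrated with a minimal Python sketch. The disclosure does not specify a similarity measure, a tag set, or how positions are compared, so everything below is an assumption made for illustration: the names Sentence, similarity, and derive_purpose are hypothetical, sentence structure is approximated by comparing tag sequences with the standard library's difflib.SequenceMatcher, entity types are compared by set overlap, the two scores are weighted equally, and "similar position" is taken to mean the same relative offset within the sentence.

```python
from collections import Counter
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Sentence:
    tokens: list[str]
    tags: list[str]                         # one syntactic tag per token
    entities: frozenset[str] = frozenset()  # known entity types in the sentence


def similarity(query_tags, query_entities, candidate):
    """Assumed score: average of tag-sequence similarity and entity-type overlap."""
    structure = SequenceMatcher(None, query_tags, candidate.tags).ratio()
    union = query_entities | candidate.entities
    overlap = len(query_entities & candidate.entities) / len(union) if union else 1.0
    return 0.5 * (structure + overlap)


def derive_purpose(word, p_w, w_index, corpus, mapping, n=3):
    """Return the likely syntactic purpose of novel word W, updating mapping M."""
    # Step 1: reuse a previously derived purpose if W is already in M.
    if word in mapping:
        return mapping[word]

    # Step 2: rank corpus sentences by similarity to P_w and keep the best N.
    # W's own tag is masked so the default "noun, not found" tag does not bias the search.
    query_tags = p_w.tags[:w_index] + ["UNK"] + p_w.tags[w_index + 1:]
    ranked = sorted(
        corpus,
        key=lambda s: similarity(query_tags, p_w.entities, s),
        reverse=True,
    )
    best = ranked[:n]

    # Step 3: in each kept sentence, pick the token T at the same relative position as W.
    relative = w_index / max(len(p_w.tokens) - 1, 1)
    votes = Counter()
    for s in best:
        t_index = round(relative * (len(s.tokens) - 1))
        votes[s.tags[t_index]] += 1

    if not votes:
        return None  # empty corpus: nothing to learn from

    # Step 4: the most common tag becomes W's likely purpose and is stored in M.
    purpose = votes.most_common(1)[0][0]
    mapping[word] = purpose
    return purpose


# Toy data, invented for this example only.
person = frozenset({"PERSON"})
corpus = [
    Sentence("Bob signed the contract".split(), ["NOUN", "VERB", "DET", "NOUN"], person),
    Sentence("Carol reviewed the budget".split(), ["NOUN", "VERB", "DET", "NOUN"], person),
    Sentence("The red car stopped".split(), ["DET", "ADJ", "NOUN", "VERB"]),
    Sentence("Dave filed a complaint".split(), ["NOUN", "VERB", "DET", "NOUN"], person),
]
# "blorked" is not in the lexicon, so the parser tagged it as a noun by default.
p_w = Sentence("Alice blorked the report".split(), ["NOUN", "NOUN", "DET", "NOUN"], person)

mapping = {}
purpose = derive_purpose("blorked", p_w, 1, corpus, mapping)
print(purpose, mapping)
```

With this toy corpus, the three subject-verb-object sentences rank highest, and the token at the matching position in each is a verb, so the sketch records a verb reading for the novel word. A second call for the same word returns the stored entry from the mapping without searching the corpus, which is the behavior that makes the mapping useful on every query.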

Existing approaches...