Context Sensitive Dictionary

IP.com Disclosure Number: IPCOM000250134D
Publication Date: 2017-Jun-02
Document File: 3 page(s) / 198K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a system for automatically clarifying the definition of a term, within a given body of text, which has multiple meanings based on the reader’s original language and experience, the time of the writing, and the dialect of the author. The system automatically analyzes the text that a user is reading to determine the context in which it was written, and then upon request provides the reader context-sensitive definitions for selected words or phrases.

This is the abbreviated version, containing approximately 45% of the total text.

Within a body of text, the meanings of some terms can differ depending on the reader’s original language and experience, the time of the writing, and the dialect of the author. A single word can have different meanings or multiple dictionary definitions. For example, a "boot" to an American is a piece of footwear, while to a Briton, it is the trunk of a car.

The novel contribution is a system for automatically identifying the context of the work, and then using that context to provide appropriate definitions for words or phrases on request.

The system automatically analyzes the text that a user is reading to determine the context in which it was written, and then upon request provides the reader context-sensitive definitions for selected words or phrases.

The system consists of:

• An analyzer, which parses the text and the associated metadata to determine the most likely context of the author and the text. This specifically includes the year in which the text was written and the native language and dialect of the author.

• A dictionary with definitions for a broad set of contexts, including historical usages and regional variations. The system tags these definitions with the appropriate context for matching.

• An interface by which the reader may select a word or phrase in order to request a definition.

• A processor that takes the output of the analyzer and the definitions from the dictionary and uses that information to produce a sorted list of appropriate definitions.

Figure: System components
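
The interaction between the analyzer output, the tagged dictionary, and the processor can be sketched in Python as follows. This is a minimal illustration rather than the disclosed implementation: the class names (`TextContext`, `Definition`), the dialect tag format, the scoring weights, and the sample entries are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TextContext:
    """Analyzer output: when and in which dialect the text was written."""
    year: int
    dialect: str  # hypothetical tag format, e.g. "en-GB" or "en-US"


@dataclass(frozen=True)
class Definition:
    """A dictionary entry tagged with the context in which it applies."""
    term: str
    meaning: str
    dialect: Optional[str] = None     # None: not tied to one dialect
    first_year: Optional[int] = None  # None: no known lower bound
    last_year: Optional[int] = None   # None: still in current use


def score(defn: Definition, ctx: TextContext) -> int:
    """Assumed scoring rule: dialect match weighs more than period match."""
    points = 0
    if defn.dialect is None:
        points += 1  # dialect-neutral entries are weakly relevant
    elif defn.dialect == ctx.dialect:
        points += 2  # exact dialect match
    after_start = defn.first_year is None or defn.first_year <= ctx.year
    before_end = defn.last_year is None or ctx.year <= defn.last_year
    if after_start and before_end:
        points += 1  # the usage was current when the text was written
    return points


def rank_definitions(term: str, ctx: TextContext,
                     dictionary: List[Definition]) -> List[Definition]:
    """Processor step: return matching definitions, best match first."""
    candidates = [d for d in dictionary if d.term == term.lower()]
    return sorted(candidates, key=lambda d: score(d, ctx), reverse=True)


# Illustrative entries based on the "boot" example in the text.
SAMPLE_DICTIONARY = [
    Definition("boot", "a piece of footwear", dialect="en-US"),
    Definition("boot", "the trunk of a car", dialect="en-GB"),
]

british_text = TextContext(year=2017, dialect="en-GB")
for entry in rank_definitions("boot", british_text, SAMPLE_DICTIONARY):
    print(entry.meaning)
```

The ranking keeps every candidate definition and only reorders them, which matches the "sorted list of appropriate definitions" described for the processor; a real system would presumably use a richer scoring model than the two-factor rule assumed here.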

When the text is first opened, the analyzer parses it to determine the context. For providing definitions, the most significant attributes to determine are the year in which the text was written and the native language (including dialect) of the author. Other attributes may also help in determining the appropriate definitions, including the author's identity (e.g., age, gender, socioeconomic status) and other works by the author. The analyzer determines this information by looking at the metadata provided with the text (e.g., the EPUB, MOBI, and PDF e-book file formats can embed this information) and at the text itself.
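
As an illustration of the metadata step, the sketch below reads the Dublin Core fields that an EPUB package document (OPF) can carry. It is only a sketch under stated assumptions: the function name `read_opf_metadata`, the `TextMetadata` record, and the sample package document are hypothetical, and a real e-book may omit these fields or record the edition date instead of the year the text was written.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Dublin Core namespace used by EPUB package documents.
DC = "{http://purl.org/dc/elements/1.1/}"


@dataclass(frozen=True)
class TextMetadata:
    title: Optional[str]
    author: Optional[str]
    language: Optional[str]
    year: Optional[int]


def read_opf_metadata(opf_xml: str) -> TextMetadata:
    """Extract title, author, language, and year from OPF metadata."""
    root = ET.fromstring(opf_xml.strip())

    def first(tag: str) -> Optional[str]:
        element = root.find(f".//{DC}{tag}")
        if element is None or not element.text:
            return None
        return element.text.strip()

    date = first("date")
    # Accept either a bare year or a full date such as YYYY-MM-DD.
    match = re.search(r"\d{4}", date) if date else None
    return TextMetadata(
        title=first("title"),
        author=first("creator"),
        language=first("language"),
        year=int(match.group(0)) if match else None,
    )


# Hypothetical package document, written for this example only.
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>A Christmas Carol</dc:title>
    <dc:creator>Charles Dickens</dc:creator>
    <dc:language>en-GB</dc:language>
    <dc:date>1843</dc:date>
  </metadata>
</package>
"""

print(read_opf_metadata(SAMPLE_OPF))
```

Fields that come back as `None` are the ones the analyzer would have to infer from online databases or from the text itself.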

The metadata may directly indicate the year in which the text was written via the copyright date. The system is networked, so if the author's name is provided, it can query online databases to determine the author's native language. If the metadata includes the title of the text, the system may try to locate additional metadata from other online databases. The analyzer also scans the text itself for patterns indicative of the text being written in a certain time and place.
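
One simple kind of pattern is regional spelling. The following sketch counts dialect-specific spellings and returns the dialect with the most votes. The variant table, the dialect tags, and the function name `guess_dialect` are assumptions made for illustration; the disclosure does not specify which patterns the analyzer uses, and a practical analyzer would need a far larger lexicon plus phrase-frequency data to estimate the period as well.

```python
import re
from collections import Counter
from typing import Optional

# Tiny illustrative table of spelling variants (not an exhaustive lexicon).
SPELLING_VARIANTS = {
    "humour": "en-GB", "humor": "en-US",
    "colour": "en-GB", "color": "en-US",
    "centre": "en-GB", "center": "en-US",
    "favourite": "en-GB", "favorite": "en-US",
}


def guess_dialect(text: str) -> Optional[str]:
    """Return the dialect whose spellings dominate, or None if unclear."""
    votes = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        dialect = SPELLING_VARIANTS.get(word)
        if dialect is not None:
            votes[dialect] += 1
    if not votes:
        return None  # no dialect-specific spelling found
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: the evidence is inconclusive
    return ranked[0][0]


print(guess_dialect("Her favourite colour was green."))
```

Returning `None` on a tie or on missing evidence lets the caller fall back on metadata instead of committing to a guess from the text alone.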

Consider an example line from Charles Dickens' A Christmas Carol: "He shed a few drops of water on them from it, and their good humour was restored directly." The phrase "good humour" is rarely used in American English, and the spelling of "humour" is the British variant. The phrase was more popular in the 19th century than in the 20th, so using these data points, the analyzer is able...