Coding Scheme for Partial Representation of Natural Language Sentences

IP.com Disclosure Number: IPCOM000117618D
Original Publication Date: 1996-Apr-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 79K

Publishing Venue

IBM

Related People

Haefner, J: AUTHOR [+3]

Abstract

Disclosed is a method to convert whole sentences of natural language into a binary matrix format, where the columns and/or rows of the matrix correspond to the alphabet of the concerned language.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 74% of the total text.

      Disclosed is a method to convert whole sentences of natural
language into a binary matrix format, where the columns and/or rows
of the matrix correspond to the alphabet of the concerned language.

      Each entry of the matrix is identified by an index and is
initially set to zero.  The entry values are determined by one or
more letters of each word, e.g., the initial letter(s).  For example,
the entry with index (1) in the one-dimensional case, or (1,1) in the
two-dimensional case, is set to '1' if the sentence contains a word
beginning with 'A' or 'AA', respectively.  The dimension of the
matrix equals the number of letters considered per word.  Words whose
considered letters are identical are counted as a single word.
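
      The encoding rule described above can be sketched as follows.
This is a minimal illustration, not the disclosed implementation: the
function name encode_sentence, the 26-letter Latin alphabet, the
dropping of characters outside that alphabet, and the skipping of
words shorter than the number of considered letters are all
assumptions made for the example.

```python
from itertools import product
import string

# Assumption: a plain 26-letter Latin alphabet; other characters are ignored.
ALPHABET = string.ascii_uppercase


def encode_sentence(sentence, n_letters=1):
    """Return a binary matrix as a dict keyed by 1-based index tuples.

    n_letters is the number of leading letters considered per word and
    therefore the dimension of the matrix.
    """
    size = len(ALPHABET)
    # Every entry starts at zero, as in the description.
    matrix = {idx: 0 for idx in product(range(1, size + 1), repeat=n_letters)}
    for word in sentence.split():
        letters = [c for c in word.upper() if c in ALPHABET]
        if len(letters) < n_letters:
            continue  # assumption: words that are too short are skipped
        idx = tuple(ALPHABET.index(c) + 1 for c in letters[:n_letters])
        matrix[idx] = 1  # words sharing these letters set the same entry once
    return matrix


one_dim = encode_sentence("An apple", 1)      # both words begin with 'A'
two_dim = encode_sentence("Aaron runs", 2)    # prefixes 'AA' and 'RU'
print(one_dim[(1,)], two_dim[(1, 1)])
```

A dictionary keyed by index tuples is used here only so that the same
function covers any number of considered letters; a list or
bit-vector would serve equally well for a fixed dimension.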

      Fig. 1 shows, as an example, a one-dimensional matrix
established for the German sentence "Der Esel von Joe frisst Erna's
Blumen" ("Joe's donkey eats Erna's flowers").  For the two words
"Esel" and "Erna", only one entry is recorded because both begin with
the letter "E".  The matrix depicted in Fig. 2 is based on the first
two letters of each word of the same sentence.  This two-dimensional
code distinguishes between the two words ("ES" versus "ER").
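
      The figures are not reproduced in this text, so the following
sketch only shows what the described rule yields for the example
sentence under stated assumptions: a 26-letter alphabet, the
apostrophe in "Erna's" ignored, and rows of the two-dimensional
matrix taken as the first letter and columns as the second.  The
helper name prefixes is hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: 26-letter Latin alphabet


def prefixes(sentence, n):
    """Distinct n-letter word prefixes, ignoring non-alphabet characters."""
    found = set()
    for word in sentence.split():
        letters = "".join(c for c in word.upper() if c in ALPHABET)
        if len(letters) >= n:
            found.add(letters[:n])
    return found


SENTENCE = "Der Esel von Joe frisst Erna's Blumen"

# One-dimensional code: one entry per initial letter A..Z.
one = prefixes(SENTENCE, 1)
vector = [1 if letter in one else 0 for letter in ALPHABET]

# Two-dimensional code: row = first letter, column = second letter (assumed).
two = prefixes(SENTENCE, 2)
matrix = [[1 if a + b in two else 0 for b in ALPHABET] for a in ALPHABET]

print("".join(map(str, vector)))  # 01011100010000000000010000
print(sorted(one))                # ['B', 'D', 'E', 'F', 'J', 'V']
print(sorted(two))                # ['BL', 'DE', 'ER', 'ES', 'FR', 'JO', 'VO']
```

The seven words produce six set entries in the one-dimensional code,
since "Esel" and "Erna's" share the entry for "E", and seven set
entries in the two-dimensional code.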

      Each sentence, with its particular combination of words, is
assigned to a definite machine storage location.  The natural language
sentence is stored in a first data base for sentences, and the
amended matrix in a second data base for code.  The two entries are
linked
by a data base adminis...