Suffix-Dependent Hyphenation Data Storage Technique

IP.com Disclosure Number: IPCOM000042328D
Original Publication Date: 1984-May-01
Included in the Prior Art Database: 2005-Feb-03
Document File: 1 page(s) / 12K

Publishing Venue

IBM

Related People

Carlgren, RG: AUTHOR

Abstract

This technique minimizes the storage required to represent the valid hyphenation points for a list of words that is encoded as stem words with suffix lists. An efficient way to store a large list of words from the same base language is to represent the common variations of each basic stem word as a list of suffix variations. To represent the valid hyphenation points for such a stem word and variant list accurately, it must be possible to define the hyphen points for the stem word and also the manner in which each variant of the stem is to be hyphenated. This problem is common to at least all European and Semitic languages, and is particularly noticeable in English.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

The present technique provides a set of rules by which the hyphen points within a stem word may be changed according to the particular suffix and according to the stem word's existing hyphenation points. The hyphenation points for the stem words in the stored dictionary are held in a relational data base which exists as an independent segment of the word list dictionary. These hyphen points are stored as described in [*]. The first step in obtaining the hyphen points for a word is to obtain the hyphen points of the stem of that word.
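The idea of deriving a variant's hyphen points from its stem's points plus a per-suffix rule can be sketched as follows. This is a minimal illustration, not the disclosure's actual rule set, which the abbreviated text does not give. The `SuffixRule` fields (`boundary_offset`, `shift_last_stem_point`), the function names, and the choice of character offsets for hyphen points are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class SuffixRule:
    """Hypothetical rule describing how one suffix alters hyphenation."""
    suffix: str
    # Where the new hyphen point falls relative to the end of the stem;
    # None means the suffix adds no hyphen point at all.
    boundary_offset: Optional[int] = 0
    # How far the stem's last existing hyphen point moves (0 = unchanged).
    shift_last_stem_point: int = 0


def variant_hyphen_points(stem: str, stem_points: Sequence[int],
                          rule: SuffixRule) -> List[int]:
    """Derive hyphen points (character offsets) for stem + rule.suffix."""
    points = list(stem_points)
    if points and rule.shift_last_stem_point:
        points[-1] += rule.shift_last_stem_point
    if rule.boundary_offset is not None:
        points.append(len(stem) + rule.boundary_offset)
    return sorted(set(points))


def show(word: str, points: Sequence[int]) -> str:
    """Render a word with hyphens inserted at the given offsets."""
    cuts = [0, *points, len(word)]
    return "-".join(word[a:b] for a, b in zip(cuts, cuts[1:]))


# Stem "hyphen" is stored once as hy-phen, i.e. hyphen points (2,).
pts = variant_hyphen_points("hyphen", [2], SuffixRule("ate"))
print(show("hyphenate", pts))    # hy-phen-ate

# The suffix pulls the final stem letter into the next syllable.
pts = variant_hyphen_points("critic", [4], SuffixRule("ism", boundary_offset=-1))
print(show("criticism", pts))    # crit-i-cism

# The suffix moves an existing stem hyphen point: pre-fer -> pref-er-ence.
pts = variant_hyphen_points("prefer", [3],
                            SuffixRule("ence", shift_last_stem_point=1))
print(show("preference", pts))   # pref-er-ence
```

Only the stem's points and a small rule per suffix are stored, so no variant needs its own hyphenation entry. In this sketch the rule is passed in explicitly; how a rule would be selected for a given stem and suffix is left open here.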

To obtain the hyphen points for a word, the data block containing the required hyphen point representation must first be read into main storage. The block location is determined by first locating the word in the main dictionary. This lookup provides the word's relative stem number within the dictionary and identifies the particular stem suffix, if any. The hyphen points for...
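The lookup chain described in this paragraph (word, then stem number and suffix, then data block, then stem hyphen points) might look like the sketch below. The abbreviated text does not say how a block location is computed from the stem number. The fixed `STEMS_PER_BLOCK` capacity, the in-memory `MAIN_DICTIONARY` and `HYPHEN_BLOCKS` tables, and the function names are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: every block of the hyphenation segment holds a fixed
# number of stems, so the stem number alone locates the block.
STEMS_PER_BLOCK = 2

# Stand-in for the main dictionary: word -> (relative stem number, suffix).
MAIN_DICTIONARY: Dict[str, Tuple[int, Optional[str]]] = {
    "hyphen": (0, None),
    "hyphenate": (0, "ate"),
    "critic": (1, None),
    "criticism": (1, "ism"),
    "prefer": (2, None),
    "preference": (2, "ence"),
}

# Stand-in for the independent hyphenation segment, split into blocks.
# Each entry holds one stem's hyphen points as character offsets.
HYPHEN_BLOCKS: List[List[Tuple[int, ...]]] = [
    [(2,), (4,)],  # block 0: hy-phen, crit-ic
    [(3,)],        # block 1: pre-fer
]


def read_block(block_number: int) -> List[Tuple[int, ...]]:
    """Simulate reading one hyphenation data block into main storage."""
    return HYPHEN_BLOCKS[block_number]


def stem_hyphen_points(word: str) -> Tuple[Tuple[int, ...], Optional[str]]:
    """Return the stem's hyphen points and the word's suffix, if any."""
    stem_number, suffix = MAIN_DICTIONARY[word]
    block_number, slot = divmod(stem_number, STEMS_PER_BLOCK)
    block = read_block(block_number)
    return block[slot], suffix


print(stem_hyphen_points("criticism"))   # ((4,), 'ism')
print(stem_hyphen_points("preference"))  # ((3,), 'ence')
```

The returned suffix is what a later step would use to pick the suffix-dependent adjustment to the stem's hyphen points.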