
Alignment of Phonemes With Corresponding Orthography

IP.com Disclosure Number: IPCOM000038745D
Original Publication Date: 1987-Feb-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 3 page(s) / 15K

Publishing Venue

IBM

Related People

Lawrence, SG: AUTHOR

Abstract

This disclosure is concerned with English language spelling and pronunciation. A table listing correspondence between phonemes and orthography, including letters not pronounced, is given. An algorithm provides alignment of phonemes in a word with its orthography by successively reducing the letters in the word until a match is found. The technique may have application in speech synthesis and speech recognition. As part of a speech synthesis project a technique has been developed for aligning the phonemes in a phonemic transcription with the graphemes in a word.

This is the abbreviated version, containing approximately 53% of the total text.



As part of a speech synthesis project, a technique has been developed for aligning the phonemes in a phonemic transcription with the graphemes in a word. For example, creationism /kr n z m/ can be aligned as

(Image Omitted)

The technique has been used to determine, where an entry in a machine-readable dictionary is given with multiple orthographies and one pronunciation, whether the pronunciation may be used for all the orthographies. For example, in the Collins English Dictionary there appears the entry 'abutment / b tm nt/ or abuttal'.

We know these two words are synonymous but are not pronounced the same. The letters a b u tt align with / b t /, but the al cannot be aligned with /m nt/, so the pronunciation of abutment cannot be used as the pronunciation of abuttal.

The disclosure consists of two parts: a table of phonemes and their corresponding orthography, and an algorithm for producing the alignment of the phonemes in a word with the orthography. The algorithm has been checked against the 33,000 most frequently used words in published English texts.

Algorithm

The table of phonemes and their corresponding orthography is held in descending order of the length of the orthographic equivalences. The word and its pronunciation are matched against the table of correspondences, which is scanned to match the largest number of letters that produces the corresponding phonemes. If an orthography entry in the table begins with a vowel and the corresponding phoneme entry begins with a consonant, then the preceding phoneme for the word cannot be a vowel. This check ensures the correct alignment of syllabic consonants.

When a match is found, a blank is inserted after the letters in the word and after the corresponding phonemes in the pronunciation. If a match is not found, then a scan of the silent letters is performed. If the letter is silent, then a '-' is inserted into the pronunciation string. It should be noted that a silent 'r' cannot occur after a vowel.

If no correspondence can be found and there has been a match with more than one letter, the multiple-letter string is shortened by one letter and the search of the correspondence table is redone, this time with the length of the matching string limited to the length of the shortened multiple-letter string. If the multiple-letter string has been reduced to a single letter and no match has been found, then there is either an error in the correspondence table or the pro...
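
The matching, silent-letter and back-off steps described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the correspondence table, the silent-letter list and the one-character ASCII phoneme codes ("@", "V", "Q") are all invented for the example, and the explicit "shorten the multiple-letter string and rescan" loop is replaced by recursive backtracking that tries the longest table entries first, which should give the same back-off behaviour. The rule about silent 'r' is read here as "not after a vowel letter", which is also an assumption.

```python
# Toy correspondence table of (letters, phonemes) pairs. Every entry and the
# one-character ASCII phoneme codes ("@" schwa, "V" as in "but", "Q" as in
# "not") are assumptions for illustration, not the disclosure's table.
TABLE = [
    ("tt", "t"), ("al", "@l"), ("a", "@"), ("b", "b"), ("u", "V"),
    ("t", "t"), ("m", "m"), ("e", "@"), ("n", "n"), ("o", "Q"), ("k", "k"),
]
# Held in descending length of the orthographic side: longest match first.
TABLE.sort(key=lambda entry: len(entry[0]), reverse=True)

SILENT_LETTERS = {"k", "e", "r"}   # assumed sample of letters that may be silent
VOWEL_LETTERS = set("aeiou")
VOWEL_PHONEMES = {"@", "V", "Q"}


def align(word, phones, i=0, j=0):
    """Return a list of (letters, phonemes) pairs, or None if no alignment exists."""
    if i == len(word):
        return [] if j == len(phones) else None
    prev_phone = phones[j - 1] if j else ""
    prev_letter = word[i - 1] if i else ""
    for letters, phons in TABLE:
        if not (word.startswith(letters, i) and phones.startswith(phons, j)):
            continue
        # Syllabic-consonant check: a vowel-initial spelling mapped to a
        # consonant-initial phoneme string may not follow a vowel phoneme.
        if (letters[0] in VOWEL_LETTERS and phons[0] not in VOWEL_PHONEMES
                and prev_phone in VOWEL_PHONEMES):
            continue
        rest = align(word, phones, i + len(letters), j + len(phons))
        if rest is not None:
            return [(letters, phons)] + rest
        # Otherwise fall through to shorter entries (the back-off step).
    # No table entry worked: try treating the letter as silent ('-').
    letter = word[i]
    if letter in SILENT_LETTERS and not (letter == "r" and prev_letter in VOWEL_LETTERS):
        rest = align(word, phones, i + 1, j)
        if rest is not None:
            return [(letter, "-")] + rest
    return None


def show(pairs):
    """Render an alignment as blank-separated letter and phoneme strings."""
    return " ".join(l for l, _ in pairs), " ".join(p for _, p in pairs)


print(show(align("abutment", "@bVtm@nt")))  # ('a b u t m e n t', '@ b V t m @ n t')
print(align("abuttal", "@bVtm@nt"))         # None: 'al' cannot align with 'm@nt'
print(show(align("knot", "nQt")))           # ('k n o t', '- n Q t')
```

A result of None for a second orthography corresponds to the dictionary check described earlier: the single listed pronunciation cannot be reused for that spelling.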