
Method for syllable calculation of English Language native words for the purpose of providing a language fog index
Disclosure Number: IPCOM000035214D
Original Publication Date: 2005-Jan-20
Included in the Prior Art Database: 2005-Jan-20
Document File: 5 page(s) / 58K

This disclosure describes a method for counting the syllables in a written English word or phrase. Analysing English text is potentially complex, and the language, given its diverse origins, appears subject to arbitrary rules. However, we demonstrate here a systematic approach that reduces the special cases to be handled to a very few. The approach is illustrated with a REXX program. It may readily be applied to calculating the so-called fog index, a measure of the clarity of written English based on the average number of words per sentence and of syllables per word.

This is the abbreviated version, containing approximately 29% of the total text.


Method for syllable calculation of English Language native words for the purpose of providing a language fog index

The fog index relies on calculating the average number of words per sentence and the average number of syllables per word. The former is a trivial calculation; the latter is not. This disclosure presents a working REXX prototype for counting the syllables in a word or sentence, which may readily be extended to calculate the fog index.
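As a rough illustration of the two averages mentioned above, here is a minimal Python sketch. It is not the REXX prototype. The function name `fog_averages`, the sentence and word patterns, and the idea of passing the syllable counter in as a parameter are all assumptions made for the example.

```python
import re
from typing import Callable, Tuple

def fog_averages(text: str, count_syllables: Callable[[str], int]) -> Tuple[float, float]:
    """Return (words per sentence, syllables per word) for `text`."""
    # Assumption: a sentence ends at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters, apostrophes or hyphens.
    words = re.findall(r"[A-Za-z'-]+", text)
    if not sentences or not words:
        return 0.0, 0.0
    syllables = sum(count_syllables(w) for w in words)
    return len(words) / len(sentences), syllables / len(words)

# Stand-in counter that pretends every word has one syllable.
print(fog_averages("The cat sat. It ran away.", lambda w: 1))  # (3.0, 1.0)
```

Any syllable counter can be supplied in place of the stand-in; the hard part is the counter itself.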

    Multiple syllables occur within a word because successive clusters of vowels and consonants interrupt the vocalised sound. Since vowels are in general voiced and consonants are not, an approximation to the syllable count can be found by counting the vowel clusters within a word. I define a vowel cluster as a contiguous sequence of vowels; similarly, I define a consonant cluster as a contiguous sequence of consonants. Thus the word "broken" has the clusters br-o-k-e-n. There are two vowel clusters and two syllables. This is straightforward, but a number of complications need to be accounted for. These are:

    - diphthongs: single vowel clusters that have two syllables, e.g. "dual"
    - the semi-vowel "y", which does not create a new syllable when embedded in a vowel cluster but does when embedded in a consonant cluster, e.g. "stay", "system"
    - the silent final "e"
    - the plural syllable -es, as in "sixes" but not in "bates"
    - the fact that "battle" has 2 syllables while "bathe" has 1
    - the elision of -sed, as in "pleased"
    - voiced consonant combinations, as in "rhythm" (2 syllables)
    - compound words separated by hyphens
    - contracted words using apostrophes
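The basic cluster idea can be sketched in a few lines of Python. This is an illustration only, not the REXX program; the names `split_clusters` and `count_vowel_clusters` are invented for the example, and only the five plain vowels are used, so none of the complications are handled yet.

```python
from itertools import groupby

VOWELS = set("AEIOU")  # plain vowels only; the semi-vowel "y" is not treated here

def split_clusters(word):
    """Split a word into alternating consonant and vowel clusters."""
    return ["".join(group) for _, group in groupby(word.upper(), key=lambda c: c in VOWELS)]

def count_vowel_clusters(word):
    """Count the runs of consecutive vowels in a word."""
    return sum(1 for is_vowel, _ in groupby(word.upper(), key=lambda c: c in VOWELS) if is_vowel)

print(split_clusters("broken"))        # ['BR', 'O', 'K', 'E', 'N']
print(count_vowel_clusters("broken"))  # 2
print(count_vowel_clusters("bathe"))   # 2
```

The raw count for "bathe" is 2 although the word has a single syllable, which is the kind of over-count the silent final "e" produces.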

    The crux of this method lies in how these exceptions are handled in a systematic way.

The procedure is thus:
1) Hyphenated words are handled by treating hyphens as blanks, so that the syllables of the separate words are counted.
2) Semi-vowels and contractions are handled by regarding A, E, I, O, U, Y and ' as vowels. (Note: the apostrophe plays a key role in what follows.)
3) Apostrophe insertion is used to cater for polysyllabic consonant clusters.
4) False diphthongs are handled by removing a vowel.
5) K-insertion is used to cater for polysyllabic vowel clusters.
6) The vowel clusters are counted.
7) -E, -ES and -ED endings are dealt with and the count adjusted accordingly.

Steps 1 and 2 are self-evident.
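Steps 1, 2 and 6 can be sketched together in Python as follows. This is a hedged illustration rather than the REXX prototype: the function name `count_clusters` is an assumption, and steps 3, 4, 5 and 7 are deliberately left out, so the result is only the raw cluster count.

```python
from itertools import groupby

# Step 2: A, E, I, O, U, Y and the apostrophe all count as vowels.
VOWELS = set("AEIOUY'")

def count_clusters(text):
    """Steps 1, 2 and 6 only: raw vowel-cluster count, before any adjustment."""
    total = 0
    # Step 1: hyphens become blanks, so each part is counted as a separate word.
    for word in text.upper().replace("-", " ").split():
        # Step 6: count the runs of consecutive vowels in this word.
        total += sum(1 for is_vowel, _ in groupby(word, key=lambda c: c in VOWELS) if is_vowel)
    return total

print(count_clusters("stay"))        # 1
print(count_clusters("system"))      # 2
print(count_clusters("didn't"))      # 2
print(count_clusters("well-known"))  # 2
print(count_clusters("rhythm"))      # 1
```

The last result shows why step 3 is needed: "rhythm" has two syllables but only one vowel cluster until an apostrophe is inserted.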

Step 3: Polysyllabic Consonant Clusters:

    Before counting the number of vowel clusters we identify polysyllabic consonantal clusters and vowel diphthongs. These are:

    THM   as in RHYTHM   2 syllables instead of 1
    BBLY  as in NOBBLY   3 syllables instead of 2
    DDLY  as in CUDDLY   3 syllables instead of 2

    At the beginning of the additional syllable in these types of cluster we insert an apostrophe thus:
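Apostrophe insertion might look like the following Python sketch. The exact insertion points in the table `POLYSYLLABIC_CLUSTERS` are assumptions chosen for the example, as are the function names; the only requirement taken from the text is that the apostrophe, which counts as a vowel, adds one vowel cluster.

```python
from itertools import groupby

VOWELS = set("AEIOUY'")

# Hypothetical insertion points: assumed for illustration, not taken from the source.
POLYSYLLABIC_CLUSTERS = {"THM": "TH'M", "BBLY": "BB'LY", "DDLY": "DD'LY"}

def insert_apostrophes(word):
    """Step 3: mark the extra syllable inside each listed consonant cluster."""
    word = word.upper()
    for cluster, marked in POLYSYLLABIC_CLUSTERS.items():
        word = word.replace(cluster, marked)
    return word

def count_vowel_clusters(word):
    """Count the runs of consecutive vowels (the apostrophe included)."""
    return sum(1 for is_vowel, _ in groupby(word.upper(), key=lambda c: c in VOWELS) if is_vowel)

for w in ("rhythm", "nobbly", "cuddly"):
    marked = insert_apostrophes(w)
    print(w, count_vowel_clusters(w), marked, count_vowel_clusters(marked))
```

With these assumed insertion points, "rhythm" goes from 1 cluster to 2, and "nobbly" and "cuddly" go from 2 to 3.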

    This action increases the vowel cluster count by one.

Step 4: False Diphthongs:

    Diphthongs are handled similarly, but first we break false diphthongs by removing a vowel from the cluster, thus: TION -> TIN as in ACTION 2 syll...
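The vowel-removal step could be sketched in Python like this. Only the TION -> TIN rule is visible in the text, so the table `FALSE_DIPHTHONGS` holds that single entry; the table and function names are assumptions, and any further rules are left open.

```python
# Only the TION -> TIN rule comes from the text; other rules are unknown here.
FALSE_DIPHTHONGS = {"TION": "TIN"}

def break_false_diphthongs(word):
    """Step 4: drop a vowel from clusters that look polysyllabic but are not."""
    word = word.upper()
    for cluster, reduced in FALSE_DIPHTHONGS.items():
        word = word.replace(cluster, reduced)
    return word

print(break_false_diphthongs("action"))  # ACTIN
print(break_false_diphthongs("nation"))  # NATIN
```

Presumably the point of reducing "IO" to a single vowel is that a later step which splits polysyllabic vowel clusters can no longer mistake it for two syllables.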