Improved Measure of Readability

IP.com Disclosure Number: IPCOM000117008D
Original Publication Date: 1995-Dec-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 75K

Publishing Venue

IBM

Related People

Lewis, JR: AUTHOR

Abstract

Disclosed is a method for assessing the readability of passages of text, in which semantic and syntactic components are combined to form a Reading Difficulty Index. The semantic component is based on a count of infrequent words, while the syntactic component is based on counting sentence structures that require trace processing.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Virtually all conventional methods for assessing the
readability of a passage are based on combinations of a semantic
component, derived from the lengths of words in the passage, and a
syntactic component, derived from the lengths of sentences in the
passage.  On the other hand, research among readers of computer user
guides has shown that reader satisfaction corresponds more closely
with an alternative readability measure, known as the "Cloudiness
Count," which is based on a combination of a semantic component
derived from counting words within a lexicon of "empty" words, and a
syntactic component, based on the number of verbs in the passive
voice.  In this context, "empty words" are examples of a special
group of infrequently used words, sometimes called "abstract" words,
which often appear in business and technical writing without
substantial meaningful content, such as "system" and "documentation."
Such words appear rarely in other forms of English writing and
speech.
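
      The two counts behind a measure of this kind can be illustrated
with a minimal sketch.  Everything in it is an assumption made for
illustration: the lexicon contains only the two example words given
above, the passive check is a crude heuristic (a form of "to be"
immediately followed by a word that looks like a past participle), and
the names EMPTY_WORDS, BE_FORMS, IRREGULAR_PARTICIPLES, and
cloudiness_sketch are hypothetical.  How the published Cloudiness
Count weights or combines the two counts is not stated here, so the
sketch reports them separately.

```python
import re

# Hypothetical lexicon of "empty" words. Only these two examples come
# from the text; a real lexicon would be much larger.
EMPTY_WORDS = {"system", "documentation"}

# Forms of "to be" that can introduce a passive verb.
BE_FORMS = {"am", "is", "are", "was", "were", "be", "been", "being"}

# A few irregular past participles, included purely for illustration.
IRREGULAR_PARTICIPLES = {"written", "given", "shown", "taken", "made", "done"}


def tokenize(text):
    """Lowercase the passage and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_empty_words(tokens):
    """Semantic count: tokens that appear in the empty-word lexicon."""
    return sum(1 for token in tokens if token in EMPTY_WORDS)


def looks_like_participle(token):
    """Crude check: regular '-ed' ending or a listed irregular form."""
    return token.endswith("ed") or token in IRREGULAR_PARTICIPLES


def count_passives(tokens):
    """Syntactic count: 'to be' form directly followed by a likely participle."""
    return sum(
        1
        for previous, current in zip(tokens, tokens[1:])
        if previous in BE_FORMS and looks_like_participle(current)
    )


def cloudiness_sketch(text):
    """Return the raw counts; no weighting or combination is assumed."""
    tokens = tokenize(text)
    return {
        "words": len(tokens),
        "empty_words": count_empty_words(tokens),
        "passives": count_passives(tokens),
    }
```

      A heuristic this simple will miss passives with an intervening
adverb ("was quickly chased") and will miscount adjectival uses ("was
tired"), so it should be read as a sketch of the counting idea rather
than a usable detector.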

      Furthermore, research consistently shows that it is harder for
people to extract the meaning from a passive sentence than from its
active counterpart, having the familiar sequence of subject, verb,
and object, as in "The boy chased the girl."  In a passive sentence,
the logical object comes first and is followed by a verb showing the
passive morphology, in which the past participle of the verb is
preceded by an appropriate form of the verb "to be."  This verb is
optionally followed by the logical subject in a "by" phrase, as in
"The girl was chased by the boy."  It is believed that passive sentences are
harder to process than active sentences because a reader or listene...