Using Keypress Chronology to Spell-Correct Words with Transposed Characters

IP.com Disclosure Number: IPCOM000113095D
Original Publication Date: 1994-Jul-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 6 page(s) / 188K

Publishing Venue

IBM

Related People

Fisher, JO: AUTHOR [+3]

Abstract

Disclosed is an invention which can increase the usability and speed of realtime or batched spell-checking systems which offer spell-correction. The technique relies on the detection of overlapping keypresses while a character sequence is being typed in order to reduce the number of character sequence permutations checked against a list of valid words as part of an anagram search.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 32% of the total text.

      When typing, people often misspell words by transposing
letters.  Perhaps the most famous example of this is mistyping "the"
as "hte".

To clarify further discussion, the following terms are defined:
          dictionary - This is a list of all valid (non-misspelled)
                       words.
whitespace character - This is a character which is not part of
                       any word in the dictionary.  Examples could be
                       characters such as period (.), comma (,), or
                       asterisk (*).
           protoword - This is a character sequence without
                       whitespace characters.
                word - This is a protoword that is in the dictionary.
            non-word - This is a protoword that is NOT in the
                       dictionary.
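
The definitions above can be illustrated with a minimal sketch.  The
toy DICTIONARY and the helper names split_protowords and classify are
hypothetical, introduced here only for illustration; they do not
appear in the disclosure.

```python
# Hypothetical toy dictionary; a real system would load a full word list.
DICTIONARY = {"THE", "LIVES", "EVILS", "VEILS"}

# Every character that appears in some dictionary word.  By the definition
# above, any character outside this set is a whitespace character.
WORD_CHARS = set("".join(DICTIONARY))


def split_protowords(text):
    """Split text into protowords: maximal runs of non-whitespace characters."""
    protowords, current = [], []
    for ch in text:
        if ch in WORD_CHARS:
            current.append(ch)
        elif current:
            # A whitespace character ends the protoword being collected.
            protowords.append("".join(current))
            current = []
    if current:
        protowords.append("".join(current))
    return protowords


def classify(protoword):
    """Return 'word' if the protoword is in the dictionary, else 'non-word'."""
    return "word" if protoword in DICTIONARY else "non-word"
```

Under these assumptions, the blank and the comma both act as
whitespace characters, so split_protowords("HTE ILVSE, LIVES.")
yields three protowords, of which only the last classifies as a word.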

      A typical spell-check algorithm might check each protoword
typed by the user against the dictionary, categorizing each protoword
as a word or non-word.  If the protoword is a word, it is assumed
correct and left alone; if it is a non-word, the algorithm might
attempt an "anagram search", which permutes the protoword into all
unique orderings of its characters, generating secondary protowords.
The list of secondary protowords is called the "Spell-Check List".
The algorithm will then consult the dictionary to laboriously
categorize each secondary protoword as a secondary word or secondary
non-word.  All secondary words are collected into a list of candidate
words, which is called the "User Choice List".  Typically, the
algorithm will let the user select the intended word from this User
Choice List.
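
The "typical" anagram search described above might be sketched as
follows.  This is an illustration under stated assumptions, not the
disclosed implementation: the toy DICTIONARY and the function names
spell_check_list and user_choice_list are hypothetical.

```python
from itertools import permutations

# Hypothetical toy dictionary for illustration only.
DICTIONARY = {"THE", "LIVES", "EVILS", "VEILS"}


def spell_check_list(protoword):
    """All unique orderings of the protoword's characters (secondary protowords)."""
    # The set removes duplicates that arise when a character repeats.
    return sorted({"".join(p) for p in permutations(protoword)})


def user_choice_list(protoword):
    """Secondary protowords found in the dictionary (secondary words)."""
    return [w for w in spell_check_list(protoword) if w in DICTIONARY]
```

With this toy dictionary, spell_check_list("ILVSE") holds 120
secondary protowords, every one of which is looked up before the
user is offered a choice.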

      Fig. 1 illustrates this "typical" technique, using the
misspelling "ILVSE" for the intended word "LIVES".  Shown are the
Spell-Check List 100 and the User Choice List 101 for this
misspelling.

This "typical" method has the following disadvantages:

Blind - This method does not take into account the proximity of
transposed letter keypresses, which, as shall be shown, can be used
to advantage.

Time Consuming - The number of unique permutations for a protoword is
m! / (c1! * c2! * ... * ck!), where m is the number of characters in
the protoword and c1..ck are the counts of each character that
appears in the protoword more than once.
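
The permutation count above can be computed directly.  The following
is a minimal sketch; the function name unique_permutation_count is
hypothetical.

```python
from collections import Counter
from math import factorial


def unique_permutation_count(protoword):
    """Return m! divided by the product of c! over every character count c."""
    total = factorial(len(protoword))
    for count in Counter(protoword).values():
        # Characters that appear once contribute 1! = 1 and change nothing.
        total //= factorial(count)
    return total
```

For a five-character protoword with no repeated characters, such as
"ILVSE", this gives 5! = 120; for "LETTER", where E and T each appear
twice, it gives 6!/(2! * 2!) = 180.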

      As exampl...