
Improved spell checking algorithm for email and word processors

IP.com Disclosure Number: IPCOM000013123D
Original Publication Date: 2003-Jun-13
Included in the Prior Art Database: 2003-Jun-13
Document File: 2 page(s) / 44K

Publishing Venue

IBM

Abstract

The goal of the spell check function is to identify only the words that are misspelled. Doing a spell check in Lotus Notes becomes frustrating because many words that are correct are flagged as incorrect. This is due to the existing paradigm, in which words are cross referenced only against a single dictionary known to be correct. Usually, this dictionary is shipped with the product. A new paradigm is needed that would improve the spell checker's ability to correctly identify misspelled words.




    This invention relies on the use of multiple "trusted" data sources to aid in determining which words are spelled correctly. Today, the spelling data set (collection of words) is the only data set that is trusted to have correct spelling.

For the purpose of this document, the accuracy of the spell checker is defined as the following ratio:

\[
\text{Spell Checker Accuracy (\%)} = \frac{\text{number of words changed by the user via the spell checking GUI}}{\text{total words flagged as incorrect by the spell checker}} \times 100
\]

Duplicate words flagged by the spell checker are counted only once, as are duplicate words changed by the user. It is impossible for the dictionary to be fully up to date, since human language is constantly changing and new words are constantly being created. Instead, spell checkers need to rely on additional sources to improve their accuracy.
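The accuracy ratio defined above, with duplicates counted only once, could be computed along the following lines. This is only an illustrative sketch: the function name `spell_checker_accuracy` and its two list arguments are hypothetical and do not come from the disclosure, and returning `None` when nothing was flagged is an assumption, since the ratio is undefined in that case.

```python
def spell_checker_accuracy(flagged_words, changed_words):
    """Return accuracy (%) = unique words changed / unique words flagged x 100.

    Hypothetical helper for illustration; not part of the disclosure.
    """
    # Duplicate flagged words are counted only once.
    flagged = set(flagged_words)
    # Duplicate changed words are counted only once; only words that were
    # actually flagged can count as changed via the spell checking GUI.
    changed = set(changed_words) & flagged
    if not flagged:
        # Assumption: the ratio is undefined when nothing was flagged.
        return None
    return len(changed) / len(flagged) * 100


flagged = ["Lotus", "teh", "Notes", "teh", "recieve"]  # 4 unique words
changed = ["teh", "teh", "recieve"]                    # 2 unique words
print(spell_checker_accuracy(flagged, changed))        # 2 / 4 x 100
```

In this made-up example, "Lotus" and "Notes" are correct words that the spell checker flagged anyway, which is exactly the kind of false flag that lowers the ratio.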

This disclosure does not do away with the spell checker's "trusted" dictionary, but rather suggests that when it is supplemented with data from other "trusted" sources, the accuracy of the spell checker will improve. Since the vocabulary is different for each user, one way to improve accuracy for a user is to rely on "trusted" sources that the user manages (e.g. the email address book). The logical sequence of steps for the spell checker should be as follows:

1. The spell checker is invoked by manual or automated means.
2. Access the first data source from the list of data sources.
3. Cross reference the "misspelled" words with the data source to narrow the list of misspelled words.
4. Any words not found in the data source ("misspelled")?
   No: All words are spelled correctly (done).
   Yes: Go to step 5.
5. Any more data sources to search?
   Yes: Access the next data source from the list of data sources and return to step 3.
   No: Invoke the spell check GUI as done today.
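The sequence of steps above can be sketched as a loop over an ordered list of trusted data sources. This is a minimal sketch under stated assumptions, not the disclosure's implementation: the function name `find_misspelled`, the representation of each data source as a set of lowercase words, and the case-insensitive comparison are all hypothetical choices made for illustration.

```python
def find_misspelled(words, data_sources):
    """Narrow the list of "misspelled" words against each trusted source in turn.

    Hypothetical helper: `data_sources` is an ordered list of sets of
    lowercase words (e.g. the shipped dictionary, then an address book).
    """
    # Every distinct word starts out as a candidate until a source vouches for it.
    candidates = list(dict.fromkeys(words))
    for source in data_sources:
        # Cross reference the remaining candidates with this data source.
        # Assumption: comparison ignores case.
        candidates = [w for w in candidates if w.lower() not in source]
        if not candidates:
            # No words left unaccounted for: all words are spelled correctly.
            break
    # Anything still listed would be handed to the spell check GUI.
    return candidates


dictionary = {"please", "send", "the", "report", "to"}  # shipped dictionary
address_book = {"raghavan"}                             # user-managed source
text = "Please send the report to Raghavan teh".split()
print(find_misspelled(text, [dictionary, address_book]))
```

With the dictionary alone, both "Raghavan" and "teh" would be flagged in this made-up example. Consulting the address book as a second trusted source removes the name and leaves only the genuine misspelling for the GUI.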
