Method For Recognizing Character String Similarity

IP.com Disclosure Number: IPCOM000078231D
Original Publication Date: 1972-Dec-01
Included in the Prior Art Database: 2005-Feb-25
Document File: 1 page(s) / 11K

Publishing Venue

IBM

Related People

Kellerman, E: AUTHOR

Abstract

This improvement applies to an algorithm for recognizing character string similarity.

The improvement is an added calculation that is required when the character strings being compared differ greatly in length. In some cases, such length differences cause the original algorithm to produce unwarrantedly high correlation coefficients.

The added calculation consists of multiplying the correlation coefficient obtained by the original algorithm by a correction factor: the fraction of letters matched in the longer string, raised to a power. The power should be larger than one if the percentage of letters matched is to be given a large weight, and smaller than one if it is to be given a small weight. The actual power used is to be determined by the specific application to which the algorithm is applied.
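The correction can be illustrated with a minimal sketch. The disclosure does not specify the original algorithm or how matched letters are counted, so both are treated here as inputs supplied by the caller. The function name `adjusted_similarity` and its parameter names are hypothetical, chosen only for illustration.

```python
def adjusted_similarity(base_coefficient, matched_letters, longer_length, power):
    """Scale a similarity coefficient by the matched fraction of the longer string.

    base_coefficient: value produced by the original algorithm (not specified
                      in the disclosure; assumed to be supplied by the caller).
    matched_letters:  number of letters of the longer string that were matched
                      (the counting rule is an assumption left to the caller).
    longer_length:    length of the longer of the two strings.
    power:            application-specific exponent; > 1 gives the matched
                      fraction a large weight, < 1 gives it a small weight.
    """
    if longer_length <= 0:
        raise ValueError("longer_length must be positive")
    if not 0 <= matched_letters <= longer_length:
        raise ValueError("matched_letters must lie between 0 and longer_length")

    matched_fraction = matched_letters / longer_length
    return base_coefficient * matched_fraction ** power


# Example: a 3-letter string fully matched inside a 12-letter string, where the
# original algorithm (hypothetically) reported a coefficient of 0.9.
heavy = adjusted_similarity(0.9, 3, 12, power=2.0)   # 0.9 * 0.25**2, about 0.056
light = adjusted_similarity(0.9, 3, 12, power=0.5)   # 0.9 * 0.25**0.5, about 0.45
print(heavy, light)
```

With the larger power the length mismatch suppresses the coefficient strongly, and with the smaller power it is reduced only moderately. When every letter of the longer string is matched, the fraction is one and the coefficient is unchanged for any power.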
