
Algorithm for the Detection of String Mapping Functions

IP.com Disclosure Number: IPCOM000081257D
Original Publication Date: 1974-Apr-01
Included in the Prior Art Database: 2005-Feb-27
Document File: 3 page(s) / 65K

Publishing Venue

IBM

Related People

Damerau, FJ: AUTHOR

Abstract

The algorithm shown in flow chart form will, when given a list of pairs, each member of which is a string of characters, determine if, for any of the pairs, one member can be derived from the other by a set of substitutions of groups of characters from one string for groups from the other string. The substitutions are constrained by the requirement that each substitution must occur in the derivation sequence of more than one string pair.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 68% of the total text.



The need for such an algorithm arises naturally in the comparison of related languages, from which the example below is drawn, but the algorithm is believed to be potentially useful in other applications in artificial intelligence, such as machine learning and computer inference. The common characteristic of these problems is that the computer program is required to detect relationships between two sets of data, given only primary data and general guiding principles for finding such relationships, i.e., principles that are independent of the data given as input.

Consider the following set of word pairs from Dutch and Swedish. Although the languages are different, they are related to each other by virtue of being divergent developments from a common ancestral language. As a result, some of the words in these languages are related to each other in systematic ways.

 1. arm:arm          a:a - r:r - m:m
 2. dag:dag          d:d - a:a - g:g
 3. goed:god         g:g - oe:o - d:d
                     g:g - o:o - ed:d
 4. wit:vit          w:v - i:i - t:t
 5. tijd:tid         t:t - ij:i - d:d
 6. wind:vind        w:v - i:i - n:n - d:d
 7. moeder:moder     m:m - oe:o - d:d - e:e - r:r
                     m:m - o:o - ed:d - e:e - r:r
                     m:m - o:o - e:d - de:e - r:r
                     m:m - o:o - e:d - d:e - er:r
 8. broeder:broder   b:b - r:r - oe:o - d:d - e:e - r:r
                     b:b - r:r - o:o - ed:d - e:e - r:r
                     b:b - r:r - o:o - e:d - de:e - r:r
                     b:b - r:r - o:o - e:d - d:e - er:r
 9. bloed:blod       b:b - l:l - oe:o - d:d
                     b:b - l:l - o:o - ed:d
10. bij:bi           b:b - ij:i
11. blind:blind      b:b - l:l - i:i - n:n - d:d

On the right are the sets of substitutions which relate one word to another. The substitutions with their frequencies are:

 1. a:a   2        11. t:t   2
 2. r:r   3        12. ij:i  2
 3. m:m   2        13. n:n   2
 4. d:d   8        14. e:e   2
 5. g:g   2...
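
The derivations and frequencies above can be reproduced with a short program. The following Python sketch is an illustration only, not the flow-chart algorithm of the disclosure, which is not reproduced in this text. It rests on two assumptions taken from the example rather than from the source: every right-hand group is a single character, and every left-hand group is one or two characters. All function names (candidates, pair_frequencies, supported_derivations, show) are hypothetical. The sketch enumerates every left-to-right derivation of each pair, counts for each substitution the number of pairs in which it occurs, and keeps only the derivations in which every substitution occurs in more than one pair.

```python
from collections import defaultdict

# Word pairs copied from the example above (Dutch, Swedish).
PAIRS = [
    ("arm", "arm"), ("dag", "dag"), ("goed", "god"), ("wit", "vit"),
    ("tijd", "tid"), ("wind", "vind"), ("moeder", "moder"),
    ("broeder", "broder"), ("bloed", "blod"), ("bij", "bi"),
    ("blind", "blind"),
]


def candidates(src, dst, max_left=2):
    """Yield every left-to-right derivation of dst from src.

    Assumption (not from the source): each right-hand group is exactly one
    character and each left-hand group is 1..max_left characters.
    """
    if not src and not dst:
        yield ()          # both strings consumed: one complete derivation
        return
    if not src or not dst:
        return            # one string exhausted early: dead end
    for k in range(1, min(max_left, len(src)) + 1):
        head = (src[:k], dst[0])
        for rest in candidates(src[k:], dst[1:], max_left):
            yield (head,) + rest


def pair_frequencies(pairs):
    """Map each substitution to the number of pairs whose candidates use it."""
    seen_in = defaultdict(set)
    for index, (src, dst) in enumerate(pairs):
        for derivation in candidates(src, dst):
            for substitution in derivation:
                seen_in[substitution].add(index)
    return {sub: len(indexes) for sub, indexes in seen_in.items()}


def supported_derivations(pairs):
    """Keep derivations whose every substitution occurs in more than one pair."""
    freq = pair_frequencies(pairs)
    return {
        (src, dst): [d for d in candidates(src, dst)
                     if all(freq[s] > 1 for s in d)]
        for src, dst in pairs
    }


def show(derivation):
    """Format a derivation in the notation of the example, e.g. 'b:b - ij:i'."""
    return " - ".join(f"{left}:{right}" for left, right in derivation)


if __name__ == "__main__":
    for (src, dst), kept in supported_derivations(PAIRS).items():
        for derivation in kept:
            print(f"{src}:{dst}  {show(derivation)}")
```

Under these assumptions, the filter keeps four derivations for moeder:moder and one for tijd:tid, and the pair counts for d:d, r:r and ij:i come out as 8, 3 and 2, in agreement with the lists above. Counting pairs rather than occurrences is a design choice of the sketch: r:r occurs twice in broeder:broder but is counted once for that pair.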