Method and System for Identifying Related Entities Based on Entities’ Names

IP.com Disclosure Number: IPCOM000191967D
Original Publication Date: 2010-Jan-19
Included in the Prior Art Database: 2010-Jan-19
Document File: 3 page(s) / 60K

Publishing Venue

IBM

Abstract

A method and system are disclosed for identifying related entities based on the names of the entities. The method groups entities with similar names: the names are analyzed, and entities are placed in the same group based on the similarity between their names.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.



Disclosed is a method and system for identifying related entities based on the names of the entities. The method groups entities with similar names: the names are analyzed, and entities are placed in the same group based on the similarity between their names. An entity can be, for example, a file in a file system, a music file in a music library, a record in a database, or a resource in a multiple shared resource system.

Consider an exemplary scenario in which the entities are files in a file system. The method analyzes the file names of those files.

 As illustrated in Fig. 1, the method begins analysis by removing version information contained in a file name of a file to obtain a group name for the file. Thereafter, one or more files with the same group name are grouped together.
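The grouping step described above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the function names `group_by_name` and `toy_group_name` are hypothetical, and `toy_group_name` is only a stand-in that strips a trailing "v<digits>" marker so that the grouping logic can be shown in isolation.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def toy_group_name(file_name: str) -> str:
    """Hypothetical stand-in for the name analysis: drop a trailing
    'v<digits>' version marker and upper-case the rest."""
    return re.sub(r"\s*v\d+$", "", file_name, flags=re.IGNORECASE).upper()


def group_by_name(
    file_names: Iterable[str],
    group_name: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Place every file whose derived group name is identical in one group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in file_names:
        groups[group_name(name)].append(name)
    return dict(groups)


groups = group_by_name(["Budget v1", "Budget V2", "Notes v1"], toy_group_name)
# "Budget v1" and "Budget V2" share the group name "BUDGET";
# "Notes v1" is alone in the group "NOTES".
```

Passing the name analysis in as a function keeps the grouping independent of how version information is actually removed.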

Fig. 1

As shown in Fig. 1, for each file name, special characters are either removed or changed to blanks. The special characters can be one or more of _|+-()[]{}/\"~`!@#$%^*?:;,'. For example, consider the file name [20080915]_Project Scorpion Business-Case ID55 v2.34 Rev F#2.xls. In accordance with the method, after the special characters are removed or replaced, the file name becomes 20080915 Project Scorpion Business Case ID55 v2 34 Rev F 2 xls.

Thereafter, the modified file name is translated to upper case and multiple spaces are removed to produce a set of single words. Accordingly, the file name is further modified to 20080915 PROJECT SCORPION BUSINESS CASE ID55 V2 34 REV F 2 XLS.

Moving on, one or more digits present in the file name may optionally be removed to further modify the file name. Subsequently, one or more single-character words may be removed from the set of words. As a result, the file name is refined to PROJECT SCORPION BUSINESS CASE ID55 REV XLS.
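The steps up to this point can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation. The function names are hypothetical; the input file name is reconstructed from the fragments of the worked example; the period is treated as a special character because the example blanks it out; and, since the text does not spell out the exact digit rule, the sketch assumes that a word is dropped when fewer than two non-digit characters remain in it. That assumption happens to reproduce the example, where 20080915, V2, 34, F and 2 disappear while ID55 is kept.

```python
import re
from typing import List

# Assumption: the character list from the text, with '.' included because
# the worked example also turns periods into blanks.
SPECIAL_CHARS = "_|+-()[]{}/\\\"~`!@#$%^*?:;,'."


def to_words(file_name: str) -> List[str]:
    """Blank out special characters, upper-case, and split on whitespace."""
    blanked = "".join(" " if ch in SPECIAL_CHARS else ch for ch in file_name)
    # str.split() with no argument also collapses runs of blanks.
    return blanked.upper().split()


def drop_short_and_numeric(words: List[str]) -> List[str]:
    """Assumed rule: keep a word only if at least two non-digit characters
    remain after its digits are ignored."""
    return [w for w in words if len(re.sub(r"\d", "", w)) >= 2]


name = "[20080915]_Project Scorpion Business-Case ID55 v2.34 Rev F#2.xls"
words = to_words(name)
# ['20080915', 'PROJECT', 'SCORPION', 'BUSINESS', 'CASE', 'ID55',
#  'V2', '34', 'REV', 'F', '2', 'XLS']
refined = " ".join(drop_short_and_numeric(words))
# 'PROJECT SCORPION BUSINESS CASE ID55 REV XLS'
```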

In one instance, words that belong to a predefined set of special words may also be removed from the set of words in the file name. For example, the predefined set of special words may include one or more of JAN, JANUARY, FEB, FEBRUARY, MAR, MARCH, APR, APRIL, MAY, JUN, JUNE, JUL, JULY, AUG, AUGUST, SEP, SEPT, SEPTEMBER, OCT, OCTOBER, NOV, NOVEMBER, DEC, DECEMBER, V, VER, VERSION, REV, REVISION, and ITERATION.
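The special-word filter can be sketched as a simple set lookup. The names `SPECIAL_WORDS` and `drop_special_words` are hypothetical; the word list is the example list given above, and the sketch assumes the words have already been upper-cased.

```python
from typing import List

# Example set of special words taken from the text (month names and
# version markers), assumed to be compared against upper-cased words.
SPECIAL_WORDS = frozenset({
    "JAN", "JANUARY", "FEB", "FEBRUARY", "MAR", "MARCH", "APR", "APRIL",
    "MAY", "JUN", "JUNE", "JUL", "JULY", "AUG", "AUGUST", "SEP", "SEPT",
    "SEPTEMBER", "OCT", "OCTOBER", "NOV", "NOVEMBER", "DEC", "DECEMBER",
    "V", "VER", "VERSION", "REV", "REVISION", "ITERATION",
})


def drop_special_words(words: List[str]) -> List[str]:
    """Remove every word that appears in the predefined special-word set."""
    return [w for w in words if w not in SPECIAL_WORDS]


words = "PROJECT SCORPION BUSINESS CASE ID55 REV XLS".split()
group_name = " ".join(drop_special_words(words))
# 'PROJECT SCORPION BUSINESS CASE ID55 XLS'
```

A set lookup is exact-match only, so a word such as MAY is removed wherever it appears, including in names where it is not a month.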

Taking into account the above steps, the file name with final modification is PROJECT SCORPION BUSINESS CASE ID55 XLS. In accordance with the method, the fi...