Search result ranking method using access history and access control list

IP.com Disclosure Number: IPCOM000034139D
Original Publication Date: 2005-Jan-18
Included in the Prior Art Database: 2005-Jan-18
Document File: 2 page(s) / 22K

Publishing Venue

IBM

Abstract

Disclosed is a method for calculating document relevancy scores in information retrieval systems. The score calculation is based on each document's access history and its access control list. The method can be used together with a traditional term frequency-dependent score calculation method to improve the document relevance score.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Most conventional information retrieval systems calculate document relevance scores with a term frequency-dependent method or a web-specialized method that depends on document reference counts. These methods work well for public web-based search systems. But on corporate search systems whose search target is composed of several different kinds of middleware (groupware, relational database management systems, etc.), these methods either cannot be used or may calculate biased scores, because document characteristics differ from one kind of middleware to another. The proposed method offers an additional way for the search system to calculate document relevancy scores, by introducing access count-dependent weights for documents.

Many information retrieval systems perform two operations, called Crawling and Indexing. Crawling gathers text data from the search target middleware, and Indexing creates search indexes from the gathered data. These two operations are performed periodically to reflect changes in the search target middleware. Meanwhile, middleware such as relational database management systems and groupware has access control lists that restrict which users may access a document, and a log facility that records users' document access history.

In general, a document that is accessed many times and by many users can be regarded as an important document. Therefore, an access count-dependent document relevancy score can be calculated by combining the following two characteristics.
1) Information retrieval systems perform Crawling periodically.
2) Search target middleware has an access log facility and access control lists.

The document relevancy score S of a document D contained in search target system T can be calculated as S(D) = sum[n(D)/n(T)]/N, where sum[n(D)/n(T)] is the sum total of each user's n(D)/n(T), n(D) is a user's access count to D during the Crawling interval, n(T) is...
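
The formula can be illustrated with a minimal Python sketch. The available text is cut off before n(T) and N are defined, so the sketch rests on two assumptions that are not confirmed by the source: n(T) is taken to be a user's total access count to all documents in T during the Crawling interval, and N is taken to be the number of users that the access control list permits to read D. The function name access_scores and the shapes of access_log and acl are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def access_scores(access_log, acl):
    """Compute S(D) = sum[n(D)/n(T)] / N for every document in acl.

    access_log: iterable of (user, doc) access events recorded during
                one Crawling interval (hypothetical log format).
    acl:        dict mapping doc -> set of users allowed to read it.

    ASSUMPTIONS (the source text is truncated here):
      n(T) = a user's total accesses to system T in the interval
      N    = number of users permitted to access D by its ACL
    """
    n_doc = defaultdict(int)    # n(D) per (user, doc)
    n_total = defaultdict(int)  # assumed n(T) per user

    for user, doc in access_log:
        n_doc[(user, doc)] += 1
        n_total[user] += 1

    scores = {}
    for doc, allowed in acl.items():
        if not allowed:
            # No permitted users: define the score as zero.
            scores[doc] = 0.0
            continue
        # Sum each permitted user's share of accesses that went to doc.
        # Users with no accesses at all contribute nothing.
        share_sum = sum(
            n_doc[(user, doc)] / n_total[user]
            for user in allowed
            if n_total[user] > 0
        )
        scores[doc] = share_sum / len(allowed)  # divide by assumed N
    return scores


if __name__ == "__main__":
    log = [("alice", "d1"), ("alice", "d1"), ("alice", "d2"), ("bob", "d1")]
    acl = {"d1": {"alice", "bob"}, "d2": {"alice", "bob", "carol"}}
    for doc, score in sorted(access_scores(log, acl).items()):
        print(doc, round(score, 4))
```

Under these assumptions, a document that only a few users may read is not penalized for having few readers, because the sum is divided by the number of permitted users and not by the total user population. A score computed this way could then be combined with a term frequency-dependent score, as the abstract suggests; the available text does not say how the two are combined.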