A method of improving search precision based on documents selected by user.

IP.com Disclosure Number: IPCOM000032013D
Original Publication Date: 2004-Oct-19
Included in the Prior Art Database: 2004-Oct-19
Document File: 3 page(s) / 78K

Publishing Venue

IBM

Abstract

The proposed method improves search precision by collecting features from the documents that the user considers relevant. It thereby gives the user simpler yet more effective control over the search refinement process, without requiring special skill in formulating queries or an understanding of the categories associated with the desired information.
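
The idea of collecting features from user-selected documents can be illustrated with a minimal sketch. The disclosure's own feature definition and ranking scheme are not given in this excerpt, so everything below is an assumption made for illustration: features are taken to be plain term counts, and the names `collect_features`, `rerank`, and `STOPWORDS` are hypothetical.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the disclosure).
STOPWORDS = {"the", "of", "a", "and", "in", "for", "to"}


def collect_features(selected_docs, top_n=5):
    """Count non-stopword terms across the documents the user marked relevant."""
    counts = Counter()
    for text in selected_docs:
        counts.update(t for t in text.lower().split() if t not in STOPWORDS)
    # Keep only the top_n most frequent terms as the feature set.
    return dict(counts.most_common(top_n))


def rerank(results, features):
    """Order results by the summed weight of the collected features they contain."""
    def score(text):
        terms = set(text.lower().split())
        return sum(weight for term, weight in features.items() if term in terms)
    # sorted() is stable, so results with equal scores keep their original order.
    return sorted(results, key=score, reverse=True)
```

In this sketch the user never edits the query: marking one or more documents as relevant is enough to produce a feature set, which is then used to reorder the remaining results.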

This is the abbreviated version, containing approximately 53% of the total text.

Search engines provide two basic methods for refining a search to increase precision and selectivity: (1) modifying the existing query by adding or changing query terms, or (2) narrowing the search scope by searching within a narrower category or domain. Both methods are widely used by Web search engines (see Google.com, About.com, etc.) as well as corporate search engines (see Ibm.com/support, Microsoft.com). Most search engines use predefined categories to narrow the search domain. More sophisticated search engines (see Vivisimo.com) let a user select the search results associated with one of several dynamically identified categories, using run-time result clustering technology.
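
The two refinement methods can be contrasted on a toy corpus. This is only a sketch under stated assumptions: the `Doc` type, the corpus contents, the category labels, and the simple AND-matching in `search` are all hypothetical and do not describe how any of the named search engines work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    category: str  # hypothetical predefined category label
    text: str


# Toy corpus; all titles, categories, and texts are made up for illustration.
CORPUS = [
    Doc("Monte Carlo casino guide", "travel", "monte carlo casino hotels and travel tips"),
    Doc("Monte Carlo method overview", "science", "monte carlo method random sampling algorithm"),
    Doc("Monte Carlo rally history", "sports", "monte carlo rally racing history"),
]


def search(query, docs):
    """Return docs whose text contains every query term (simple AND matching)."""
    terms = query.lower().split()
    return [d for d in docs if all(t in d.text.split() for t in terms)]


def refine_by_terms(query, extra_terms, docs):
    """Method (1): narrow the result set by adding terms to the query."""
    return search(query + " " + " ".join(extra_terms), docs)


def refine_by_category(query, category, docs):
    """Method (2): narrow the result set by restricting the search scope."""
    return search(query, [d for d in docs if d.category == category])
```

Calling `search("monte carlo", CORPUS)` matches all three toy documents. Either `refine_by_terms("monte carlo", ["algorithm"], CORPUS)` or `refine_by_category("monte carlo", "science", CORPUS)` narrows this to the one document about the method, but the first requires the user to guess a useful term and the second requires the user to know which category applies.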

The drawback of the first method is that it often requires deep knowledge of the search subject and/or experience in formulating complex queries. The second method assumes an understanding of the underlying concepts and categories, and limits the possible refinements to the set of predefined or dynamically identified categories or domains. These drawbacks significantly limit the ability of general users to find exactly what they need, resulting in a lower level of user goal attainment.

As an example, consider a user who wants to find a mathematical description and a programmable algorithm of the Monte Carlo method, which is widely used in statistical modeling. The user submits the first query, 'monte-carlo method', and gets a large number of results (about 299,000 in Google). In the next steps, the user needs to add query terms (in Google or other engines) or look through several different nodes in the category tree (in Vivisimo.com) to find relevant documents among the search results. The user may try many different additional terms, such as 'theory', 'description', or 'algorithm', until the relevant documents are found. After each attempt the user needs to open and read several documents to check whether the desired information is there. The success of the search, as well as the number of attempts, depends on the user's ability to formulate appropriate query terms or to find the category that contains the desired documents. After the first relevant document is found, the user may want to see more documents that contain similar or more detailed information. Some search engines, like Google.com, let the user see so-called similar/related pages, but the similarity is defined in terms of the...