Methodology for Crawling Client Side Search Engine Validation for Browser Based Applications

IP.com Disclosure Number: IPCOM000028968D
Original Publication Date: 2004-Jun-09
Included in the Prior Art Database: 2004-Jun-09
Document File: 2 page(s) / 32K

Publishing Venue

IBM

Abstract

This article covers client side search engine validation for browser based applications.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

The problem addressed is that search engines return stale Uniform Resource Locators (URLs) to a web browser, so a user may have to visit several stale links in a row before finding a valid one. No matter which search engine is used, some of the returned links are bound to be invalid and produce an error or a "page not found" response when the user visits them. Search engines build databases of World Wide Web pages through registration-based databases, or through crawlers and deep crawlers. These databases have to be reindexed from time to time to rid them of invalid, or stale, pages, and rebuilding a large database can take a long time. Because reindexing is infrequent, pages may go stale within a few days, and the search engine will then return stale links. The user discovers that a page is stale only after following the returned link. The user then has to visit the next link, which might also be stale, and so on.

The core idea relies on a search engine returning its raw results to a browser-based application. The application stores the results in a local database, either disk or memory resident, then validates the links and returns the first valid link when an operation fails. This concept also requires a search engine that returns raw results. How that search engine is implemented is not important, only that it returns the raw results to the browser-based application, which stores them in its local database.

The browser-based application stores the raw results of a search engine query in a local database, either disk or memory resident, and then validates the links and returns the first valid link when an operation fails.
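The store-then-validate flow described above can be illustrated with a minimal sketch. It is not the disclosure's implementation: the function names, the table layout, the use of SQLite as the local database, and the choice of an HTTP HEAD request as the validity test are all assumptions made for illustration. The link checker is passed in as a parameter so that a different validity test can be substituted.

```python
import sqlite3
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def is_link_valid(url: str, timeout: float = 5.0) -> bool:
    """Assumed validity test: the URL answers a HEAD request without an error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors, unreachable hosts, timeouts, and malformed URLs
        # are all treated as stale links.
        return False


def store_raw_results(conn: sqlite3.Connection, urls: Iterable[str]) -> None:
    """Store the raw search results locally, keeping their original order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "pos INTEGER PRIMARY KEY, url TEXT NOT NULL, valid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO results (pos, url) VALUES (?, ?)",
        list(enumerate(urls, start=1)),
    )
    conn.commit()


def first_valid_link(
    conn: sqlite3.Connection,
    start_pos: int,
    check: Callable[[str], bool] = is_link_valid,
) -> Optional[str]:
    """Validate stored links in order from start_pos; return the first valid one."""
    rows = conn.execute(
        "SELECT pos, url, valid FROM results WHERE pos >= ? ORDER BY pos",
        (start_pos,),
    ).fetchall()
    found = None
    for pos, url, valid in rows:
        if valid is None:
            # Not checked yet: validate now and remember the outcome.
            valid = int(check(url))
            conn.execute("UPDATE results SET valid = ? WHERE pos = ?", (valid, pos))
        if valid:
            found = url
            break
    conn.commit()
    return found


if __name__ == "__main__":
    # ":memory:" gives a memory-resident database; a file path gives a disk one.
    db = sqlite3.connect(":memory:")
    store_raw_results(db, ["http://a.example/stale", "http://b.example/ok"])
    # A stand-in checker is used here so the demo needs no network access.
    print(first_valid_link(db, 1, check=lambda u: u.endswith("/ok")))
```

In this sketch each validation outcome is written back to the local table, so a link that has already been found stale is not checked again on a later call. Whether the original design caches outcomes in this way is not stated in the text.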

To implement, the search results page would be displ...