World Wide Web Search Architecture

IP.com Disclosure Number: IPCOM000016019D
Original Publication Date: 2002-Aug-15
Included in the Prior Art Database: 2003-Jun-21

Publishing Venue

IBM

Abstract

Problem Statement

One of the most common methods of locating information on the World Wide Web is to use a search engine. [Other methods include those based on pre-organised and often human-maintained classification into catalogues or directories.] Search engines typically use a brute-force web crawling technique: identifying web sites, traversing hyperlinks, retrieving web pages, and generating index metadata. However, the effectiveness of search engines is decreasing, and will continue to decrease over time, due to the following two key factors:

The volume of information (new and modified) to be indexed is increasing rapidly. This increases the delay between new web pages being published on the web and their becoming available via search engines, increases network bandwidth requirements, and increases the storage and processing power requirements of search engines.
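
To make the brute-force crawling technique described in the problem statement concrete, the following is a minimal sketch, not the method of any particular search engine. It assumes a tiny in-memory "web" (the hypothetical `FAKE_WEB` dictionary and its `example` URLs) in place of real network retrieval, and the names `LinkAndTextParser` and `crawl` are illustrative only. It traverses hyperlinks breadth-first, retrieves each page once, and generates a simple inverted index mapping each word to the pages that contain it.

```python
from collections import deque
from html.parser import HTMLParser
import re


class LinkAndTextParser(HTMLParser):
    """Collects hyperlink targets and lower-cased words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Record the href of every anchor tag so the crawler can follow it.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Split visible text into simple alphanumeric tokens.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(seed_urls, fetch):
    """Breadth-first crawl; returns an inverted index: word -> set of URLs.

    `fetch` is any callable that maps a URL to its HTML, or None if the
    page cannot be retrieved (an assumption made for this sketch).
    """
    index = {}
    seen = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unreachable or missing page
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:  # visit each page at most once
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical three-page "web" held in memory instead of real HTTP requests.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">next</a> search engines',
    "http://b.example/": '<a href="http://c.example/">more</a> web crawling',
    "http://c.example/": '<a href="http://a.example/">back</a> index metadata',
}

index = crawl(["http://a.example/"], FAKE_WEB.get)
print(sorted(index["crawling"]))  # ['http://b.example/']
```

Even in this toy form, every page must be fetched and parsed in full before it can be indexed, so the crawler's bandwidth, storage, and processing costs grow with the volume of pages to be indexed, as the first factor above describes.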