METHOD FOR DATA MINING THE STATISTICS OF AN INTERNET YELLOW PAGES WEB SERVICE

IP.com Disclosure Number: IPCOM000013561D
Original Publication Date: 2000-Jun-01
Included in the Prior Art Database: 2003-Jun-18
Document File: 3 page(s) / 80K

Publishing Venue

IBM

Abstract

Data Mining an Internet Yellow Pages Web Service

This is the abbreviated version, containing approximately 44% of the total text.

Disclosed is the creation and use of a data mining technique for an internet yellow pages web site. It is described here as a solution for yellow pages, but the method can apply to other industries as well.

A yellow pages publisher, or the owner of an internet yellow pages web service, needs information about the use of the site, often referred to as "statistics", for billing. Typical yellow pages statistics include the number of times a particular category is selected for a search, and the number of times during a specific time period that a specific business listing is accessed, accessed in a specific location, or accessed with a particular category.
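As an illustration only, the statistics listed above amount to simple tallies over usage records. The sketch below assumes such records have already been extracted. The record layout (`ListingAccess`), the function name `count_accesses`, and all sample values are hypothetical and are not part of the disclosed method.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class ListingAccess:
    """One access to a business listing (hypothetical record layout)."""
    when: datetime
    listing: str
    location: str
    category: str


def count_accesses(accesses: Iterable[ListingAccess], listing: str,
                   start: datetime, end: datetime,
                   location: Optional[str] = None,
                   category: Optional[str] = None) -> int:
    """Count accesses to one listing in [start, end), optionally narrowed
    to a single location and/or a single category."""
    return sum(
        1 for a in accesses
        if a.listing == listing
        and start <= a.when < end
        and (location is None or a.location == location)
        and (category is None or a.category == category)
    )


# Made-up sample data, for illustration only.
searched_categories = ["Italian Restaurants", "Plumbers", "Italian Restaurants"]
accesses = [
    ListingAccess(datetime(2000, 6, 1, 9, 0), "Example Trattoria",
                  "Boca Raton", "Italian Restaurants"),
    ListingAccess(datetime(2000, 6, 1, 12, 30), "Example Trattoria",
                  "Boca Raton", "Restaurants"),
    ListingAccess(datetime(2000, 6, 2, 18, 5), "Example Trattoria",
                  "Delray Beach", "Italian Restaurants"),
]
june = (datetime(2000, 6, 1), datetime(2000, 7, 1))

# Number of times each category was selected for a search.
category_counts = Counter(searched_categories)

print(category_counts["Italian Restaurants"])
print(count_accesses(accesses, "Example Trattoria", *june))
print(count_accesses(accesses, "Example Trattoria", *june, location="Boca Raton"))
```

The optional `location` and `category` filters mirror the three variants of the listing statistic named above: accessed at all, accessed in a specific location, and accessed with a particular category.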

Traditional web servers such as Lotus Domino Go Webserver log every incoming request from a web browser into what is referred to here as a web server access log. Each log entry records the component on the web server that was called and the parameters that were passed to it. The web server access log therefore provides a map of all requests that come into the web site, and no special code has to be written in the web service for an entry to appear. By itself, however, it does not contain sufficient information for the detailed statistics required by a yellow pages solution.
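To make the notion of "component called" and "parameters given" concrete, the following minimal sketch pulls both out of a single access log line. It assumes the log is written in the widely used Common Log Format; the disclosure does not state the log format. The component path `/cgi-bin/yp`, the client address, the timestamp, and the function name `component_and_parameters` are hypothetical. The query string is the example request used in this disclosure.

```python
import re
from typing import Optional, Tuple

# Common Log Format (assumed): host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)(?: \S+)?" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def component_and_parameters(line: str) -> Optional[Tuple[str, str]]:
    """Return (component path, raw parameter string) for one log line,
    or None if the line does not match the assumed format."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    # The request target is "path?query"; the query may be empty.
    path, _, query = match.group("target").partition("?")
    return path, query


# Hypothetical log entry; only the query string comes from the disclosure.
sample = ('192.0.2.10 - - [01/Jun/2000:10:15:42 -0400] '
          '"GET /cgi-bin/yp?ID=3100&PARMS=12100940,581201 HTTP/1.0" 200 5120')

print(component_and_parameters(sample))
```

Lines that do not match the assumed format yield `None`, so a mining tool built this way could skip or report them instead of failing.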

Using a web server access log to generate site usage statistics is common practice in the industry today. The presently disclosed method takes advantage of this practice and adds value where common use of the log falls short: it provides the level of detail needed to trace the use of a web service back to the data it serves from the database. The result is information adequate for billing advertisers and for altering content to attract and retain internet users. The method achieves this without degrading the performance or maintainability of the web service. It uses a combination of the techniques described below.

INTERNET REQUEST PARAMETERS

The internet request parameters recorded in the web server access log can be mapped directly to database table fields. For this mapping to work, the internet request parameter values must match those in the relational database table where the actual yellow pages data resides. Additionally, all the internet request parameters follow the same format and convention, so they can be easily dissected by a data mining tool. Each internet request to the web service contains an identifier followed by a series of parameters for that specific identifier. For example, a request to return the yellow pages listings of all the "Italian Restaurants" in the telephone directory of "Boca Raton", Florida, looks like:

"ID=3100&PARMS=12100940,581201"

Where "ID=3100" indicates a request to return listings. The first parameter val...
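Because every request follows the same identifier-plus-parameters convention, a data mining tool can dissect it mechanically. The sketch below is one possible way to do so, not the disclosed implementation. The names `YellowPagesRequest` and `parse_request` are hypothetical, and the parameters are kept as opaque strings in request order, without assigning them any meaning.

```python
from dataclasses import dataclass
from typing import Tuple
from urllib.parse import parse_qs


@dataclass(frozen=True)
class YellowPagesRequest:
    identifier: str              # value of ID, e.g. "3100"
    parameters: Tuple[str, ...]  # values of PARMS, in request order


def parse_request(query: str) -> YellowPagesRequest:
    """Split an 'ID=...&PARMS=a,b,...' string into identifier and parameters."""
    fields = parse_qs(query)
    identifier = fields["ID"][0]  # raises KeyError if ID is missing
    # Assumption: PARMS may be absent for identifiers that take no parameters.
    raw_parms = fields.get("PARMS", [""])[0]
    parameters = tuple(p for p in raw_parms.split(",") if p)
    return YellowPagesRequest(identifier, parameters)


request = parse_request("ID=3100&PARMS=12100940,581201")
print(request.identifier)
print(request.parameters)
```

Keeping the parameters as strings preserves them exactly as logged, which matters because their values must match the values stored in the relational database table.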