Simple Web Indexing Facility

IP.com Disclosure Number: IPCOM000117706D
Original Publication Date: 1996-May-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 47K

Publishing Venue

IBM

Related People

Deri, L: AUTHOR

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 62% of the total text.

      The World Wide Web does not provide a way to specify semantic
content in HTML in order to perform automated queries.  Different
approaches have been tried to overcome this limitation.  One approach
was to enhance HTML with new keywords (1), but this introduced
incompatibilities with existing Web tools.  Another approach was to
create tools that read index files and then generate HTML on the fly
(2).

      A new approach to indexing files is to use HTML comments, which
are enclosed between <! and >.  Relevant information is inserted
inside the comment tags, and web browsers do not display the
comments.  Thus, standard commands such as "grep" and "awk" can be
used to search for relevant information inside the comments only.
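
      As a minimal sketch of this idea (written in Python rather than
with "grep" or "awk"), the following assumes that each comment sits
on a single line and has the form <!text>.  The names comment_fields
and grep_comments, and the sample lines, are hypothetical and used
only for illustration.

```python
import re

# Assumption: a comment is everything between "<!" and the next ">".
COMMENT = re.compile(r"<!(.*?)>")


def comment_fields(line):
    """Return the text of every <!...> comment found on one line."""
    return COMMENT.findall(line)


def grep_comments(lines, keyword):
    """Keep only the lines whose comments (not visible text) contain keyword."""
    return [
        line
        for line in lines
        if any(keyword in field for field in comment_fields(line))
    ]


# Hypothetical sample lines: the second one mentions "router" only in
# its visible text, so a comment-only search should skip it.
sample = [
    '<!router> <A HREF="/a.html">Routers</A>',
    '<!switch> <A HREF="/b.html">router notes</A>',
]
matches = grep_comments(sample, "router")
```

Because only the comment text is inspected, words that appear in the
displayed part of a line do not produce false matches.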

      This solution has been used in the "GDMO/ASN.1 Search Engine",
where off-line compilers generate index files directly in HTML
format.  Every line of the index file is an entry.  Each entry
contains the indexing information enclosed in comment tags; the rest
of the line contains the HTML anchor for the information that
corresponds to the index entry.  The advantage of this solution is
that standard tools can search the information in the index file and
then directly display the search results, which are already in HTML.
An index entry could be, for instance:
  <bb><!ObjectIdentifier!><!system2.9.3.2.3.13!><IMG ALIGN=absbottom
  BORDER=0 SRC=/icons/ball.gif><i> <A
  HREF="/Gdmo/da...
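
      The index scheme described above can be sketched as follows.
This is an illustration under stated assumptions, not the search
engine's actual code: it assumes two comment fields per entry,
written as <!kind!><!name!> as in the sample entry, followed by the
visible anchor.  The functions make_entry and lookup, and all paths
and names in the example data, are hypothetical.

```python
import re
from html import escape

# Assumption: index fields are written as <!text!>, as in the sample entry.
FIELD = re.compile(r"<!(.*?)!>")


def make_entry(kind, name, href, label):
    """Build one index line: two comment fields, then the visible anchor."""
    anchor = f'<A HREF="{escape(href, quote=True)}">{escape(label)}</A>'
    return f"<!{kind}!><!{name}!>{anchor}"


def lookup(index_lines, kind, name_part):
    """Return the displayable HTML of entries whose comment fields match."""
    hits = []
    for line in index_lines:
        fields = FIELD.findall(line)
        if len(fields) >= 2 and fields[0] == kind and name_part in fields[1]:
            # Drop the comment fields; what remains is ready-made HTML.
            hits.append(FIELD.sub("", line))
    return hits


# Hypothetical index, as an off-line compiler might emit it.
index = [
    make_entry("ObjectIdentifier", "exampleObject", "/example/a.html", "exampleObject"),
    make_entry("Attribute", "exampleAttr", "/example/b.html", "exampleAttr"),
]
results = lookup(index, "ObjectIdentifier", "example")
```

Since each hit is already HTML, the result list can be joined and
returned to the browser without any further formatting step.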