Language Definition of a Text Query Language

IP.com Disclosure Number: IPCOM000105801D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 4 page(s) / 225K

Publishing Venue

IBM

Related People

Edwards, DR: AUTHOR [+2]

Abstract

This invention relates to a process for accessing, excerpting, annotating and indexing online documents. Text Query Language (TQL) is an application programming interface (API) for "text-dominated" or unstructured text data bases. In ISO Programming Languages terminology, TQL is a "Call Level Interface" [7] for text access applications retrieving data stored in relational data bases (through the SQLSELECT TQL command) or non-relational text repositories.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 22% of the total text.

      In a TQL data base architecture, the repository of textual and
quantitative data is separated from the TQL search engine and user
interface components of the system, as shown in the figure.
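
      The separation described above can be illustrated with a minimal
Python sketch.  It is not TQL's actual interface: the names
TextRepository, InMemoryRepository, SearchEngine and find are
hypothetical, and plain substring matching stands in for whatever the
real search engine does.  The only point shown is that the engine
reaches the data through a narrow repository interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class TextRepository(ABC):
    """Hypothetical storage interface; the engine knows nothing else about storage."""

    @abstractmethod
    def document_names(self) -> List[str]:
        """Return the names of all stored documents."""

    @abstractmethod
    def read(self, name: str) -> str:
        """Return the full text of one document."""


class InMemoryRepository(TextRepository):
    """Stand-in repository backed by a dict (illustration only)."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self._docs = dict(docs)

    def document_names(self) -> List[str]:
        return sorted(self._docs)

    def read(self, name: str) -> str:
        return self._docs[name]


class SearchEngine:
    """Hypothetical engine that depends only on the TextRepository interface."""

    def __init__(self, repository: TextRepository) -> None:
        self._repository = repository

    def find(self, keyword: str) -> List[Tuple[str, int]]:
        """Return (document name, character offset) for every occurrence of keyword."""
        hits = []
        for name in self._repository.document_names():
            text = self._repository.read(name)
            pos = text.find(keyword)
            while pos != -1:
                hits.append((name, pos))
                pos = text.find(keyword, pos + 1)
        return hits
```

Under these assumptions, a relational store or a flat-file store could be
substituted by supplying another TextRepository subclass, leaving the
engine unchanged.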

      In the course of exploring a text corpus, users employ TQL to
create and retain a structure map (a TQL Index) of the corpus for
personal, commercial or scholarly purposes.  TQL Index elements are
"Keyword Sets", "Categories", "Values" of "Units of Analysis", "Strip
Associations", "Attributes" and their "Ratings" and "Annotations".

      A TQL data base is distinct from relational data base
technology [8] for the following reasons:

o   No single granularization criterion for a text corpus (chapter,
    block, paragraph, sentence, etc.) is inherently suited to
    retrieval needs while also being flexible and efficient.
o   Conversion of existing non-relational text corpora to relational
    format is costly.
o   Chopping a text corpus into chunks for data base fields may
    adversely impact the form of the document as a social artifact
    [6].  Enterprise document libraries based on SGML [5] are
    becoming increasingly common as large organizations attempt to
    standardize their soft copy text data formats.  SGML-tagged
    documents are compatible with TQL when used with suitable tag
    resolution filters.

A TQL Project Statement is a file receptacle for:

1.  names of documents associated with the Project;
2.  names of Keyword Sets, Categories, Units of Analysis and their
    Values;
3.  pointers for assignments of strips to Categories or to Values of
    Units of Analysis;
4.  semantic relationships between Categories;
5.  semantic associations between strips secondary to any
    categorization;
6.  annotations of Category and Unit of Analysis names;
7.  annotations of strip assignments to a Category or a Value of a
    Unit of Analysis;
8.  annotations of strip associations;
9.  Rating Values associated with Category Assignments or Strip
    Associations;
10. TQL Indices and Views on specific documents; and
11. names and locations of TQL Indices linked from other Projects.
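
      As a rough illustration of the contents listed above, the
following Python sketch models a few of the items (1 to 4 and 9) as a
plain data structure.  Every name in it (Strip, ProjectStatement,
assign, the field names) is hypothetical, as is the choice to represent
a strip as a document name plus character offsets and a Rating Value as
an integer; the disclosure does not specify the file layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Strip:
    """Hypothetical pointer to a span of text inside one document."""
    document: str
    start: int  # character offset, inclusive (assumed convention)
    end: int    # character offset, exclusive (assumed convention)


@dataclass
class ProjectStatement:
    """Illustrative container for items 1-4 and 9 of the list above."""
    documents: List[str] = field(default_factory=list)                 # item 1
    keyword_sets: Dict[str, Set[str]] = field(default_factory=dict)    # item 2
    categories: Set[str] = field(default_factory=set)                  # item 2
    unit_values: Dict[str, Set[str]] = field(default_factory=dict)     # item 2: unit -> values
    assignments: Dict[str, List[Strip]] = field(default_factory=dict)  # item 3: category -> strips
    category_links: List[Tuple[str, str, str]] = field(default_factory=list)  # item 4
    ratings: Dict[Tuple[str, Strip], int] = field(default_factory=dict)       # item 9

    def assign(self, strip: Strip, category: str, rating: Optional[int] = None) -> None:
        """Record a pointer from a strip to a Category, with an optional rating."""
        if strip.document not in self.documents:
            raise ValueError("strip refers to a document outside the Project")
        if category not in self.categories:
            raise KeyError("unknown Category: " + category)
        self.assignments.setdefault(category, []).append(strip)
        if rating is not None:
            self.ratings[(category, strip)] = rating
```

In this sketch the documents themselves are never copied or split; the
structure holds only names and pointers, which is consistent with the
list describing the Project Statement as holding names and pointers.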

      A TQL View Statement restricts reference to a subset of the
Project documents, particular column ranges, or any arbitrarily
defined subset of the Project documents.  Creating a View is one way
of retaining a query result.
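
      The restriction a View imposes can be sketched as follows.  This
is an assumption-laden illustration and not TQL syntax: the View class,
the visible_text function, the dict-of-lines corpus layout and the
1-based inclusive column convention are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class View:
    """Hypothetical View: a document subset plus an optional column range."""
    documents: FrozenSet[str]
    columns: Optional[Tuple[int, int]] = None  # 1-based, inclusive (assumed)


def visible_text(view: View, corpus: Dict[str, List[str]]) -> Iterator[Tuple[str, int, str]]:
    """Yield (document, line number, text) for only what the View exposes."""
    for name in sorted(view.documents):
        for line_no, line in enumerate(corpus.get(name, []), start=1):
            if view.columns is not None:
                first, last = view.columns
                line = line[first - 1:last]  # keep only the selected columns
            yield name, line_no, line
```

Because a View object of this kind is just a small, storable
description of a subset, keeping it is one plausible way to retain a
query result without copying the underlying text.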

      A TQL Index Statement is not a rel...