Using a pattern-matching query language to locate a row, cell, or other control in a spreadsheet
Publication Date: 2010-Dec-16
The IP.com Prior Art Database
Tabular data can be found in media such as spreadsheets, web pages, and databases. Of these, however, only databases store the computer-readable metadata needed to locate information semantically. This article describes a technique for determining this metadata from context. This context includes the structure and presentation of the data, and the structure of any queries made on that data. For example, consider a spreadsheet that contains multiple tables. From a query, one could determine the location and extent of the table likely to contain the data of interest, as well as the location of the data itself within that table.
Consider a spreadsheet containing data in multiple tables, as in figure 1. Since a spreadsheet is a formatted, human-readable document, these tables may be positioned purely for visual or aesthetic reasons. As far as the data itself is concerned, the positioning of the tables is arbitrary. It is obvious to a human where the tables are, but not to a computer.
Figure 1: A spreadsheet containing multiple tables
Ideally, we want to run queries such as the following (written in pseudocode) against the tables in figure 1:
Find the Surname where Party is 'Tory' --> 'Perceval'
Find the Surname where Party is 'Tory' and Constituency is 'Finchley' --> 'Thatcher'
Find the Surname where End Date is '' --> 'Brown'
Set the End Date to '7 May 2010' where Surname is 'Brown'
This can be done if we teach a computer to glean the right information from context.
We first determine what is already known about the table from the query itself. In our first query, we know that we are looking for columns called Surname and Party. We can then search through the rows in order, looking for candidate header rows that contain these labels.
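As an illustration of how the known labels might be read off a query, here is a minimal Python sketch. It assumes a query grammar just large enough to cover the four pseudocode examples above; the function names parse_query and labels_needed, and the regular expressions, are hypothetical and not part of the technique as described.

```python
import re
from typing import Dict, List, Optional

# Hypothetical grammar covering only the pseudocode examples:
#   Find the <target> where <label> is '<value>' [and <label> is '<value>' ...]
#   Set the <target> to '<new value>' where <label> is '<value>' [...]
QUERY_RE = re.compile(r"^(Find|Set) the (.+?)(?: to '(.*?)')? where (.+)$")
CONDITION_RE = re.compile(r"(.+?) is '(.*?)'(?: and |$)")


def parse_query(query: str) -> Dict[str, object]:
    """Split a pseudocode query into its target label, optional new value, and conditions."""
    match = QUERY_RE.match(query.strip())
    if match is None:
        raise ValueError(f"Unrecognised query: {query!r}")
    _verb, target, new_value, where_text = match.groups()
    where: Dict[str, str] = {
        label.strip(): value for label, value in CONDITION_RE.findall(where_text)
    }
    return {"target": target.strip(), "new_value": new_value, "where": where}


def labels_needed(query: str) -> List[str]:
    """List every column (or row) label the query mentions, target first."""
    parsed = parse_query(query)
    labels = [parsed["target"]]
    labels += [label for label in parsed["where"] if label not in labels]
    return labels


print(labels_needed("Find the Surname where Party is 'Tory'"))
# ['Surname', 'Party']
print(labels_needed("Set the End Date to '7 May 2010' where Surname is 'Brown'"))
# ['End Date', 'Surname']
```

The labels returned here are the ones a header row must contain to be considered a candidate.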
Note that, by transposition, everything we discuss about columns can be applied to rows (and vice versa), so we can also search for row titles in a header column.
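The header search and the transposition trick can be sketched in Python as follows. This is only a sketch under stated assumptions: the sheet is modelled as a list of rows of strings, labels are matched case-insensitively after trimming whitespace, and the names find_header_rows and transpose, as well as the sample grids, are invented for illustration (they are not the contents of figure 1).

```python
from typing import Dict, List, Tuple

Grid = List[List[str]]


def find_header_rows(grid: Grid, labels: List[str]) -> List[Tuple[int, Dict[str, int]]]:
    """Return (row index, {label: column index}) for each row containing every label."""
    wanted = [label.strip().lower() for label in labels]
    candidates = []
    for r, row in enumerate(grid):
        cells = [str(cell).strip().lower() for cell in row]
        if all(w in cells for w in wanted):
            # Record where each label sits so later steps know which column to read.
            positions = {label: cells.index(w) for label, w in zip(labels, wanted)}
            candidates.append((r, positions))
    return candidates


def transpose(grid: Grid) -> Grid:
    """Swap rows and columns, padding short rows with empty cells first."""
    width = max((len(row) for row in grid), default=0)
    padded = [list(row) + [""] * (width - len(row)) for row in grid]
    return [list(column) for column in zip(*padded)]


# Invented sample: two small tables placed arbitrarily on one sheet.
sheet = [
    ["", "", ""],
    ["", "Surname", "Party"],
    ["", "Perceval", "Tory"],
    ["", "", ""],
    ["", "Item", "Price"],
    ["", "Tea", "2"],
]
print(find_header_rows(sheet, ["Surname", "Party"]))
# [(1, {'Surname': 1, 'Party': 2})]

# Invented sample with row titles in a header column instead of a header row.
sideways = [
    ["Surname", "Perceval", "Thatcher"],
    ["Party", "Tory", "Tory"],
]
print(find_header_rows(transpose(sideways), ["Surname", "Party"]))
# [(0, {'Surname': 0, 'Party': 1})]
```

In the transposed case the returned index 0 identifies the header column of the original sheet, and the positions are row indices rather than column indices.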
Next, we can query the rows (or columns) in each potential 'table' we have discovered...
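One way querying a candidate table might look is sketched below in Python. It assumes a header row and the column positions of the relevant labels have already been located, and that the table ends at the first row whose relevant cells are all blank; that stopping rule, the function names, and the sample grid are assumptions made for illustration, not details taken from the text or from figure 1.

```python
from typing import Dict, Iterator, List

Grid = List[List[str]]


def matching_rows(grid: Grid, header_row: int, columns: Dict[str, int],
                  where: Dict[str, str]) -> Iterator[int]:
    """Yield indices of rows below the header that satisfy every condition."""
    for r in range(header_row + 1, len(grid)):
        row = grid[r]
        # Assumption: a row with all known columns blank marks the end of the table.
        if all(row[c].strip() == "" for c in columns.values()):
            break
        if all(row[columns[label]].strip() == value for label, value in where.items()):
            yield r


def find(grid: Grid, header_row: int, columns: Dict[str, int],
         target: str, where: Dict[str, str]) -> List[str]:
    """Return the target-column value of every matching row."""
    return [grid[r][columns[target]]
            for r in matching_rows(grid, header_row, columns, where)]


def update(grid: Grid, header_row: int, columns: Dict[str, int],
           target: str, value: str, where: Dict[str, str]) -> int:
    """Write value into the target column of every matching row; return the row count."""
    rows = list(matching_rows(grid, header_row, columns, where))
    for r in rows:
        grid[r][columns[target]] = value
    return len(rows)


# Sample grid made up for illustration; the real figure may differ.
sheet = [
    ["Surname", "Party", "Constituency", "End Date"],
    ["Perceval", "Tory", "Northampton", "1812"],
    ["Thatcher", "Tory", "Finchley", "1990"],
    ["Brown", "Labour", "Kirkcaldy and Cowdenbeath", ""],
    ["", "", "", ""],
    ["Unrelated", "notes", "", ""],
]
columns = {"Surname": 0, "Party": 1, "Constituency": 2, "End Date": 3}

print(find(sheet, 0, columns, "Surname", {"Party": "Tory"}))
# ['Perceval', 'Thatcher']
print(find(sheet, 0, columns, "Surname", {"Party": "Tory", "Constituency": "Finchley"}))
# ['Thatcher']
print(find(sheet, 0, columns, "Surname", {"End Date": ""}))
# ['Brown']
print(update(sheet, 0, columns, "End Date", "7 May 2010", {"Surname": "Brown"}))
# 1
```

This sketch returns every matching row; a query that expects a single answer, such as the first pseudocode example, could simply take the first element.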