System and Method for Accelerating Spatial Operations
Publication Date: 2015-Feb-12
The IP.com Prior Art Database
This article describes an approach for storing geometries in tables of a relational database system when the size of individual geometry values exceeds the maximum column size of the available data types. The idea is to split the geometries at the binary level, ignoring the spatial and topological properties of the values. The split values are stored in a separate table as a sequence of rows. Subsequent processing of the original geometries requires recomposition, which is achieved transparently and encapsulated by a view.
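The split-and-recompose idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in this article: the chunk limit `MAX_CHUNK_BYTES`, the function names `split_geometry` and `recompose`, and the row layout `(geom_id, seq, chunk)` are hypothetical, and a plain function stands in for the view that performs the recomposition in the database.

```python
from typing import List, Tuple

# Hypothetical limit: the largest binary value a single column may hold.
MAX_CHUNK_BYTES = 32_000

Row = Tuple[int, int, bytes]  # (geom_id, seq, chunk), an assumed row layout


def split_geometry(geom_id: int, blob: bytes,
                   max_bytes: int = MAX_CHUNK_BYTES) -> List[Row]:
    """Cut a binary geometry into rows, ignoring its spatial structure."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return [
        (geom_id, seq, blob[offset:offset + max_bytes])
        for seq, offset in enumerate(range(0, len(blob), max_bytes))
    ]


def recompose(rows: List[Row]) -> bytes:
    """Rebuild the original value by concatenating chunks in sequence order."""
    return b"".join(chunk for _, _, chunk in sorted(rows, key=lambda r: r[1]))


# Stand-in for a large geometry value: 102,400 arbitrary bytes.
blob = bytes(range(256)) * 400
rows = split_geometry(1, blob)
print(len(rows), [len(chunk) for _, _, chunk in rows])
print(recompose(rows) == blob)
```

Because the cut points are chosen by byte offset alone, an individual chunk is not a valid geometry on its own; only the concatenation in sequence order is meaningful.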
With the advent of Big Data, managing the data volume in enterprise data warehouses (DWH) with very good performance has become a major factor in DWH design. Various approaches are in use today, and often a combination of the following techniques is employed:
• exploiting accelerators to seamlessly combine a system specialized for transactional workload with one for analytical workload,
• in-memory database systems,
• different organization for the physical storage of table data (e.g. column stores).
The specialization of relational database systems for analytical workloads using these techniques usually comes with restrictions on the supported data types and functions, which follow from design decisions made during implementation. For example, XML documents and large objects (BLOBs, CLOBs) are often not supported because of the complexity and size of their values: they require storage mechanisms that are fundamentally different from, e.g., column stores, or the data volume simply does not fit in memory. The IBM DB2 Analytics Accelerator (IDAA), which uses Netezza as its analytics-optimized processing backend, limits the combined size of all values in a table row to about 64K. The specific reason is that each row must fit on a single data page, and the page size is limited to 64K. IDAA does not support XML, BLOB, CLOB, or spatial data types. The limitation on data types brings a corresponding limitation of functionality: if the system cannot store spatial data values, there is no point in supporting spatial functionality such as testing whether two polygons overlap in space or whether two line strings (e.g. representing roads) cross each other.
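To give a feeling for how quickly a geometry reaches such a row limit, the following sketch estimates the binary size of a line string. It rests on assumptions that this article does not state: the geometry is encoded as 2D Well-Known Binary (WKB), "about 64K" is read as 65,536 bytes, and other columns and per-row overhead are ignored. The names `linestring_wkb` and `max_points` are hypothetical.

```python
import struct

# Assumption: "about 64K" is taken as 65,536 bytes, with no other columns.
PAGE_LIMIT_BYTES = 64 * 1024

# 2D WKB LineString layout: 1-byte byte order, uint32 geometry type,
# uint32 point count, then two 8-byte doubles (x, y) per point.
WKB_HEADER_BYTES = 1 + 4 + 4
WKB_POINT_BYTES = 2 * 8


def linestring_wkb(points):
    """Encode 2D points as a little-endian WKB LineString (type code 2)."""
    out = struct.pack("<BII", 1, 2, len(points))
    for x, y in points:
        out += struct.pack("<dd", x, y)
    return out


def max_points(limit=PAGE_LIMIT_BYTES):
    """Largest point count whose WKB encoding still fits within the limit."""
    return (limit - WKB_HEADER_BYTES) // WKB_POINT_BYTES


# A made-up road with 5,000 vertices.
road = [(float(i), float(i) * 0.5) for i in range(5000)]
encoded = linestring_wkb(road)
print(len(encoded), len(encoded) > PAGE_LIMIT_BYTES, max_points())
```

Under these assumptions a line string with slightly more than four thousand vertices already exceeds the limit, which is well within the range of detailed road or boundary data.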
Nevertheless, users of database systems expect support for spatial data in analytical workloads. It is therefore important for vendors to provide such support without compromising the architecture of the specialized systems, so that performance is not sacrificed.
Some systems have invented ways to deal with spatial data that does not fit within their limitations (64K page size). The approach breaks down geometries that are too complex (overly long linestrings and multi-linestrings, or overly large polygons and multi-polygons) into multiple sub-geometries. These sub-geometries are consistent in themselves and are stored in a "side table". A 1:n relationship with the "original user table" is established so that it is known which sub-geometry belongs to which row in the original user table. The following schema definition illustrates this:
-- user table
CREATE TABLE country (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
abbreviation VARCHAR(3) NOT NULL,
-- other attributes
);
-- side table
CREATE TABLE country_shape (
country_id INTEGER NOT NULL FOREIGN KEY REFERENCES
A multi-polygon describing the exact ar...