Automatic Discovery of Spatiotemporal Relationships within Labeled Data

IP.com Disclosure Number: IPCOM000233858D
Publication Date: 2013-Dec-24
Document File: 3 page(s) / 24K

Publishing Venue

The IP.com Prior Art Database

Related People

Bart Thomee: INVENTOR [+3]

Abstract

A method for automatically discovering spatiotemporal relationships within labeled data is disclosed. The relationships include, but are not limited to, equivalence, containment, adjacency, and periodicity. The method analyzes similarities between the spatiotemporal footprints of labels. A label's spatiotemporal footprint may be defined as the minimum area occupied by the label, described by its spatial boundary, the time series of occurrences within that boundary, and the data density within that boundary.

This is the abbreviated version, containing approximately 43% of the total text.

Description

Disclosed is a method for automatically discovering spatiotemporal relationships within labeled data. The relationships include, but are not limited to, equivalence, containment, adjacency, and periodicity. The method analyzes similarities between the spatiotemporal footprints of labels using region-based and distribution-based data. A label's spatiotemporal footprint may be defined as the minimum area occupied by the label, described by its spatial boundary, the time series of occurrences within that boundary, and the data density within that boundary. A label is any token that includes a spatial value, a temporal value, or both. The method identifies the geographical extent and temporal period associated with two or more labels based on the relationship between those labels.
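The three parts of a footprint named above (spatial boundary, time series of occurrences, data density) can be pictured as a small record. The following is only an illustrative sketch: the disclosure does not specify a representation, so the `Footprint` class, the `build_footprint` helper, the use of an axis-aligned bounding box as the boundary, the per-day counting, and the sample points are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Footprint:
    """Hypothetical container for one label's spatiotemporal footprint."""
    bbox: tuple          # (min_lat, min_lon, max_lat, max_lon); assumed boundary shape
    daily_counts: dict   # date -> number of occurrences inside the boundary
    density: float       # occurrences per square degree of the bounding box


def build_footprint(occurrences):
    """Build a footprint from (lat, lon, date) occurrences of a single label."""
    lats = [lat for lat, _, _ in occurrences]
    lons = [lon for _, lon, _ in occurrences]
    bbox = (min(lats), min(lons), max(lats), max(lons))
    # Time series of occurrences: here simply a count per calendar day.
    daily_counts = dict(Counter(day for _, _, day in occurrences))
    # Guard against a zero-area box when the points share a latitude or longitude.
    area = max((bbox[2] - bbox[0]) * (bbox[3] - bbox[1]), 1e-12)
    return Footprint(bbox, daily_counts, len(occurrences) / area)


# Made-up occurrences of one label, for illustration only.
points = [
    (48.0, 2.0, date(2013, 7, 14)),
    (49.0, 3.0, date(2013, 7, 14)),
    (48.5, 2.5, date(2013, 7, 15)),
]
fp = build_footprint(points)
```

A bounding box is the crudest possible boundary; a convex hull or the clustered regions described in the next paragraph would follow the data more closely.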

In accordance with the method and system, geographically and temporally relevant regions associated with each label are generated first. Thereafter, the distribution of label occurrences around the world is modeled. The modeling may be performed by clustering the different areas where data associated with a given label occurs and then modeling the data within each clustered area. For example, mean-shift, k-means, or Density-Based Spatial Clustering of Applications with Noise (DBSCAN) may be used to separate the data into disjoint clusters, whereas kernel density estimation or Gaussian mixture decomposition may be used to model the data clusters. Further, the global bivariate distribution of each label is determined as a mixture of Gaussians. The global bivariate distribution is decomposed into its Gaussian constituents until only a small residual quantity of noisy or outlier data instances is left. This condenses the data into a representation made of a small set of Gaussian distributions and substantially decreases the memory required for footprint representation and comparison. Whereas Gaussian mixture decomposition is typically used to separate different data sources from each other, here it effectively compresses the occurrences of millions of data points into a handful of Gaussian components, each represented by a mean vector and a covariance matrix.
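The cluster-then-model pipeline described above can be sketched in a few lines. This is a simplified illustration, not the disclosed implementation: it uses a minimal pure-Python DBSCAN and then summarizes each cluster with a single Gaussian (mean vector plus covariance matrix) instead of performing the iterative mixture decomposition the text describes. The function names, the `eps` and `min_pts` values, the treatment of coordinates as planar, and the sample points are all assumptions.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster id per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels


def gaussian_summary(cluster_points):
    """Mean vector and 2x2 covariance matrix (population form) of 2-D points."""
    n = len(cluster_points)
    mx = sum(p[0] for p in cluster_points) / n
    my = sum(p[1] for p in cluster_points) / n
    sxx = sum((p[0] - mx) ** 2 for p in cluster_points) / n
    syy = sum((p[1] - my) ** 2 for p in cluster_points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in cluster_points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def compress(points, eps, min_pts):
    """Cluster the points, drop noise, and keep one Gaussian per cluster."""
    labels = dbscan(points, eps, min_pts)
    clusters = {}
    for p, c in zip(points, labels):
        if c != -1:
            clusters.setdefault(c, []).append(p)
    return [gaussian_summary(pts) for pts in clusters.values()]


# Two made-up dense areas plus one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
summaries = compress(points, eps=1.5, min_pts=3)
```

In this toy run nine points reduce to two (mean, covariance) pairs and the outlier is discarded as residual noise, which is the kind of memory saving the paragraph describes. The neighbor search here is quadratic in the number of points, so a real data set of millions of occurrences would need a spatial index or a library implementation.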

The generated regions and distribution of label occurrences enable defining spatiotemporal...