A Method and Apparatus for Exploration of Hyper dimensional Data

IP.com Disclosure Number: IPCOM000239795D
Publication Date: 2014-Dec-02
Document File: 8 page(s) / 631K

Publishing Venue

The IP.com Prior Art Database

Abstract

A system, method, and tool for exploration of hyper dimensional data is disclosed.

This is the abbreviated version, containing approximately 24% of the total text.

Disclosed is a system, method, and tool for exploration of hyper dimensional data.

The disclosed tool enables users to explore high dimensional data by combining a multidimensional scaling display with a parallel coordinates display of the same data. It allows users to derive insight into the clustering and outlier behavior of the provided high dimensional data points. Users can explore the characteristics of subsets of the data by using brushing tools on the linked data in either display. They can also change which dimensions are in force and observe an animation of how these changes affect the proximity of data points to one another.
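The interaction described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the sample rows, the function names `brush` and `dissimilarity`, and the use of Euclidean distance are all assumptions made for illustration. A brush selects the rows whose value on one axis falls within a range; because both displays show the same rows, the same set of row indices can be highlighted in either one. Changing the set of active dimensions changes the dissimilarity between rows, which is the effect an animation would reveal.

```python
import math

# Hypothetical sample data: each row is a data item, each column a dimension.
ROWS = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 50.0],
    [5.0, 6.0, 11.0],
]

def brush(rows, dim, low, high):
    """Return the indices of rows whose value in column `dim` lies in [low, high]."""
    return {i for i, row in enumerate(rows) if low <= row[dim] <= high}

def dissimilarity(a, b, active_dims):
    """Euclidean distance between two rows over the active dimensions only (an assumed metric)."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in active_dims))

# Brushing the third axis between 5 and 20 selects rows 0 and 2; a linked
# display would highlight the same two indices.
selected = brush(ROWS, dim=2, low=5.0, high=20.0)

# With all dimensions active, rows 0 and 1 are far apart because of column 2.
d_all = dissimilarity(ROWS[0], ROWS[1], active_dims=[0, 1, 2])

# With column 2 switched off, the same two rows become near neighbors.
d_reduced = dissimilarity(ROWS[0], ROWS[1], active_dims=[0, 1])
```

In this toy data, row 1 looks like an outlier only while the third dimension is in force, which is the kind of change the linked displays are meant to expose.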

Multiple dimensions are hard to understand and impossible to visualize directly. Because the number of possible values grows exponentially with each added dimension, complete enumeration of all subspaces becomes intractable as dimensionality increases. Nevertheless, high dimensional data is quite prevalent and must be dealt with. High dimensional data comes up in every situation that can be represented by a spreadsheet with rows for each data item and multiple columns representing the attributes of those items. Discovering patterns or regularities in the data, clusters of similar data items, and outliers that differ from all the other data is important for understanding the information provided. These discoveries help in understanding what the data means, and making them is a common analysis task in business, medicine, and science. Over the years, a variety of solutions have been put forth to deal with high dimensional data. Many existing solutions support exploration of the data in only a small selected subset of the available dimensions. One approach, scatter plot matrices, actually produces visualizations of all possible pairs of dimensions. As the number of dimensions grows, however, these approaches become impractical.
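The scaling problems mentioned above can be made concrete with a little arithmetic. The sketch below is illustrative only: the helper names and the choice of 10 bins per dimension are assumptions. A scatter plot matrix needs one panel per unordered pair of dimensions, which is d(d-1)/2 for d dimensions, while a grid that splits each dimension into a fixed number of bins has bins^d cells.

```python
import math

def scatter_plot_pairs(num_dims):
    """Number of distinct pairs of dimensions, one scatter plot per pair."""
    return math.comb(num_dims, 2)

def grid_cells(num_dims, bins_per_dim=10):
    """Number of cells when every dimension is split into `bins_per_dim` bins (assumed value)."""
    return bins_per_dim ** num_dims

# Print how both counts grow with the number of dimensions.
for d in (3, 10, 50):
    print(f"{d} dimensions: {scatter_plot_pairs(d)} scatter plots, {grid_cells(d)} grid cells")
```

Ten dimensions already require 45 scatter plots, and the pair count grows quadratically while the cell count grows exponentially.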

Other approaches, such as radar plots or star coordinates, attempt to project high dimensional data down to two or three dimensions. These also become impractical as the number of dimensions grows large. A few techniques, such as heat maps, parallel coordinate displays, and dimensional stacks, find ways of representing high dimensional data in two dimensions. However, these approaches tend to obscure the occurrence of clusters and outliers. One promising approach, multidimensional scaling, maps points in a high dimensional space into a two or three dimensional space by computing the dissimilarities between points in the high dimensional space and treating these as distances in the lower dimensional space. The multidimensional scaling approach attempts to lay out the points with the same distance relationships to each other as they had in the high dimensional space to the extent pos...
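The idea of treating high dimensional dissimilarities as target distances in a two dimensional layout can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the algorithm of the disclosure, which does not name one here: it assumes Euclidean dissimilarities, measures the mismatch with a raw stress function (sum of squared differences), and reduces it by plain gradient descent. The sample points, function names, step count, and learning rate are hypothetical.

```python
import math
import random

# Hypothetical data: two tight groups of points in four dimensions.
POINTS = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 6.0, 5.0, 5.0],
]

def dissimilarities(points):
    """Matrix of Euclidean distances between all pairs of high dimensional points."""
    return [[math.dist(p, q) for q in points] for p in points]

def stress(layout, target):
    """Sum of squared differences between layout distances and target dissimilarities."""
    n = len(layout)
    return sum((math.dist(layout[i], layout[j]) - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def mds_layout(target, steps=500, rate=0.02, seed=0):
    """Start from a seeded random 2D layout and reduce stress by gradient descent."""
    rng = random.Random(seed)
    n = len(target)
    layout = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        grads = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                # Guard against division by zero for coincident points.
                d = math.dist(layout[i], layout[j]) or 1e-9
                # Gradient of (d - target)^2 with respect to point i.
                coeff = 2.0 * (d - target[i][j]) / d
                for axis in range(2):
                    grads[i][axis] += coeff * (layout[i][axis] - layout[j][axis])
        for i in range(n):
            for axis in range(2):
                layout[i][axis] -= rate * grads[i][axis]
    return layout

target = dissimilarities(POINTS)
start = mds_layout(target, steps=0)   # the random starting layout
final = mds_layout(target)            # the layout after stress reduction
```

Comparing `stress(start, target)` with `stress(final, target)` shows how much closer the two dimensional distances come to the original dissimilarities. A perfect match is generally not possible, which is why the layout preserves distance relationships only to the extent it can.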