Combine Operations On Unstructured And Semi Structured Data In Big Data Environments
Disclosure Number: IPCOM000236405D
Publication Date: 2014-Apr-24
Document File: 2 page(s) / 39K

Publishing Venue

The Prior Art Database


Disclosed is a solution for managing and analyzing Big Data. The system lets customers combine structured data with massive volumes of unstructured data by bringing in data from different data sources and displaying the results in an intuitive user interface.

This is the abbreviated version, containing approximately 51% of the total text.

Combine Operations On Unstructured And Semi Structured Data In Big Data Environments

Disclosed herein is a method for combining structured and unstructured data. It allows customers to dynamically associate data types with data that resides in a Distributed File System (DFS) and to bring in structured data and data from multiple data sources.

Big Data is data held by companies and organizations in great volume, wide variety, and high velocity. To maximize the value of the stored data, methods of analysis are applied to Big Data to uncover patterns, correlations, and other useful information that can reveal trends and insights and give a company a competitive advantage in the market. This analysis has created the need for a new class of capabilities that augment current methods, providing a better line of sight into, and control over, existing knowledge domains, along with the ability to act on that knowledge.

The variety of data presents particular challenges. For example, with the explosion of sensors, smart devices, and social media websites, enterprise data is increasingly complex. The analytical capabilities available for traditional relational data must also be applied to data in raw, semi-structured, and unstructured formats. Current systems are limited in their ability to perform analytics over such a variety of data.

Relational database systems allow join operations on structured data defined in database tables, but this approach does not scale to environments with large data volumes and diverse data varieties.

Embodiments of the present invention provide a way to mine information from terabytes of data that is mostly unstructured and combine it with structured data based on business rules and needs. This facilitates combination operations over unstructured, semi-structured, and structured data through an intuitive user interface (UI).

In one embodiment of the present invention, to facilitate the combining process, the method brings in data from different data sources and then displays the result in the UI. The user does not need to write any code to perform this analysis...
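The disclosure does not give implementation details, and the user is not expected to write code. Purely as an illustration of what a combine operation of this kind might do behind the UI, the following minimal sketch associates data types with raw, semi-structured lines at read time and then joins them with structured records on a shared key. Every name in it (RAW_LINES, CUSTOMERS_CSV, FIELD_TYPES, read_typed, combine) and all sample data are hypothetical and are not taken from the disclosure. In-memory strings stand in for DFS files and database tables.

```python
import csv
import io
import json

# Hypothetical raw, semi-structured lines as they might sit in a DFS file.
RAW_LINES = [
    '{"customer_id": "17", "text": "love the new phone", "ts": "2014-04-01"}',
    '{"customer_id": "42", "text": "battery drains fast"}',
    'not valid json',
]

# Hypothetical structured data, for example exported from a relational table.
CUSTOMERS_CSV = "customer_id,name,segment\n17,Ada,premium\n42,Lin,basic\n"

# Assumption: types are associated with fields when the data is read,
# not when it is stored.
FIELD_TYPES = {"customer_id": int, "text": str}


def read_typed(lines, field_types):
    """Parse each raw line and cast the listed fields; skip lines that do not fit."""
    for line in lines:
        try:
            record = json.loads(line)
            yield {name: cast(record[name]) for name, cast in field_types.items()}
        except (ValueError, KeyError, TypeError):
            continue  # malformed or incomplete line


def combine(typed_records, customers_csv, key="customer_id"):
    """Hash join: index the small structured side, stream the larger side."""
    reader = csv.DictReader(io.StringIO(customers_csv))
    index = {int(row[key]): row for row in reader}
    for record in typed_records:
        match = index.get(record[key])
        if match is not None:
            yield {**record, "name": match["name"], "segment": match["segment"]}


combined = list(combine(read_typed(RAW_LINES, FIELD_TYPES), CUSTOMERS_CSV))
for row in combined:
    print(row)
```

Indexing the structured side and streaming the unstructured side keeps memory use proportional to the smaller input. This is one common way to approach such a join, not necessarily the one the disclosed system uses.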