Platform for capturing and processing multilevel (50+ levels, 1000+ nodes at each level) hierarchical and voluminous information using Hadoop clusters & NoSQL Databases

IP.com Disclosure Number: IPCOM000244730D
Publication Date: 2016-Jan-06
Document File: 8 page(s) / 85K

Publishing Venue

The IP.com Prior Art Database

Abstract

Complex hierarchies exist across domains. If such hierarchical data can be captured with quality, timeliness, high reliability and ease, so that entities at every node level can participate in the data capture, it will help make policy-level decisions precise and accurate (e.g. decisions on crop production in a given month, import/export, health and disease). This disclosure describes a way to define large-scale complex hierarchies, capture data at each node in the hierarchy, and store and process that data.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 24% of the total text.

Disclosed is a framework for defining complex, large-scale hierarchies, collecting a variety of voluminous data from the lowest entities in those hierarchies, and processing it to obtain insights and patterns across the domain for a given hierarchy.

Consider the cases below, where a huge number of entities (a million or more) are structured along multiple levels of a given hierarchy.

1. Agricultural produce cultivation reported at the village, town, region, district and state levels by people and/or sensors, IVR, SMS or smart devices over a given period of time.

2. Various surveys related to poverty, animals, government schemes and departments.

3. Sensors networked in multilevel hierarchies.

4. Accurate, near-real-time rainfall information for every village, aggregated at the town, district, state and country levels.

Enabling data capture, storage and processing for such complex hierarchies is a mammoth and complex task. A platform that does so would help government agencies, or any other agencies, use such data to manage and define policy decisions (e.g. import/export decisions) with better precision and accuracy.

Data capture at the lower levels is a daunting and error-prone task. Simplifying data capture by means of mobile phones (smartphones and older phones), mobile applications, web applications, IVR, email and sensors enables every entity at every node in the hierarchy to participate in the data capture.

Processing such data mostly involves aggregating it (e.g. SUM, COUNT) at each node and level in the hierarchy. Map-Reduce jobs are an excellent fit for such aggregation.
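
As a minimal sketch of this kind of aggregation, the Python below simulates the map and reduce steps locally rather than on a Hadoop cluster. The node names, the PARENT table and the sample readings are illustrative assumptions and are not taken from the disclosure. The map step emits each reading once for the reporting node and once for each of its ancestors, and the reduce step computes SUM and COUNT per node.

```python
from collections import defaultdict

# Hypothetical child -> parent links (village -> town -> district).
PARENT = {
    "village_a": "town_x",
    "village_b": "town_x",
    "village_c": "town_y",
    "town_x": "district_1",
    "town_y": "district_1",
}


def map_record(node, value):
    """Map step: emit (node, value) for the reporting node and every ancestor."""
    current = node
    while current is not None:
        yield current, value
        current = PARENT.get(current)


def reduce_values(values):
    """Reduce step: compute SUM and COUNT for one node."""
    values = list(values)
    return {"sum": sum(values), "count": len(values)}


def run_job(records):
    """Simulate the shuffle locally: group mapped pairs by key, then reduce."""
    grouped = defaultdict(list)
    for node, value in records:
        for key, emitted in map_record(node, value):
            grouped[key].append(emitted)
    return {key: reduce_values(vals) for key, vals in grouped.items()}


# Illustrative readings (for example, tonnes of a crop reported per village).
records = [("village_a", 10), ("village_b", 5), ("village_c", 7), ("village_a", 3)]
totals = run_job(records)
```

Emitting one pair per ancestor keeps the reduce step trivial, at the cost of more intermediate data as the hierarchy gets deeper.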

The platform disclosed here provides the details to build a comprehensive and reusable framework to capture, store and process data at each level in the multilevel (50+ levels) hierarchy.

The platform comprises HDFS (Hadoop Distributed File System) to store the captured data, NoSQL databases to store the hierarchies and related configurations, and Map-Reduce jobs to perform the aggregation computations.

HDFS is a distributed file system that provides distributed data storage; the stored data is processed using the Map-Reduce programming model.

NoSQL databases such as MongoDB are used to store the hierarchy definitions. A NoSQL database allows multiple hierarchy definitions to be stored. It is also used to store control information, such as which hierarchies have been computed at what stage.
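
One possible shape for such documents is sketched below using plain Python dictionaries, so the example runs without a database. Every field name (levels, nodes, parent, computed_levels) and the bottom-up processing order are assumptions made for illustration; the disclosure does not specify a schema.

```python
# Hypothetical hierarchy definition, shaped like a document that a store such
# as MongoDB could hold. All field names are assumptions.
hierarchy_doc = {
    "_id": "crop_reporting",
    "levels": ["state", "district", "town", "village"],
    "nodes": [
        {"id": "state_1", "level": "state", "parent": None},
        {"id": "district_1", "level": "district", "parent": "state_1"},
        {"id": "town_x", "level": "town", "parent": "district_1"},
        {"id": "village_a", "level": "village", "parent": "town_x"},
        {"id": "village_b", "level": "village", "parent": "town_x"},
    ],
}

# Hypothetical control document: which levels have been computed so far.
control_doc = {"hierarchy_id": "crop_reporting", "computed_levels": []}


def ancestors(doc, node_id):
    """Return the ancestor ids from the node's parent up to the root."""
    parent_of = {n["id"]: n["parent"] for n in doc["nodes"]}
    chain = []
    current = parent_of[node_id]
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    return chain


def mark_computed(control, level):
    """Record that aggregation for one level has finished."""
    if level not in control["computed_levels"]:
        control["computed_levels"].append(level)


def next_level_to_compute(doc, control):
    """Assuming bottom-up processing, return the deepest level not yet done."""
    for level in reversed(doc["levels"]):
        if level not in control["computed_levels"]:
            return level
    return None


mark_computed(control_doc, "village")
```

Keeping the control information separate from the hierarchy definition lets several hierarchies share the same bookkeeping logic, with one control document per hierarchy.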

The Map-Reduce jobs will be developed to access and process the data available from each node (e.g. to compute the aggregation at each node and level and to create summaries at each node) and to populate the summary data for each level in HDFS.
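
The per-level summaries could be produced by rolling results up one level at a time, as in the hedged Python sketch below. It runs locally and assumes that every reporting node sits at the deepest level; the NODES table, the summary fields and the tab-separated line format are hypothetical. On a real cluster each pass would be a separate job that writes its output to HDFS.

```python
from collections import defaultdict

# Hypothetical nodes: id -> (level number, parent id). Level 0 is the root.
NODES = {
    "district_1": (0, None),
    "town_x": (1, "district_1"),
    "town_y": (1, "district_1"),
    "village_a": (2, "town_x"),
    "village_b": (2, "town_x"),
    "village_c": (2, "town_y"),
}


def summarize_by_level(leaf_summaries):
    """Roll summaries up one level at a time, keeping one result set per level."""
    deepest = max(level for level, _ in NODES.values())
    per_level = {deepest: dict(leaf_summaries)}
    for level in range(deepest, 0, -1):
        parents = defaultdict(lambda: {"sum": 0, "count": 0})
        for node, summary in per_level[level].items():
            parent = NODES[node][1]
            parents[parent]["sum"] += summary["sum"]
            parents[parent]["count"] += summary["count"]
        per_level[level - 1] = dict(parents)
    return per_level


def summary_lines(level, summaries):
    """Format one level's summaries as tab-separated lines (assumed layout)."""
    return [
        f"{level}\t{node}\t{s['sum']}\t{s['count']}"
        for node, s in sorted(summaries.items())
    ]


# Illustrative summaries already computed at the lowest level.
leaf_summaries = {
    "village_a": {"sum": 13, "count": 2},
    "village_b": {"sum": 5, "count": 1},
    "village_c": {"sum": 7, "count": 1},
}
per_level = summarize_by_level(leaf_summaries)
```

Because SUM and COUNT are associative, each level can be derived from the summaries of the level below it without rereading the raw records.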

The agencies (government, non-government) can publish the schemes (e.g. On...