
Dynamic Control over Operator Interconnection and Data Routing in Stream Computing Applications

IP.com Disclosure Number: IPCOM000238360D
Publication Date: 2014-Aug-20
Document File: 6 page(s) / 118K

Publishing Venue

The IP.com Prior Art Database

Abstract

Implementation patterns that allow the paths along which data can flow in stream computing applications to be changed by control commands issued at runtime.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 22% of the total text.



Disclosed is a framework-like collection of implementation patterns. Their purpose is to allow a highly flexible implementation of stream computing applications. The idea is to issue control commands at runtime to influence the structure of the processing logic, instead of modifying the source code and recompiling or even restarting the application.

To date, stream computing applications usually consist of functional building blocks that can be "plugged" together to form an overall streaming application. The functional blocks fulfill tasks such as transforming, enriching, joining or aggregating the data that flow through them. The way the different building blocks are connected with each other defines the paths along which data can flow. This principle is more or less the same in any modern stream computing system, e.g. Apache's "Storm"*) or IBM's "InfoSphere Streams"**). This article uses terms and idioms derived from the documentation of InfoSphere Streams1), but the disclosed idea is not limited to that specific product and also applies to any other state-of-the-art stream computing system.

The problem is that in stream computing applications the different functional blocks (called "operators") are usually interconnected by predefined "streams" to form the overall data flow (called the "graph") in a relatively static manner. The graph defines the paths, i.e. the streams on which data can potentially flow between different operators, and it cannot be changed at runtime. Whenever the structure of the data-flow graph is to be modified, it is necessary to change the source code, recompile and restart the application.

The solution to this problem is to prepare in advance every data stream between operators in the application that might ever be useful, and to implement a control stream that can enable or disable each of these paths individually, as needed, at any time while the application is running.

One can think of this as a map of roads between several cities. The roads (streams) are generally available (defined in the graph) and each always connects two cities (operators) with each other, but each road can be closed completely (operator implementing switch functionality) or just for specific cars or groups of cars (operator implementing dynamic filter functionality) by the police (special control stream). Even a police car (control command as an extension to the data stream) driving down the road (stream) can cause a road to be closed or opened by broadcasting the command via the radio (when the command is extracted by an operator and injected into the control stream).
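The road-map analogy can be illustrated with a minimal sketch in plain Python rather than in a stream computing language. It is not the disclosed implementation: the class names ControlCommand, ControlStream and ControlledPath, the command fields, the action names and the path names "A->B" and "A->C" are all assumptions made for illustration, and the command layout is not the CTRL schema of the disclosure. The sketch shows predefined paths that are opened, closed or filtered per group by commands on a shared control stream, including a command that travels inside a data tuple and is re-injected by the receiving operator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class ControlCommand:
    # Hypothetical control tuple; NOT the CTRL schema of the disclosure.
    target: str                  # name of the path (road) being addressed
    action: str                  # "open", "close", "block" or "unblock"
    group: Optional[str] = None  # tuple group affected by block/unblock


class ControlStream:
    """Global control stream: broadcasts each command to every subscriber."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[ControlCommand], None]] = []

    def subscribe(self, handler: Callable[[ControlCommand], None]) -> None:
        self._handlers.append(handler)

    def inject(self, command: ControlCommand) -> None:
        for handler in self._handlers:
            handler(command)


class ControlledPath:
    """A predefined stream guarded by a switch and a dynamic filter."""

    def __init__(self, name: str, downstream: Callable[[Dict], None],
                 control: ControlStream) -> None:
        self.name = name
        self.downstream = downstream
        self.is_open = True             # switch state (road open or closed)
        self.blocked: Set[str] = set()  # filter state (groups turned away)
        control.subscribe(self.on_control)

    def on_control(self, cmd: ControlCommand) -> None:
        if cmd.target != self.name:
            return  # the command addresses another path
        if cmd.action == "open":
            self.is_open = True
        elif cmd.action == "close":
            self.is_open = False
        elif cmd.action == "block" and cmd.group is not None:
            self.blocked.add(cmd.group)
        elif cmd.action == "unblock" and cmd.group is not None:
            self.blocked.discard(cmd.group)

    def submit(self, tup: Dict) -> bool:
        """Forward the tuple if the path currently admits it."""
        if not self.is_open or tup.get("group") in self.blocked:
            return False
        self.downstream(tup)
        return True


def demo():
    control = ControlStream()
    received_b, received_c = [], []

    def operator_b(tup: Dict) -> None:
        received_b.append(tup["id"])
        cmd = tup.get("ctrl")  # in-band command (the "police car")
        if cmd is not None:
            control.inject(cmd)  # broadcast it on the control stream

    def operator_c(tup: Dict) -> None:
        received_c.append(tup["id"])

    path_ab = ControlledPath("A->B", operator_b, control)
    path_ac = ControlledPath("A->C", operator_c, control)

    def operator_a(tup: Dict) -> None:
        # Operator A offers every tuple to both prepared paths.
        path_ab.submit(tup)
        path_ac.submit(tup)

    operator_a({"id": 1, "group": "trucks"})
    control.inject(ControlCommand("A->C", "close"))            # close a road
    operator_a({"id": 2, "group": "trucks"})
    control.inject(ControlCommand("A->B", "block", "trucks"))  # filter a group
    operator_a({"id": 3, "group": "trucks"})
    operator_a({"id": 4, "group": "cars",
                "ctrl": ControlCommand("A->C", "open")})       # in-band reopen
    operator_a({"id": 5, "group": "cars"})
    return received_b, received_c


if __name__ == "__main__":
    print(demo())  # prints ([1, 2, 4, 5], [1, 4, 5])
```

Both paths exist from the start and only their admission state changes, so no rewiring or restart is needed. Tuple 4 reaches operator C only because operator B is served first in this sketch and re-injects the embedded command before the tuple is offered to the second path.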

Control Stream Injection Point (CSIP)

The basis is the introduction of a global control stream. It must be connected to all of the pattern-specific operators described below and uses a generic protocol based on a common control schema (CTRL). An example for a suggested schema and...