
Resilient Task Allocation Mechanism for Stream Processing Applications

IP.com Disclosure Number: IPCOM000242806D
Publication Date: 2015-Aug-19
Document File: 6 page(s) / 133K

Publishing Venue

The IP.com Prior Art Database

Related People

Thomas Locher: INVENTOR [+2]

Abstract

Stream processing is concerned with the processing of data that enters a system in the form of unbounded data streams. A stream processing system enables a user to perform computations on data that arrives at high rates and to output results continuously or periodically. As such, stream processing systems can be used for various (industrial) applications such as performance monitoring, state analysis, and anomaly detection, among others. Given the criticality of these applications, stream processing systems are required to provide high availability and scalability, which is typically achieved by distributing the (continuous) computation over multiple interconnected computers. For this purpose, applications are broken down into a set of interconnected processing elements (PEs), which are logical entities that take some data streams as input, perform specific computations on the data items of these streams, and output results in the form of one or more data streams. Scalability is achieved by adjusting the number of physical resources (computers) onto which the PEs are deployed. High availability is achieved by making the system fault tolerant: if a physical machine becomes inoperative, the PEs running on this machine are re-spawned on another machine.

Given a cluster of machines, a crucial problem is how to allocate the PEs to the machines. Well-known stream processing platforms, such as Storm or S4, allocate them either randomly or in a round-robin fashion, typically resulting in sub-optimal allocations. To make matters worse, the detection of a machine failure and the migration of PEs to other machines take tens of seconds, which is not good enough for several applications, such as real-time system monitoring, where system reconfiguration must be performed as quickly and efficiently as possible. Thus, in particular for industrial applications, there is a need for a solution that allocates PEs efficiently and that is able to cope well with dynamic changes, such as machine failures or the addition of a new machine, by reassigning PEs quickly without incurring a large (communication) overhead.

We propose a novel mechanism to maintain a close-to-optimal allocation of PEs to available resources, even in the face of abrupt changes. The mechanism exploits the static nature of the processing tasks to minimize the coordination needed between the machines once a failure (or an addition) of a machine has been detected. As a result, the reconfiguration delay, and thereby the data loss, is minimized. To the best of our knowledge, no available streaming platform offers an efficient mechanism to dynamically allocate PEs in the face of changes in order to maximize performance and system reliability. Moreover, there is no patent that solves the problems tackled by this invention in a similar fashion.

This mechanism has the following benefits:

• Stream processing platforms can be employed in a wide range of applications, in particular also in automation and control.
• Competitive advantage, as new products can benefit from the fast progress in general stream processing technology.
• For any given stream processing system, the mechanism adds value by optimizing its performance dynamically at a low overhead. The system can operate closer to the maximum performance.
• Hardware requirements can be further minimized, i.e., hardware costs can be reduced.
• Product costs are lower due to lower hardware costs, e.g., fewer devices or low-end hardware.
• Complicated failover and redundancy mechanisms can be replaced with the new mechanism.
• Lower product development cost due to simpler failover and redundancy mechanisms.
• Lower engineering cost due to a redundancy mechanism that is easier to configure.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 24% of the total text.


Published on:

2015-08-19

Type of document

Invention Disclosure

Inventors

Thomas Locher

Raphael Eidenbenz

Resilient Task Allocation Mechanism for Stream Processing Applications

1 Background

Stream processing is concerned with the processing of data that enters a system in the form of unbounded data streams. A stream processing system enables a user to perform computations on data that arrives at high rates and to output results continuously or periodically. As such, stream processing systems can be used for various (industrial) applications such as performance monitoring, state analysis, and anomaly detection, among others. Given the criticality of these applications, stream processing systems are required to provide high availability and scalability, which is typically achieved by distributing the (continuous) computation over multiple interconnected computers. For this purpose, applications are broken down into a set of interconnected processing elements (PEs), which are logical entities that take some data streams as input, perform specific computations on the data items of these streams, and output results in the form of one or more data streams. Scalability is achieved by adjusting the number of physical resources (computers) onto which the PEs are deployed. High availability is achieved by making the system fault tolerant: if a physical machine becomes inoperative, the PEs running on this machine are re-spawned on another machine.
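To make the decomposition concrete, the following minimal Python sketch models PEs and machines and re-spawns the PEs of a failed machine on the surviving ones. It is purely illustrative: the class and function names (ProcessingElement, Machine, respawn_failed) and the least-loaded placement policy are assumptions made for this example, not part of the disclosure.

# Illustrative sketch (not from the disclosure): a minimal model of processing
# elements (PEs) deployed onto machines, with failed PEs re-spawned elsewhere.
from dataclasses import dataclass, field


@dataclass
class ProcessingElement:
    """A logical operator: consumes input streams, emits output streams."""
    name: str
    inputs: list[str]     # names of the streams this PE consumes
    outputs: list[str]    # names of the streams this PE produces


@dataclass
class Machine:
    """A physical resource hosting zero or more PEs."""
    host: str
    pes: list[ProcessingElement] = field(default_factory=list)
    alive: bool = True


def respawn_failed(machines: list[Machine]) -> None:
    """Move the PEs of failed machines onto machines that are still alive."""
    survivors = [m for m in machines if m.alive]
    if not survivors:
        raise RuntimeError("no machine left to host the processing elements")
    for m in machines:
        if not m.alive:
            for pe in m.pes:
                # naive policy: place the PE on the least-loaded surviving machine
                target = min(survivors, key=lambda s: len(s.pes))
                target.pes.append(pe)
            m.pes.clear()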

2 Statement of problem

Given a cluster of machines, a crucial problem is how to allocate the PEs to the machines. Well-known stream processing platforms, such as Storm [1] or S4 [2], allocate them either randomly or in a round-robin fashion, typically resulting in sub-optimal allocations. To make matters worse, the detection of a machine failure and the migration of PEs to other machines take tens of seconds, which is not good enough for several applications, such as real-time system monitoring, where system reconfiguration must be performed as quickly and efficiently as possible.

Thus, in particular for industrial applications, there is a need for a solution that allocates PEs efficiently and that is able to cope well with dynamic changes, such as machine failures or the addition of a new machine, by reassigning PEs quickly without incurring a large (communication) overhead.

[1] See http://storm-project.net/.
[2] See http://incubator.apache.org/s4/.
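For illustration, the round-robin placement criticised above can be expressed in a few lines of Python. This is a simplified sketch of the baseline behaviour, not code taken from Storm or S4, and the PE and host names are invented; it merely shows that such a policy is oblivious to PE load and to which PEs exchange data heavily.

# Simplified illustration of the baseline: round-robin placement that ignores
# PE load and communication patterns, which is why it tends to be sub-optimal.
from itertools import cycle


def round_robin_allocation(pe_names: list[str], hosts: list[str]) -> dict[str, str]:
    """Assign each processing element to the next host in turn."""
    assignment: dict[str, str] = {}
    host_cycle = cycle(hosts)
    for pe in pe_names:
        assignment[pe] = next(host_cycle)
    return assignment


# Two heavily communicating PEs can easily end up on different hosts,
# and a heavyweight PE is treated the same as a lightweight one.
print(round_robin_allocation(["parse", "filter", "aggregate", "alert"],
                             ["host-1", "host-2"]))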





3 Novel features

We propose a novel mechanism to maintain a close-to-optimal allocation of PEs to available resources, even in the face of abrupt changes.

The mechanism exploits the static nature of the processing tasks to minimize the coordination needed between the machines once a failure (or an addition) of a machine has been detected. As a result, the reconfiguration delay, and thereby the data loss, is minimized.
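The abbreviated text does not spell out how the static task set is exploited, but one way to read the remark is that each machine can hold a precomputed failover plan, so that once a failure is detected the survivors simply apply the new placement without a further round of coordination. The Python sketch below illustrates this reading only; the function precompute_failover_plans and the least-loaded placement policy are hypothetical and are not taken from the disclosure.

# Hypothetical illustration: because the set of PEs is static, a failover plan
# for every potential machine failure can be computed ahead of time, so no
# coordination between survivors is needed after a failure is detected.


def precompute_failover_plans(assignment: dict[str, str],
                              hosts: list[str]) -> dict[str, dict[str, str]]:
    """For every host that might fail, compute where its PEs would move."""
    plans: dict[str, dict[str, str]] = {}
    for failed in hosts:
        survivors = [h for h in hosts if h != failed]
        if not survivors:
            continue  # single-host cluster: nowhere to fail over to
        load = {h: sum(1 for v in assignment.values() if v == h) for h in survivors}
        plan: dict[str, str] = {}
        for pe, host in assignment.items():
            if host == failed:
                # place the orphaned PE on the currently least-loaded survivor
                target = min(survivors, key=lambda h: load[h])
                plan[pe] = target
                load[target] += 1
        plans[failed] = plan
    return plans


# On failure of "host-1", every survivor already knows the new placement:
plans = precompute_failover_plans(
    {"parse": "host-1", "filter": "host-2", "aggregate": "host-1"},
    ["host-1", "host-2", "host-3"])
print(plans["host-1"])   # {'parse': 'host-3', 'aggregate': 'host-2'}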

To the best of our knowledge, no available streaming platform offers an efficient mechanism to dyna...