Data tagging optimization for heterogeneous dynamic streaming analytic graphs

IP.com Disclosure Number: IPCOM000248594D
Publication Date: 2016-Dec-20
Document File: 2 page(s) / 44K

Publishing Venue

The IP.com Prior Art Database

Abstract

This article presents a method for optimizing data tagging in streaming applications that require dynamic reconfiguration and are bound to a single communication channel.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

When an application uses graph stream processing that spans (and is dynamically split across) a plurality of areas, synchronization must be taken care of. The most obvious way is to tag the data, so that the system knows which topology it belongs to.

Imagine a topology that has two regions, an edge part and a cloud part. At the beginning, only Flow1 is defined and running. This flow is split into two parts, each of which executes within its own region. The regions are connected by a single channel. To update the topology dynamically (without stopping the system), both the "old" flow and the "new" flow (the old flow after changes, Flow1') must run at the same time for some duration. Since there is only a single channel, data needs to be tagged as belonging to the old or the new topology so that data processing stays fully consistent. A multiplexer takes the output of Flow1 and Flow1' and passes the data to the communication channel. On the other end of the channel, the first processing step is a demultiplexer, which looks at the tag in each arriving data chunk and directs the chunk to either the Flow1 or the Flow1' part.
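The tagged multiplexer/demultiplexer pair described above can be illustrated with a minimal sketch. The names Chunk, multiplex and demultiplex, and the round-robin interleaving, are assumptions made for illustration only; the disclosure does not prescribe a data format or a scheduling policy.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, List


@dataclass(frozen=True)
class Chunk:
    """One data chunk on the shared channel (hypothetical format)."""
    tag: str      # topology identifier, e.g. "Flow1" or "Flow1'"
    payload: Any


def multiplex(sources: Dict[str, Iterable[Any]]) -> Iterator[Chunk]:
    """Interleave several flow outputs onto one channel, tagging every chunk.

    Round-robin interleaving is an assumption; any fair policy would do.
    """
    iterators = {tag: iter(src) for tag, src in sources.items()}
    while iterators:
        for tag in list(iterators):
            try:
                yield Chunk(tag, next(iterators[tag]))
            except StopIteration:
                # This flow has no more data; stop polling it.
                del iterators[tag]


def demultiplex(channel: Iterable[Chunk]) -> Dict[str, List[Any]]:
    """Route each arriving chunk to the flow part named by its tag."""
    routed: Dict[str, List[Any]] = {}
    for chunk in channel:
        routed.setdefault(chunk.tag, []).append(chunk.payload)
    return routed


# Old and new flow running side by side over a single channel.
channel = multiplex({"Flow1": [1, 2, 3], "Flow1'": [10, 20]})
routed = demultiplex(channel)
```

Because every chunk carries its tag, the receiving side never has to guess, which is what makes the scheme robust and also what makes it costly.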

This tagging mechanism is simple and robust, but it introduces overhead. The overhead serves no purpose during "normal" topology operation; it is needed only during a dynamic topology change, which is short and infrequent. For a stream of sensor readings, this overhead may be significant (up to 100%).
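A quick calculation shows how the overhead depends on chunk size. The byte sizes below are assumed values chosen for illustration, not figures from the disclosure; the 100% case corresponds to a tag that is as large as the reading it accompanies.

```python
def tag_overhead(payload_bytes: int, tag_bytes: int) -> float:
    """Return the tag size as a fraction of the payload size."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return tag_bytes / payload_bytes


# Assumed sizes: a 4-byte sensor reading versus a 1 KiB batch, each with a 4-byte tag.
small_reading = tag_overhead(payload_bytes=4, tag_bytes=4)
large_batch = tag_overhead(payload_bytes=1024, tag_bytes=4)
```

Under these assumptions the small reading carries 100% overhead, while the large batch carries under 0.4%, so the optimization matters most for streams of small chunks.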

In a topology like the one in this example, the tagging overhead can be drastically limited by using a "learning" demultiplexer. If data with a tag arrives, it is processed normally: it goes to the corresponding output port. Additionally, the demultiplexer records the data rate on each of its outputs. In a normal "switching" scenario, output 1, being part of the graph that is being quiesced, has a decreasing data rate, while output 2 has an increasing data rate. So, when a tagless packet arrives, the demultiplexer passes it to the "more probable" output, the one with the higher data rate. This mechanism can of course be tuned: such "auto-switching" may start, for example, after the other port has been inactive for a certain time or after a given number of data chunks has been processed.
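One possible reading of the "learning" demultiplexer is sketched below. It approximates the per-output data rate by counting tags in a sliding window of recent tagged chunks, rather than by wall-clock measurement. The class name, the window and min_tagged parameters, and the choice to let only tagged chunks update the rates are all assumptions made for illustration.

```python
from collections import Counter, deque
from typing import Any, Deque, Dict, Iterable, List, Optional


class LearningDemultiplexer:
    """Routes tagged chunks by tag and tagless chunks by recent output rate."""

    def __init__(self, outputs: Iterable[str], window: int = 100,
                 min_tagged: int = 10) -> None:
        self.outputs: Dict[str, List[Any]] = {name: [] for name in outputs}
        # Tags of the most recent tagged chunks; stands in for "data rate".
        self.recent: Deque[str] = deque(maxlen=window)
        # Tuning knob: tagged chunks required before auto-switching starts.
        self.min_tagged = min_tagged
        self.seen_tagged = 0

    def route(self, payload: Any, tag: Optional[str] = None) -> str:
        """Deliver one chunk and return the name of the chosen output."""
        if tag is not None:
            if tag not in self.outputs:
                raise KeyError(f"unknown tag: {tag!r}")
            # Tagged data is processed normally and updates the statistics.
            self.recent.append(tag)
            self.seen_tagged += 1
            target = tag
        else:
            if self.seen_tagged < self.min_tagged:
                raise RuntimeError("tagless chunk arrived before learning finished")
            # Pick the "more probable" output: the busiest one in the window.
            target = Counter(self.recent).most_common(1)[0][0]
        self.outputs[target].append(payload)
        return target


demux = LearningDemultiplexer(["Flow1", "Flow1'"], window=8, min_tagged=4)
for i in range(3):                      # old flow winding down
    demux.route(f"old-{i}", tag="Flow1")
for i in range(6):                      # new flow ramping up
    demux.route(f"new-{i}", tag="Flow1'")
target = demux.route("untagged")        # no tag: routed by learned rates
```

In this run the window holds two Flow1 tags and six Flow1' tags when the tagless chunk arrives, so it is delivered to Flow1'. A time-based window or an inactivity timeout on the other port would be equally valid tunings.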

For this mechanism to yield any optimization, the "sending" part needs to stop tagging the data at some point. This may be controlled by the data rates on the multiplexer inputs, by a certain delay after source switching, or by any other suitable algorithmic method.
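As a rough sketch of the sending side, the multiplexer below stops tagging new-flow chunks once the old flow has been silent for a configurable number of chunks. The class AdaptiveTagger, the quiet_threshold parameter, and the rule that tagging resumes if the old flow produces data again are hypothetical choices; the disclosure leaves the stopping criterion open.

```python
from typing import Any, Optional, Tuple


class AdaptiveTagger:
    """Tags outgoing chunks only while the old flow is still active."""

    def __init__(self, old_tag: str, new_tag: str, quiet_threshold: int = 5) -> None:
        self.old_tag = old_tag
        self.new_tag = new_tag
        # Number of consecutive new-flow chunks after which tagging stops.
        self.quiet_threshold = quiet_threshold
        self.new_since_old = 0
        self.tagging = True

    def emit(self, source_tag: str, payload: Any) -> Tuple[Optional[str], Any]:
        """Return (tag or None, payload) as it would be put on the channel."""
        if source_tag == self.old_tag:
            # The old flow is still producing: keep (or resume) tagging.
            self.new_since_old = 0
            self.tagging = True
            return (source_tag, payload)
        if source_tag != self.new_tag:
            raise ValueError(f"unknown source: {source_tag!r}")
        self.new_since_old += 1
        if self.new_since_old > self.quiet_threshold:
            # The old input has been silent long enough: drop the tag.
            self.tagging = False
        return (self.new_tag if self.tagging else None, payload)


tagger = AdaptiveTagger("Flow1", "Flow1'", quiet_threshold=3)
sent = [tagger.emit("Flow1", 0)]
sent += [tagger.emit("Flow1'", i) for i in range(1, 6)]
```

With a threshold of 3, the first three new-flow chunks are still tagged and the fourth and fifth go out without a tag, at which point the receiving side has to rely on its learned rates.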

Sending side. When a top...