
System and method to persist and reuse of runtime knowledge in ETL jobs to avoid deadlocks and improve runtime performance

IP.com Disclosure Number: IPCOM000243959D
Publication Date: 2015-Nov-02
Document File: 2 page(s) / 53K

Publishing Venue

The IP.com Prior Art Database

Abstract

Persistence of score for running ETL jobs: disclosed is a method for persisting and reusing runtime knowledge when running ETL jobs with data integration tools.

This is the abbreviated version, containing approximately 60% of the total text.



Problem Description
1) Executing an ETL job begins with planning how the job is to be run: how many processes to start, how many nodes to use, which processes run on which nodes, how the processes are connected, and so on. In this document we call this plan the score. Generating the score takes a considerable amount of time because it involves many decisions, such as where to insert partition or sort operators, and it is repeated at the start of every run of the job. Our solution proposes to save the time spent generating the score on every run by saving the score once it has been generated and reusing it until it becomes invalid.
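
The save-and-reuse idea in item 1 can be sketched as a small on-disk cache. This is only an illustration under stated assumptions, not the disclosed implementation: the function names, the JSON file layout, and the `generate_score` callback are hypothetical, and the invalidation rule (a hash over the job design and the node configuration) is an assumption, since the text above does not say what invalidates a score.

```python
import hashlib
import json
import os


def fingerprint(job_design: dict, config: dict) -> str:
    """Hash the inputs the score is ASSUMED to depend on.

    The disclosure does not list what invalidates a score; using the job
    design plus node configuration is an assumption made for this sketch.
    """
    payload = json.dumps({"job": job_design, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_or_generate_score(job_design, config, cache_dir, generate_score):
    """Return (score, reused).

    `generate_score` is a hypothetical callback standing in for the
    expensive planning step; it is called only when no saved score
    matches the current fingerprint.
    """
    path = os.path.join(cache_dir, fingerprint(job_design, config) + ".json")
    if os.path.exists(path):
        # A score saved under the same fingerprint is treated as still valid.
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f), True
    # No valid saved score: plan the job once and persist the result.
    score = generate_score(job_design, config)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(score, f)
    return score, False
```

In this sketch a changed job design or configuration produces a different fingerprint, so the old file is simply never matched again; a real system might also need to delete stale scores or track other invalidating events.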

2) In our work at [*], we proposed a solution that avoids deadlocks and slowdowns during the run time of ETL jobs by introducing buffer operators when a deadlock or slowdown is detected while a job is running. This is done every time a deadlock or slowdown happens, so the time spent waiting for buffer operators to be inserted to recover from the deadlock or slowdown is wasted on every run. If we persist the information about the injected buffer operators in the score and reuse it every time the job runs, we do not have to wait for the deadlock to occur and introduce the operators again.
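
Item 2 amounts to writing the run-time buffer insertions back into the score so that later runs start with them in place. A minimal sketch follows, assuming a score is a dictionary with a `links` list of operator pairs and a `buffers` list; both the layout and the function names are hypothetical, and deadlock detection itself (the work cited as [*]) is outside the sketch.

```python
import copy


def record_buffer(score: dict, upstream: str, downstream: str) -> dict:
    """Return a copy of the score that remembers a buffer on one link.

    Intended to be called after a buffer operator was injected at run time
    on the link upstream -> downstream. The input score is not modified.
    """
    updated = copy.deepcopy(score)
    link = [upstream, downstream]
    buffers = updated.setdefault("buffers", [])
    if link not in buffers:  # recording the same link twice has no effect
        buffers.append(link)
    return updated


def links_with_buffers(score: dict) -> list:
    """Expand the score's links, routing remembered links through a buffer."""
    buffered = {tuple(b) for b in score.get("buffers", [])}
    expanded = []
    for up, down in score["links"]:
        if (up, down) in buffered:
            name = f"buffer({up}->{down})"
            expanded.append((up, name))
            expanded.append((name, down))
        else:
            expanded.append((up, down))
    return expanded
```

On the next run, a job started from the updated score would build its connections from `links_with_buffers`, so the buffer is present before any deadlock or slowdown can occur.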

Our solution helps reduce the overall ETL job run time in both of the above situations.

Solution


Ge...