Ability to assist DataStage developer while designing DataStage Job with the next possible Stage to use.

IP.com Disclosure Number: IPCOM000236194D
Publication Date: 2014-Apr-11
Document File: 2 page(s) / 50K

Publishing Venue

The IP.com Prior Art Database

Abstract

A complex DataStage Project often consists of many DataStage Jobs, typically designed by multiple developers. While designing Jobs, developers end up creating similar, if not identical, Job Stages with almost the same configuration details, for example an XML Stage configured with a parser and a parameterized XPath query, or a Build-Op stage with embedded C/C++ execution code. Pre-configured stages can be reused by publishing them to DataStage Server/Parallel Shared Containers, but the developer still has to find the right shared container among a collection of Shared Containers across projects. Choosing the right Stage becomes harder when there are thousands of Jobs and a large number of DataStage Projects in a given Information Server DataStage installation.

It would help DataStage developers if the next possible Stage to use were suggested automatically while they design a Job. For example, if a developer has configured an XML Stage to parse financial data in Job1 of Project1, the same configured XML Stage can be suggested automatically to the developer designing Job2 of Project2, without the developer having to search for it.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Algorithm for design implementation

The design uses a repository data model containing the possible stage types for a given stage. For example, for a BuildOp stage, the repository data model would store the map of all possible stages that can be linked to BuildOp. It also uses a repository model containing a map of the Stages used across Jobs and across Projects: each source Stage is the key, and the Stages found on its output links, as parsed from all Jobs, are the value(s). The values are ranked by the number of times each Stage is used on the output link.

For example, for the Sequential File Stage, if the Java Integration Stage is used on the output link 54 times across jobs/projects, BuildOp is used 32 times, and Peek is used 18 times, then the Java Integration Stage is ranked 1, followed by the BuildOp stage with Rank 2 and Peek with Rank 3. The Stage with Rank 1 is the one the developer will most probably use, so it is shown as the first item in a drop-down or on the UI screen as the next stage, followed by the other stages in rank order.
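The ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the disclosure does not specify an implementation, and the function names `build_stage_map` and `ranked_next_stages`, the (source, output) pair input format, and the alphabetical tie-break are all hypothetical.

```python
from collections import Counter, defaultdict


def build_stage_map(links):
    """Count output-link usage per source stage.

    `links` is an iterable of (source_stage, output_stage) pairs, one pair
    per output link found in any Job of any Project (assumed input format).
    """
    counts = defaultdict(Counter)
    for source, target in links:
        counts[source][target] += 1
    return counts


def ranked_next_stages(counts, source):
    """Return (rank, stage, use_count) tuples, most-used stage first."""
    # Sort by descending use count; ties are broken alphabetically
    # (the tie-break rule is an assumption, not taken from the disclosure).
    items = sorted(counts.get(source, {}).items(),
                   key=lambda kv: (-kv[1], kv[0]))
    return [(rank, stage, n) for rank, (stage, n) in enumerate(items, start=1)]


# Usage counts taken from the Sequential File example above.
links = ([("Sequential File", "Java Integration")] * 54
         + [("Sequential File", "BuildOp")] * 32
         + [("Sequential File", "Peek")] * 18)
counts = build_stage_map(links)
ranking = ranked_next_stages(counts, "Sequential File")
for rank, stage, n in ranking:
    print(rank, stage, n)
```

The ranked list is what a drop-down would display top to bottom; a source stage with no recorded output links simply yields an empty list.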

The repository can be properties file(s) as well, storing the map of Stage to Possible Stages. When the DataStage designer launches, this data model is fetched from the repository and cached locally for quicker access.
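As a rough sketch of the properties-file option with a local cache: the disclosure does not define a file layout, so the line format `Stage=NextStage:count,NextStage:count`, the function names `parse_stage_properties` and `load_stage_map`, and the use of an in-process cache are all assumptions made for illustration.

```python
from functools import lru_cache


def parse_stage_properties(text):
    """Parse lines of the assumed form 'Stage=Next1:count,Next2:count'."""
    stage_map = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        stage, _, value = line.partition("=")
        entries = []
        for item in value.split(","):
            if not item.strip():
                continue  # tolerate an empty value or a trailing comma
            name, _, count = item.rpartition(":")
            entries.append((name.strip(), int(count)))
        # Most-used stage first, matching the ranking described earlier.
        entries.sort(key=lambda e: -e[1])
        stage_map[stage.strip()] = entries
    return stage_map


@lru_cache(maxsize=None)
def load_stage_map(path):
    """Read the properties file once; later calls reuse the cached map."""
    with open(path, encoding="utf-8") as fh:
        return parse_stage_properties(fh.read())
```

Calling `load_stage_map` at designer launch would read the file a single time; subsequent lookups during Job design hit the cached dictionary rather than the repository.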

Details


Depending on the current Stage's metadata, such as the stage type and the possible stages that can be linked to it, a component that runs as a background thread within the Job designer would search for matching po...