
Conflictless simultaneous data gathering in distributed environment using asynchronous remote script executing.

IP.com Disclosure Number: IPCOM000234032D
Publication Date: 2014-Jan-08
Document File: 4 page(s) / 132K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a framework for gathering data in a distributed environment based on remote script execution. It solves concurrency issues that arise when many data gathering applications try to execute the same script at the same time in a given file system location. A naming convention for file suffixes is proposed that is meaningful (non-random) and allows for full recovery of the script execution context.

This is the abbreviated version, containing approximately 52% of the total text.



Problem description

Host configuration data (such as operating system details or installed services) can be discovered using a technique called remote script execution. The process is depicted in Figure 1.

Figure 1

The detailed steps are as follows:

The discovering application first prepares a script containing all the logic necessary to discover a server of known type (e.g. WebSphere Application Server).


The script is copied to the server over the network. Usually an SSH connection is used.

The script is executed on the remote server and gathers the data to an output file stored on the disk.

The output file is copied back over the network to the discovering server and then further processed locally.
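
The steps above can be sketched as a short sequence of command lines. This is a minimal illustration only: it assumes the standard `scp` and `ssh` command-line clients, a script that takes its output path as its first argument, and hypothetical host and file names. The commands are built and printed, not executed.

```python
from typing import List


def build_discovery_commands(
    target: str,
    local_script: str,
    remote_script: str,
    remote_output: str,
    local_output: str,
) -> List[List[str]]:
    """Return the scp/ssh command lines for one remote-script discovery run.

    All parameter names are hypothetical; nothing is executed here.
    """
    return [
        # Copy the prepared script to the remote server.
        ["scp", local_script, f"{target}:{remote_script}"],
        # Run the script remotely; it writes its data to remote_output
        # (assumption: the script accepts the output path as an argument).
        ["ssh", target, "sh", remote_script, remote_output],
        # Copy the output file back for local processing.
        ["scp", f"{target}:{remote_output}", local_output],
    ]


commands = build_discovery_commands(
    target="discovery@server.example",
    local_script="discover_was.sh",
    remote_script="/tmp/discover_was.sh",
    remote_output="/tmp/discover_was.out",
    local_output="discover_was.out",
)
for argv in commands:
    print(" ".join(argv))
```

Note that the remote file names in this sketch are fixed, so two runs against the same server would use the same files.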

The described procedure works well in most cases, but in some scenarios files can overwrite one another on the remote server. All threats are listed in Figure 2.

Figure 2


Issues #1 to #3 arise from multiple discovery processes, which can run concurrently on the same or different machines. Issues #4 to #6 are caused by multiple services that ultimately use the same file system. In each case the same file name may be used by different discovery processes at the same time, leading to:

general input/output failure

overwriting of the script file

overwriting of the output file

corrupted file content (mixed data from different sources)

inability to determine the proper log file name (after a system failure)
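
The output-file overwrite can be reproduced locally with a small sketch. It is a sequential simplification of the concurrent case: two hypothetical discovery runs write to the same fixed file name in a shared temporary directory, and only the last write survives. The function and file names are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def run_discovery(shared_dir: Path, discoverer: str) -> Path:
    """Simulate one discovery run writing to a fixed, shared output name."""
    output = shared_dir / "discovery.out"  # same name for every run
    output.write_text(f"data gathered for {discoverer}\n")
    return output


with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp)
    first = run_discovery(shared, "discoverer-A")
    second = run_discovery(shared, "discoverer-B")
    # Both runs point at the same file, and only the last write survives.
    same_file = first == second
    surviving = first.read_text()

print(same_file)
print(surviving, end="")
```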

Known solution

A common solution is to use a randomly generated UUID as part of the script, output, and log file names. A UUID is unique for all practical purposes, so it prevents files from overwriting one another. However, this solution has the following drawbacks:

Every generated UUID is different (unique), so it must be remembered in order to create the output or log file names later. In other words, even knowing the environment in which a UUID was originally generated, it is impossible to regenerate the same UUID.

A UUID is meaningless: it carries no additional information about the execution context (e.g. the discovering host address).
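
Both drawbacks can be seen in a short sketch that uses Python's standard `uuid` module. The helper name and the file name layout are assumptions made for illustration, not part of the disclosure.

```python
import uuid


def uuid_file_names(base: str) -> dict:
    """Build script/output/log names that share one random UUID suffix."""
    suffix = uuid.uuid4().hex  # random: differs on every call
    return {
        "script": f"{base}_{suffix}.sh",
        "output": f"{base}_{suffix}.out",
        "log": f"{base}_{suffix}.log",
    }


first_run = uuid_file_names("discover")
second_run = uuid_file_names("discover")

# The same inputs give different names, so the suffix must be stored
# somewhere to find the output or log file later.
print(first_run["script"] != second_run["script"])
```

The suffix itself is an opaque hexadecimal string, so nothing about the discovering host or process can be read back from the file name.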

Disclosed solution

The proposed solution is to use a file naming convention that:


guarantees file name uniqueness

allows matching script files with their corresponding output and log files

enables reverse-engineering of file names (determining the discovering host, discovering application, discovering process, application instance being discovered, etc.)
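
A convention with these three properties can be sketched as follows. The fields mirror the items named in the last point above. The underscore-separated layout, the class and function names, the IPv4 address, and the example values are all assumptions for illustration, not the format defined by the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryContext:
    """Hypothetical execution context; field names are assumptions."""
    host_ip: str          # discovering host IP (IPv4 assumed)
    app_port: int         # discovering application id (a network port)
    process_id: int       # discovering process
    target_instance: str  # application instance being discovered


def make_suffix(ctx: DiscoveryContext) -> str:
    """Derive a deterministic, human-readable suffix from the context."""
    return f"{ctx.host_ip}_{ctx.app_port}_{ctx.process_id}_{ctx.target_instance}"


def parse_suffix(suffix: str) -> DiscoveryContext:
    """Recover the context from a suffix produced by make_suffix."""
    # maxsplit=3 keeps underscores inside the instance name intact.
    host_ip, app_port, process_id, target_instance = suffix.split("_", 3)
    return DiscoveryContext(host_ip, int(app_port), int(process_id), target_instance)


ctx = DiscoveryContext("192.0.2.10", 9433, 4711, "server_1")
suffix = make_suffix(ctx)
names = {ext: f"discover_{suffix}.{ext}" for ext in ("sh", "out", "log")}

print(names["sh"])
print(make_suffix(ctx) == suffix)   # regenerating gives the same suffix
print(parse_suffix(suffix) == ctx)  # the context is recoverable
```

Unlike a random UUID, the suffix does not need to be stored: it can be recomputed from the same context, and the script, output, and log files share it.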

In this solution, the file name suffix is calculated from the following set of data, which taken together is always unique:

Discovering host IP - resolves ISSUE #3


Discovering application id (usually a network port) - r...