Conflict-free simultaneous data gathering in a distributed environment using asynchronous remote script execution.
Publication Date: 2014-Jan-08
The IP.com Prior Art Database
Disclosed is a framework for gathering data in a distributed environment based on remote script execution. It solves the concurrency issues that arise when many data gathering applications try to execute the same script at the same time in a given file system location. A naming convention for file suffixes is proposed that is meaningful (non-random) and allows full recovery of the script execution context.
Host configuration data (operating system details or installed services) can be discovered using a technique called remote script execution. The process is depicted in Figure 1.
The detailed steps are as follows:
The discovering application first prepares a script containing all the logic necessary to discover a server of a known type (e.g. WebSphere Application Server).
The script is copied to the server over the network; usually an SSH connection is used.
The script is executed on the remote server and gathers the data into an output file stored on disk.
The output file is copied back over the network to the discovering server and then further processed locally.
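The four steps above can be sketched as follows. The disclosure only states that an SSH connection is usually used, so the use of scp/ssh command lines, as well as the host name, user, and paths, are illustrative assumptions:

```python
import shlex

def build_discovery_commands(host, user, script_local, remote_dir, output_name):
    """Build the shell commands for the remote discovery flow.

    Returns the commands for steps 2-4; step 1 (preparing the script
    itself) is assumed to have produced the file at script_local.
    """
    remote_script = f"{remote_dir}/{script_local.rsplit('/', 1)[-1]}"
    remote_output = f"{remote_dir}/{output_name}"
    return [
        # Step 2: copy the discovery script to the target server.
        f"scp {shlex.quote(script_local)} {user}@{host}:{shlex.quote(remote_script)}",
        # Step 3: execute it remotely; it writes its findings to an output file.
        f"ssh {user}@{host} 'sh {remote_script} > {remote_output}'",
        # Step 4: copy the output file back for local processing.
        f"scp {user}@{host}:{shlex.quote(remote_output)} ./{output_name}",
    ]

cmds = build_discovery_commands("app01.example.com", "discover",
                                "/tmp/was_discover.sh", "/tmp", "was_discover.out")
```

Note that the remote script and output file names are fixed here; the conflict scenarios discussed next arise precisely because several discovery runs may pick the same names.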
The described procedure works well in most cases, but there are scenarios in which files can overwrite one another on the remote server. The threats are listed in Figure 2.
Issues #1 to #3 are related to running multiple discovery processes concurrently on the same or different machines. Issues #4 to #6 are caused by multiple services that ultimately use the same file system. In each case the same file name may be used by different discovery processes at the same time, leading to:
general output/input failure
script file overwrites
output file overwrites
corrupted file content (mixed data from different sources)
inability to determine the proper log file name (after a system failure)
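The overwrite hazard is easy to reproduce locally: when two discovery runs write to the same fixed file name, whichever finishes last silently replaces the other's output. A minimal sketch (the file name `discover.out` is illustrative):

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
shared = os.path.join(workdir, "discover.out")  # fixed name used by both runs

with open(shared, "w") as f:   # run #1 writes its results
    f.write("results of discovery run #1\n")
with open(shared, "w") as f:   # run #2 reuses the same name
    f.write("results of discovery run #2\n")

with open(shared) as f:
    print(f.read())            # run #1's data is gone
```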
The common solution is to use a randomly generated UUID as part of the script/output/log file names. A UUID is known to be unique, so it prevents files from being overwritten. However, this solution has the following drawbacks:
Every generated UUID is different (unique), which means the generated UUID must be remembered in order to later create the output or log file names. In other words, even knowing the environment in which a UUID was originally generated, it is impossible to regenerate the same UUID.
A UUID is meaningless - it does not carry any additional information about the execution context (e.g. the discovering host address).
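Both drawbacks can be seen in a short sketch. The context tuple and file name patterns are illustrative assumptions; the point is that the same execution context never yields the same UUID twice, so the name must be stored somewhere:

```python
import uuid

# Illustrative execution context: (discovering host IP, application, process id).
context = ("192.0.2.10", "discovery-app", 4711)

# Random UUIDs: the same context produces a different value every call,
# so a UUID-based name cannot be re-derived later -- it must be remembered.
first = uuid.uuid4()
second = uuid.uuid4()
assert first != second

script_name = f"discover_{first}.sh"   # must be remembered...
output_name = f"discover_{first}.out"  # ...to build the matching output name
```

The UUID also encodes nothing about `context`, which is exactly the second drawback the proposed naming convention addresses.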
The proposed solution is to use a file naming convention that:
guarantees file name uniqueness
allows matching script files with their corresponding output and log files
enables reverse-engineering of file names (determining: discovering host, discovering application, discovering process, application instance being discovered, etc.)
In this solution the file name suffix is calculated from the following data, whose combination is always unique:
Discovering host IP - resolves ISSUE #3
Discovering application id (usually a network port ) - r...