
Mincer: A tool for the efficient parallel determination of intermittent functional problems

IP.com Disclosure Number: IPCOM000239098D
Publication Date: 2014-Oct-10

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a problem determination tool, which, when run, selects an experimental configuration expected to produce the greatest amount of information based on the experiments that have run and any experiments that are currently running. It is primarily intended for use in a distributed machine farm to speed problem determination and it relies on a job server infrastructure to distribute its execution across machines in the machine farm.




When faults are found in large, complex systems, problem determination (i.e., debugging) can be a very difficult and time-consuming task. Generally, debugging requires the adoption of a formal experimental methodology to test specific sub-components of the failing system in order to identify which one is at fault and how it is failing to operate correctly, so that it can be fixed.

Many large, complex systems employ optimizations that add complexity and introduce sources of potential defects. The very nature of these systems can result in failures that occur only intermittently. This makes it difficult to be certain whether a particular debugging experiment has passed because the disabled optimization is involved in the failure, or simply because the conditions needed to provoke the failure did not occur. As a result, it may be necessary to re-run an experiment numerous times to reach an acceptable level of certainty that it passed because the optimizations disabled in that experiment were responsible for the failure.

A natural way to accelerate the debugging process is to run multiple experiments simultaneously. The trouble with this obvious approach is that it is wasteful of machine resources.
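
To make the cost of repetition concrete, the following Python sketch (illustrative only; it does not appear in the disclosure) shows the Bayesian update behind the repeated runs, assuming a single suspect optimization, a known intermittent reproduction rate, and a flat prior. The name posterior_after_passes and all of its parameters are hypothetical.

    # Illustrative sketch: how belief that a disabled optimization is the
    # culprit grows with each consecutive passing run.

    def posterior_after_passes(prior, repro_rate, n_passes):
        """Probability the disabled optimization caused the failure, given
        n consecutive passing runs with that optimization disabled."""
        # If the optimization is the culprit, disabling it makes every run pass.
        p_pass_if_culprit = 1.0
        # Otherwise each run passes only when the intermittent failure
        # happens not to trigger.
        p_pass_if_innocent = (1.0 - repro_rate) ** n_passes
        evidence = (prior * p_pass_if_culprit
                    + (1.0 - prior) * p_pass_if_innocent)
        return prior * p_pass_if_culprit / evidence

    # Example: a failure that reproduces 30% of the time, even prior.
    for n in (1, 5, 10):
        print(n, round(posterior_after_passes(0.5, 0.3, n), 3))

Even at a 30% reproduction rate, roughly ten consecutive passing runs are needed in this toy example before confidence exceeds 97%, which illustrates how quickly naive repetition consumes machine time.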

The solution is to select experimental parameters for each test run in order to maximize the information it is expected to provide, thus making the best possible use of the available machine resources. This approach enables the use of multiple machines to reduce the elapsed time required to isolate a defect without unduly wasting machine resources.
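
One standard way to formalize "the information a run is expected to provide" is the expected reduction in Shannon entropy over the current beliefs about which component is at fault. The sketch below, in the same illustrative spirit, computes that quantity for a candidate configuration; the dictionary-based representation and the pass likelihoods are assumptions, not details taken from the disclosure.

    import math

    def entropy(dist):
        """Shannon entropy, in bits, of a discrete distribution."""
        return -sum(p * math.log2(p) for p in dist if p > 0.0)

    def expected_gain(prior, p_pass_given_h):
        """Expected entropy reduction from one run of a candidate
        configuration. prior[h] is the current belief that component h is
        at fault; p_pass_given_h[h] is the chance the run passes if so."""
        p_pass = sum(prior[h] * p_pass_given_h[h] for h in prior)
        gain = entropy(prior.values())
        for p_outcome, likelihood in (
            (p_pass, p_pass_given_h),
            (1.0 - p_pass, {h: 1.0 - v for h, v in p_pass_given_h.items()}),
        ):
            if p_outcome > 0.0:
                # Bayes' rule: posterior belief for each hypothesis given
                # this run outcome, weighted by the outcome's probability.
                posterior = (prior[h] * likelihood[h] / p_outcome
                             for h in prior)
                gain -= p_outcome * entropy(posterior)
        return gain

    # Example: even prior over two suspect optimizations; the candidate run
    # disables A, and the failure reproduces 30% of the time otherwise.
    print(expected_gain({"A": 0.5, "B": 0.5}, {"A": 1.0, "B": 0.7}))

In this toy case the candidate run is worth about 0.17 bits; ranking every candidate configuration by such a score and running the best one is what makes each experiment count.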

The novel contribution is a problem determination tool, which, when run, selects an experimental configuration expected to produce the greatest amount of information based on the experiments that have run and any experiments that are currently running. It is primarily intended for use in a distributed machine farm to speed problem determination, and it relies on a job server infrastructure to distribute its execution across machines in the machine farm.

This problem determination tool is an executable application that sits in a layer between a job server, which allocates work to a farm of distributed slave machines, and the execution of an experimental test case on those slave machines. An instance of the tool is started by a slave machine when that machine is selected to run an experiment, and this instance uses a transactional database to optimally select experimental parameters for the test to be run, applying Bayesian inference and Shannon information theory. If no solution has yet been reached, the tool launches an experimental run using the configuration it has computed and records the result. The problem determination tool maximizes the information generated per experiment, minimizing the average number of runs required to arrive at a statistically significant...
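
Putting these pieces together, a worker-side loop consistent with the description above might look like the following sketch. It reuses expected_gain() from the previous sketch; the database methods, the CONFIDENCE threshold, and the run_experiment callable are hypothetical names, since the disclosure does not define an API.

    CONFIDENCE = 0.99  # assumed stopping threshold, not given in the disclosure

    def run_tool_instance(db, candidate_configs, run_experiment):
        """One tool instance, started on a slave machine by the job server."""
        prior = db.current_posterior()        # beliefs rebuilt from recorded runs
        if max(prior.values()) >= CONFIDENCE:
            return max(prior, key=prior.get)  # solution reached; report culprit
        # Select the configuration expected to yield the most information,
        # given the experiments that have run (and any currently running).
        best = max(candidate_configs,
                   key=lambda cfg: expected_gain(prior, cfg.pass_likelihoods))
        result = run_experiment(best)         # launch the experimental run
        db.record(best, result)               # transactional write of the outcome
        return None                           # no conclusion yet; more runs needed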