Computer Provisioning Systems
Publication Date: 2013-Mar-14
The IP.com Prior Art Database
A method for resolving failed provisions through a systematic approach based on alternative scenarios, which enables as much diagnosis as possible to be performed in an automated fashion.
In a computer provisioning system, systems are provisioned on request. When a user wants a system made available to them, they enter their requirements; if a suitable free resource is available in the pool, it is provisioned for them. When they are done with the provision, its resources are released for others to use. Machines are allocated and de-allocated regularly, so they are held only while they are needed instead of persistently taking up valuable computing resources.
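The request and release cycle described above can be pictured with a minimal sketch. The class names, the fields (`cpus`, `memory_gb`), and the first-fit matching rule are assumptions made for illustration only; the disclosure does not specify how requirements are expressed or matched.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Machine:
    # Hypothetical description of one machine in the pool.
    name: str
    cpus: int
    memory_gb: int
    in_use: bool = False


@dataclass
class Pool:
    machines: List[Machine] = field(default_factory=list)

    def request(self, cpus: int, memory_gb: int) -> Optional[Machine]:
        """Allocate the first free machine that meets the requirements."""
        for machine in self.machines:
            if (not machine.in_use
                    and machine.cpus >= cpus
                    and machine.memory_gb >= memory_gb):
                machine.in_use = True
                return machine
        return None  # no suitable free resource in the pool

    def release(self, machine: Machine) -> None:
        """Return a machine to the pool so others can use it."""
        machine.in_use = False
```

A caller would invoke `request` with its requirements, use the returned machine, and call `release` when done; a `None` result means nothing suitable is currently free.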
Occasionally a provision of a machine does not complete successfully. Possible causes include network issues, faulty hardware, issues specific to an operating system provision, or randomly and unfortunately timed cosmic rays. This disclosure describes a mechanism that performs some initial diagnosis to decide what to do with a failed machine in the provisioning system: while the job goes off to another machine, the current machine stays in the failed state.
By default, if no further action is taken, subsequent provisions will be attempted on the same hardware. If the failure was due to a hardware fault, it is highly likely that the same failure will recur, wasting time on a deployment that is very unlikely to succeed. We need an effective way of preventing re-provisioning on a machine where, based on earlier results, it is unlikely to succeed.
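One simple way to avoid re-provisioning on a machine that has just failed is to keep a per-machine failure record and filter candidates against it. The sketch below is illustrative only: the `FailureHistory` class, the consecutive-failure counter, and the `MAX_CONSECUTIVE_FAILURES` threshold are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, Iterable, List

# Hypothetical threshold: exclude a machine after this many failures in a row.
MAX_CONSECUTIVE_FAILURES = 1


class FailureHistory:
    """Tracks consecutive provision failures per machine name."""

    def __init__(self) -> None:
        self._consecutive: Dict[str, int] = {}

    def record(self, machine: str, succeeded: bool) -> None:
        # A success clears the streak; a failure extends it.
        if succeeded:
            self._consecutive[machine] = 0
        else:
            self._consecutive[machine] = self._consecutive.get(machine, 0) + 1

    def eligible(self, machines: Iterable[str]) -> List[str]:
        """Return the machines whose failure streak is below the threshold."""
        return [
            m for m in machines
            if self._consecutive.get(m, 0) < MAX_CONSECUTIVE_FAILURES
        ]
```

A filter like this only stops the scheduler from repeating a likely failure; it does not say whether the cause was the hardware or the infrastructure, which is the question the rest of the disclosure addresses.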
A more efficient, automated way of identifying if the problem lies in the infrastructure or on the specific hardware is desirable to reduce the amount of time spent in the offline state by any given system.
Typically, if a provision to a machine fails, the machine is taken offline so that it is not re-used immediately, until the provisioning administrator can analyse the problem and attempt to verify whether it is a hardware issue (in which case the hardware can be replaced) or a provisioning system infrastructure issue. If it is an infrastructure issue, the hardware can be brought back online for subsequent attempts.
If the problem is not infrastructure related, the system administrator can perform all required diagnosis manually. Once a provisioning administrator determines that a provision on a piece of hardware has failed, they will attempt to identify whether the fault was due to, for example:
A problem with a specific operating system, which may or may not be specific to the hardware type
Rather than simply leaving the machine to be retried later, or marking it as unavailable, we can go through a set of diagnostic steps in an attempt to narrow down the cause of the failure. This can be done using a systematic set of retries of different provisions on the same hardware, or of the same provision on alternate hardware. Once a failure occurs, control of the machine is handed to a workflow that will run the following scenarios:
Initiate a new provisi...