Auto restart of Virtual Machines post failure of Systems

IP.com Disclosure Number: IPCOM000206869D
Publication Date: 2011-May-12
Document File: 4 page(s) / 92K

Publishing Venue

The IP.com Prior Art Database

Abstract

The main objective of this article is to restore a failed system as quickly as possible using existing resources.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 41% of the total text.

Energy is one of the biggest constraints worldwide today, and power disruptions are not uncommon in any part of the world. System failures can happen for various reasons, and one common reason is power failure. A power failure not only brings down the systems, it also requires considerable human intervention and effort to bring them back to a state as close as possible to the one before the failure.

Existing power back-up mechanisms can help to a certain extent, but given the power outages that do occur, it is always possible that these back-up systems fail as well.

Other common reasons for system failures are hardware failures and software bugs. Even in these cases, the time taken to bring the systems back to a working state is considerable.

The high availability solutions that exist today mainly ensure availability to end customers; they do not resolve the issue of bringing the failed system back in a minimum amount of time and with minimal or zero human intervention.

For example, a company XYZ has a data centre located in Alaska, USA, and a DR (Disaster Recovery) site for it located in Houston, USA. A storm in Alaska causes a power failure that brings down the data centre, and the Houston DR site picks up the load that the Alaska data centre was handling. Unfortunately, the Houston site cannot handle that load indefinitely, so the Alaska data centre must be brought back online in minimal time. The power failure in Alaska lasts a couple of hours and power is then restored, but the data centre does not come online automatically, because the various servers located in it require manual power cycling. This means a system administrator needs to fly to Alaska to bring the systems back online and restore the data. The cost involved is therefore high, including manpower, travel costs, etc.

In other cases, small and medium enterprises may not opt for High Availability or Disaster Recovery solutions. For them, down-time due to power failures is a containable cost, but they want systems and applications to come back up faster after a power failure. With existing technology, bringing the systems back after a power failure, or after an abrupt failure due to other reasons, requires heavy human intervention.

Existing solutions do not address this kind of problem. The majority of High Availability and Disaster Recovery solutions rely on hardware redundancy to solve the availability problem, but no solution exists today to reduce the time needed to bring a failed system back to a working state.

This idea mainly deals with minimising human intervention and reducing the time needed to bring a failed system back to a working state.

Auto power start is a feature available in certain systems; for instance, IBM POWER(TM) systems have this capability. This feature takes effect when the system powers on after an abrupt power failure. During...