
Method and System for Non-Disruptive Movement of Applications for Providing High Availability

IP.com Disclosure Number: IPCOM000190255D
Original Publication Date: 2009-Nov-23
Included in the Prior Art Database: 2009-Nov-23
Document File: 3 page(s) / 97K

Publishing Venue

IBM

Abstract

A method and system for moving applications non-disruptively for providing high availability is disclosed. More specifically, a method and system for moving an application from one node to another node non-disruptively by using workload partitions (WPAR) mobility is provided. The method enables maintenance of high availability of the application upon failure of a resource supporting the application.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.


Disclosed is a method and system for moving an application from one node to another node non-disruptively by using workload partition (WPAR) mobility. To provide high availability for the application, High Availability Cluster Multi-Processing (HACMP) checks for failure in the one or more resources that support it. The resources can be one or more of a volume group (VG) and a file system (FS) associated with a WPAR hosting the application. If HACMP detects a failure in one or more resources, the WPAR is moved to a different node and restarted, enabling the application to continue from the same point where it was stopped on the earlier node.

The method involves identifying a failure of one or more resources, such as a VG and a FS associated with a WPAR hosting the application. In one instance, HACMP is configured to detect the failure of a VG when a return event of "LVM_SA_QUORCLOSE" is observed in an error log of the application. This return event is generated when an input/output (I/O) operation of the application is performed on a failed VG. Upon receipt of the return event, the WPAR hosting the application is moved from the existing node to a different node. However, when an application is moved in this way, existing systems do not allow the application to repeat the instructions that resulted in error events due to the failure of one or more resources. Therefore, even after the application is moved to the different node, the success of each instruction is not assured.
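The detection step described above can be illustrated with a small sketch. It assumes the error log is available as an array of plain text lines and simply looks for the "LVM_SA_QUORCLOSE" label named in the text. The function name vg_failure_logged and the line-based log representation are hypothetical and are not taken from the disclosure or from HACMP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Error label named in the text; logged when I/O hits a failed volume group. */
#define VG_FAILURE_LABEL "LVM_SA_QUORCLOSE"

/*
 * Hypothetical helper: returns 1 if any of the n log lines contains the
 * volume group failure label, and 0 otherwise. The log format is an
 * assumption; each entry is treated as an opaque string.
 */
int vg_failure_logged(const char *const lines[], size_t n)
{
    assert(lines != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (lines[i] != NULL && strstr(lines[i], VG_FAILURE_LABEL) != NULL)
            return 1;   /* failure observed: the WPAR would be relocated */
    }
    return 0;           /* no VG failure reported in these lines */
}
```

A monitoring component could call such a helper on newly appended log lines and trigger relocation of the WPAR when it returns 1.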

Consider execution of instructions corresponding to the following sample application on a WPAR:

1. main() {
2. int fdwr, fdwr2;
3. char buff[]="Test Program";
4. char buff2[]="New Program";
5. fdwr=open("/new_fs",1);
6. write(fdwr,buff, sizeof(buff));
7. write(fdwr,buff2, sizeof(buff2));
8. close(fdwr);
9. }

If a VG fails, the write operation on line number 6 will return an error event "LVM_SA_QUORCLOSE" in the error log. Based on this return event, HACMP moves the WPAR from the existing node to a different node. Once the WPAR is restarted on the different node, the sample application will resume executing from line number 7 onwards. However, the application needs to resume from line number 6, as the execution of the instruction on line number 6 was not successful. The publication discloses a method and system to overcome the above shortcoming so as to handle each of the I/O instructions of the application.
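The requirement that a failed I/O instruction be repeated rather than skipped can be illustrated with a minimal sketch. This is not the mechanism of the disclosure, whose details are not included in this abbreviated text. It only shows, at application level, a write that is re-issued until it succeeds. The names write_fn, flaky_write, injected_failures and write_with_retry are hypothetical, and the fault-injecting stub stands in for a write that fails while the VG is unavailable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical hook with the same shape as write(2), so a stub can replace it. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t len);

/* Number of upcoming calls to flaky_write that should fail (test aid only). */
int injected_failures = 0;

/* Stub that fails with EIO while injected_failures > 0, then reports success. */
ssize_t flaky_write(int fd, const void *buf, size_t len)
{
    (void)fd;
    (void)buf;
    if (injected_failures > 0) {
        injected_failures--;
        errno = EIO;
        return -1;
    }
    return (ssize_t)len;
}

/*
 * Re-issue the same write until the whole buffer has been written or
 * max_attempts failures have occurred. Returns 0 on success, -1 on failure.
 */
int write_with_retry(write_fn do_write, int fd, const void *buf,
                     size_t len, int max_attempts)
{
    const char *p = buf;
    size_t remaining = len;
    int failed = 0;

    assert(do_write != NULL && buf != NULL);
    while (remaining > 0) {
        ssize_t n = do_write(fd, p, remaining);
        if (n <= 0) {
            if (++failed >= max_attempts)
                return -1;      /* give up after too many failures */
            continue;           /* repeat the failed instruction */
        }
        p += n;                 /* advance past the bytes already written */
        remaining -= (size_t)n;
    }
    return 0;
}
```

In the sample application, line 6 would correspond to a call such as write_with_retry(write, fdwr, buff, sizeof(buff), 3), with <unistd.h> included for write. The retry limit of 3 is an arbitrary choice for illustration.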

Fig. 1 illustrates the method and system for non-disruptively moving an application from an existing node to a different node.

Figure 1

Consider an application hosted on WPAR1, in case a static kernel detects fa...