Publication Date: 2017-Feb-10
Method to detect compute host state using storage fabric infrastructure

A data center comprises different components, including hosts, network switches, and storage fabrics. To ensure availability of workloads, data centers typically also employ automatic recovery of virtual machines (VMs) when the compute hosts on which the VMs are running go down. To detect a compute host failure, the controller node validates network connectivity to the host, usually through a heartbeat or another periodic network-based detection mechanism.
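The heartbeat style of detection described above can be sketched as follows. This is a minimal illustration, not the mechanism of any particular controller: the class name HeartbeatMonitor, the 30-second timeout, and the host name used in the usage note are all assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each compute host.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """

    def __init__(self, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}         # host name -> time of last heartbeat

    def record_heartbeat(self, host):
        """Called whenever a heartbeat from `host` reaches the controller."""
        self._last_seen[host] = self._clock()

    def is_suspected_down(self, host):
        """True if no heartbeat arrived within the timeout.

        Note that this only says the controller stopped hearing from the
        host over the network; it cannot tell a dead host from a broken
        network path.
        """
        last = self._last_seen.get(host)
        if last is None:
            return True
        return self._clock() - last > self.timeout_s
```

A controller would call record_heartbeat("host-a") on each heartbeat and poll is_suspected_down("host-a") periodically. A True result means only that the network path went quiet, which is the limitation the next paragraph addresses.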

This method can produce a false positive: when a network component malfunctions, the compute host is deemed down. Any automatic recovery procedure running in the data center could then start recovering resources, e.g. VMs, based on the false positive while the compute host is in fact in good health.

This proposal makes use of the additional managed elements in the data center to confirm that the host is actually down before triggering the recovery operation.
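The idea of confirming a suspected failure against a second managed element can be sketched as a small decision function. This is a sketch under stated assumptions: the names HostState and classify_host are hypothetical, the input is assumed to be the active/inactive state of the host's WWPNs as reported by the FC switches, and the rule that every WWPN must be inactive before the host is declared down is an assumption of this example, not something the disclosure specifies.

```python
from enum import Enum


class HostState(Enum):
    UP = "up"
    SUSPECT = "suspect"   # heartbeat lost, but the failure is not confirmed
    DOWN = "down"         # heartbeat lost and the fabric agrees


def classify_host(heartbeat_lost, wwpn_active):
    """Combine the network signal with the storage fabric signal.

    heartbeat_lost: True if the controller no longer receives heartbeats.
    wwpn_active:    dict mapping each of the host's WWPNs to True/False,
                    as reported by the FC switches (assumed input format).
    """
    if not heartbeat_lost:
        return HostState.UP
    # Assumption: declare the host down only when the fabric reports
    # every one of its WWPNs as inactive.
    if wwpn_active and not any(wwpn_active.values()):
        return HostState.DOWN
    # Heartbeat lost but at least one WWPN is still active, or no fabric
    # data is available: likely a network fault, so do not recover yet.
    return HostState.SUSPECT
```

With this shape, recovery would be triggered only for HostState.DOWN. A host that has lost its heartbeat but still has an active WWPN stays in SUSPECT, which avoids the false positive described above.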

In the data center, connectivity between the management node and the compute nodes is typically TCP/IP based. The compute nodes may in turn be connected to different storage elements via Fibre Channel (FC). The data center management also manages and monitors the storage fabric infrastructure. The FC switches used for the storage connections reflect the state of the endpoints (WWPNs) through which the compute hosts are connected. When a compute host is down, the FC switch shows its WWPN as inactive. The solution tries to utilize the Compute host - > St...