Method To Detect, Monitor And Handle Reliability Issues For FCoE/Ethernet Devices
Publication Date: 2010-Nov-24
The IP.com Prior Art Database
Today's customer environments and networks are undergoing a transformation. With recent FCoE deployments, 10Gb SFP+ Copper cable solutions (1m, 3m, 5m) are being widely deployed in labs worldwide. These cables, also known as Twinax cables, are used for in-rack connections between servers (blades and standalone) and top-of-rack FCoE Fibre Channel Forwarder Switches. The SFP+ Copper Cables differ from traditional optical cables in that the SFP+ is attached directly to the cable, so it is possible for one side of the cable to fail while the other side continues to function properly.
The subject device has the following: a) a time-out timer, used as a temporal sample-window, that is both user-defined and resettable to zero, b) a bounce counter that is resettable to zero, c) a user-defined threshold of bounces to compare the bounce counter against, and d) a way to detect link failures. A bounce condition can be detected in many different ways; this article teaches one such method.
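As a minimal sketch of how the timer, bounce counter, and threshold listed above could interact, the following Python class counts link failures inside a resettable sample window. The class name, the field names, the choice to restart the window on the first failure seen after expiry, and the "greater than or equal to" comparison are all assumptions made for illustration; the article does not specify them. The threshold of three in the usage lines is likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BounceDetector:
    """Counts link failures inside a resettable sample window (illustrative only)."""

    window_seconds: float      # user-defined temporal sample-window
    bounce_threshold: int      # user-defined number of bounces to compare against
    window_start: float = 0.0  # time at which the current window began
    bounce_count: int = 0      # bounce counter, resettable to zero

    def record_link_failure(self, now: float) -> bool:
        """Register one link failure observed at time `now` (in seconds).

        Returns True when the count inside the current window has reached
        the threshold, i.e. the port is considered to be bouncing.
        """
        if now - self.window_start >= self.window_seconds:
            # The window has expired: reset both the timer and the counter.
            self.window_start = now
            self.bounce_count = 0
        self.bounce_count += 1
        # Assumption: reaching the threshold (>=) counts as a bounce condition.
        return self.bounce_count >= self.bounce_threshold


# Usage: a one-second window and a hypothetical threshold of three bounces.
detector = BounceDetector(window_seconds=1.0, bounce_threshold=3)
results = [detector.record_link_failure(t) for t in (10.0, 10.2, 10.4)]
# results == [False, False, True]
```

Passing the timestamp in explicitly, rather than reading a clock inside the method, keeps the sketch deterministic and easy to exercise with recorded event times.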
When a link failure condition is detected by an FCoE device, the device must perform what is known in the art as FCoE initialization. This initialization uses special frames known in the art as FCoE Initialization Protocol (FIP) frames. The inventive device monitors for a certain type of FIP frame called FIP Fabric Login (FIP FLOGI). Once a FIP FLOGI is received, it is known that a link failure has occurred.
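The frame check described above might be sketched as follows. This is an illustration, not the article's implementation. It assumes the frame is a raw Ethernet frame with any VLAN tag already stripped, and it relies on the FIP layout as commonly documented for FC-BB-5: EtherType 0x8914, a 10-byte FIP header, operation code 0x0002 with subcode 0x01 for a virtual link instantiation request, and descriptor type 7 for FLOGI. These constants should be verified against the standard before use. The function name is hypothetical.

```python
import struct

FIP_ETHERTYPE = 0x8914         # EtherType used by FIP frames
FIP_OP_LINK_SERVICE = 0x0002   # virtual link instantiation (assumed per FC-BB-5)
FIP_SUBCODE_REQUEST = 0x01     # request, as opposed to reply
FIP_DESC_FLOGI = 7             # FLOGI descriptor type (assumed per FC-BB-5)

ETH_HEADER_LEN = 14            # destination MAC, source MAC, EtherType (no VLAN tag)
FIP_HEADER_LEN = 10


def is_fip_flogi_request(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame looks like a FIP FLOGI request."""
    if len(frame) < ETH_HEADER_LEN + FIP_HEADER_LEN:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != FIP_ETHERTYPE:
        return False
    # FIP header: version/reserved (2), operation code (2), reserved (1),
    # subcode (1), descriptor list length in 32-bit words (2), flags (2).
    _ver, op, _rsvd, subcode, desc_words, _flags = struct.unpack_from(
        "!HHBBHH", frame, ETH_HEADER_LEN
    )
    if op != FIP_OP_LINK_SERVICE or subcode != FIP_SUBCODE_REQUEST:
        return False
    # Walk the descriptor list; each descriptor starts with a type byte and
    # a length byte expressed in 32-bit words.
    offset = ETH_HEADER_LEN + FIP_HEADER_LEN
    end = min(len(frame), offset + desc_words * 4)
    while offset + 2 <= end:
        desc_type, desc_len_words = frame[offset], frame[offset + 1]
        if desc_type == FIP_DESC_FLOGI:
            return True
        if desc_len_words == 0:
            return False  # malformed descriptor; stop rather than loop forever
        offset += desc_len_words * 4
    return False
```

Each frame for which this function returns True would then be treated as one link failure and fed to the bounce counter.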
Once the bounce condition has been detected, corrective action needs to be performed. How the multi-path driver is made to fail over depends on the type of device hosting the mechanism described in this article. For example, if it runs as part of an Nx port, the multi-path driver could be notified through a simple API call. If it runs in an Ethernet switch, an Internet Control Message Protocol (ICMP) frame could be sent to the Nx port that contains the multi-path driver.
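The choice between the two notification paths can be sketched as a small dispatch function. Everything named here is hypothetical: the `HostRole` enumeration, the `request_failover` function, and the two callables standing in for the multi-path driver's API and for an ICMP sender. The article does not define these interfaces, so they are injected as parameters rather than implemented.

```python
from enum import Enum
from typing import Callable


class HostRole(Enum):
    """Where the detection mechanism is running (hypothetical names)."""

    NX_PORT = "nx_port"
    ETHERNET_SWITCH = "ethernet_switch"


def request_failover(
    role: HostRole,
    bouncing_port: str,
    notify_driver: Callable[[str], None],
    send_icmp_to_nx_port: Callable[[str], None],
) -> str:
    """Trigger fail-over through the path that matches the hosting device.

    Returns a short label naming the path taken, for logging purposes.
    """
    if role is HostRole.NX_PORT:
        # Same host as the multi-path driver: a direct API call suffices.
        notify_driver(bouncing_port)
        return "api"
    if role is HostRole.ETHERNET_SWITCH:
        # Remote from the driver: signal the Nx port over the network.
        send_icmp_to_nx_port(bouncing_port)
        return "icmp"
    raise ValueError(f"unsupported role: {role!r}")
```

Injecting the two callables keeps the decision logic independent of how the driver API or the ICMP transmission is actually realized on a given platform.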
Figure 1 shows the process that detects a bouncing port and the process to fail over to a stable port. In this figure, a timer is used to determine the failure rate per user-specified time (temporal sample-window). Solely for the purposes of an illustrative example, the numerical values of one second may be used as the user-specified time and...