Error recovery using Buffered layer 2 bridge

IP.com Disclosure Number: IPCOM000012726D
Original Publication Date: 2003-May-22
Included in the Prior Art Database: 2003-May-22
Document File: 1 page(s) / 36K

Publishing Venue

IBM

Abstract

This disclosure describes a method for recovering from short-lived errors in an environment where a networking adapter is shared at the link layer by some of the partitions. The recovery is transparent to the upper-layer protocols, whose error recovery mechanisms are more general and therefore lead to a loss of performance.

As logical partitioning becomes popular, the trend is to create a large number of partitions. There may then not be enough I/O slots for every partition, and some partitions must share an I/O adapter through a hosting partition. Typically, the hosting partition switches packets at the link level between the external network and the other partitions. When the hosting partition encounters a temporary error (for example, the Hypervisor is temporarily out of resources) and is unable to switch a packet, it silently drops the packet. The upper-layer protocols then enter congestion control mode, based on the sometimes faulty assumption that congestion in the network is long-lived. The result is a significant loss of performance and end-user satisfaction.

Instead of dropping packets silently, we propose that the hosting partition hold the packet on a special queue. Delivery should be reattempted when another packet is to be sent or after a predetermined amount of time. If the subsequent attempt also fails, the packet may be dropped, on the assumption that the initial error condition is long-lived. The holding and redelivery parameters can be made configurable to suit the needs of the operator. This gives a recovery method for short-lived errors that is transparent to the upper layers, while leaving long-lived errors to the upper-layer protocols.
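
The hold-and-retry behavior described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the class name BufferedBridge, the forward callback (assumed to return False on a temporary error), the hold_time and max_held parameters, and the choice to drop a packet when the hold queue is full are all assumptions made for illustration. Time is passed in explicitly so the behavior is easy to follow.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Tuple


@dataclass
class BufferedBridge:
    """Hypothetical hold-and-retry wrapper around a link-level forward call."""

    forward: Callable[[bytes], bool]  # assumed: returns False on a temporary error
    hold_time: float = 0.05           # assumed: seconds before a timed retry
    max_held: int = 64                # assumed: hold-queue capacity
    held: Deque[Tuple[float, bytes]] = field(default_factory=deque)
    dropped: List[bytes] = field(default_factory=list)

    def _retry_held(self, now: float, only_expired: bool) -> None:
        """Reattempt held packets once each, oldest first."""
        while self.held:
            held_at, packet = self.held[0]
            if only_expired and now - held_at < self.hold_time:
                break  # oldest packet has not waited long enough yet
            self.held.popleft()
            if not self.forward(packet):
                # Second failure: treat the error as long-lived and drop,
                # leaving recovery to the upper-layer protocols.
                self.dropped.append(packet)

    def send(self, packet: bytes, now: float) -> None:
        """Switch a packet; a new packet also triggers a retry of held ones."""
        self._retry_held(now, only_expired=False)
        if self.forward(packet):
            return
        if len(self.held) < self.max_held:
            self.held.append((now, packet))  # hold instead of dropping silently
        else:
            self.dropped.append(packet)      # assumed policy when the queue is full

    def tick(self, now: float) -> None:
        """Timer path: retry packets held for at least hold_time."""
        self._retry_held(now, only_expired=True)
```

In this sketch a packet is reattempted exactly once, either when the next packet arrives through send or when a periodic tick finds that hold_time has elapsed. Retrying held packets before the new one keeps delivery in order, which is a design choice of the sketch and not something the disclosure specifies.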

Disclosed by International Busines...