
Fault localisation on SAS networks via expander intelligence

IP.com Disclosure Number: IPCOM000244384D
Publication Date: 2015-Dec-08
Document File: 3 page(s) / 52K

Publishing Venue

The IP.com Prior Art Database

Abstract

Improving fault detection and localisation on SAS networks with increased expander functionality

This is the abbreviated version, containing approximately 40% of the total text.


A typical SAS network consists of an initiator device connected to one or more endpoints via intermediate expander devices. The initiator does not communicate extensively with expanders, which serve primarily to manage the routing of connections to the endpoints. If a hardware fault develops on the network, the initiator, an endpoint, or any expander or interconnection that sits between the two could be the cause. To fix the fault, the broken piece of hardware needs to be replaced or repaired. To minimise disruption and repair costs, it is advantageous to be able to isolate the fault to the smallest possible subset of components. This disclosure proposes extensions to the behaviour of expanders in order to localise faults more rapidly, and to a smaller subset of components.

Expanders traditionally do not support the SCSI read/write commands used to read data from and write data to drives, nor do they act as an initiator device on the SAS network. This disclosure details two separate classes of extension to the expander functionality.
1 - An expander is able to respond to SCSI read/write commands. An expander responds to a read request by returning data that is hardcoded as a function of the LBA of the read request, for example all zeros for LBA 0. An expander responds to a write request by comparing the written data to that which it would have returned for a read of the same LBA, and reports an error on mismatch.
2 - An expander is able to isolate a phy from the SAS network and then issue SCSI read requests through that phy, acting as its own initiator.
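The first class of extension can be illustrated with a minimal Python sketch. The disclosure only states that read data is a hardcoded function of the LBA and that LBA 0 might read as all zeros; the specific pattern used here (the low byte of the LBA repeated across the block), the 512-byte block size, and the function names are all assumptions made for illustration.

```python
BLOCK_SIZE = 512  # assumed logical block size in bytes


def expected_block(lba: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Return the fixed pattern for one block.

    Assumed pattern: the low byte of the LBA, repeated. This makes
    LBA 0 read as all zeros, matching the example in the text.
    """
    return bytes([lba & 0xFF]) * block_size


def handle_read(lba: int, num_blocks: int = 1) -> bytes:
    """Emulate a read: return the pattern for each requested block."""
    return b"".join(expected_block(lba + i) for i in range(num_blocks))


def handle_write(lba: int, data: bytes) -> bool:
    """Emulate a write: nothing is stored, the data is only checked.

    Returns True when the data equals what a read of the same LBAs
    would return, and False (an error) on any mismatch.
    """
    if len(data) == 0 or len(data) % BLOCK_SIZE != 0:
        return False
    return data == handle_read(lba, len(data) // BLOCK_SIZE)
```

Because the expected data is derived from the LBA alone, both ends can verify a transfer without the expander needing any storage, so a mismatch points at the path rather than at stored data.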

This new functionality would be managed via the SES interface supported by many SAS expanders. The initiator can use it for fault localisation either automatically, when observed faults exceed a threshold, or manually, for example when requested by a service engineer.

The automatic fault localisation process would involve issuing reads and writes to expanders using any unused bandwidth, and requesting expanders to issue reads to endpoints that are not currently required, such as spare drives. This could be done at all times as a proactive diagnostic tool, or in response to a first error, or error threshold, in order to rapidly build up sufficient data to call out hardware or give the all clear. Manual control of this functionality would give service engineers the ability to test the connection between individual phys, even where the test may cause disruption.
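One way the collected test results might be reduced to a hardware call-out is sketched below. The disclosure does not specify how results are combined; this sketch assumes a single persistent fault, treats every component on a passing path as cleared, and intersects the failing paths. The component names and the `localise` and `threshold_exceeded` helpers are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# One test result: (components traversed by the test, whether it passed).
Result = Tuple[Tuple[str, ...], bool]


def threshold_exceeded(error_count: int, threshold: int) -> bool:
    """Trigger localisation once observed faults exceed the threshold."""
    return error_count > threshold


def localise(results: Iterable[Result]) -> Set[str]:
    """Return the suspect components, or an empty set for "all clear".

    Assumes one persistent fault: a component on any passing path is
    cleared, and the fault must lie on every failing path.
    """
    suspects: Optional[Set[str]] = None
    cleared: Set[str] = set()
    for path, passed in results:
        if passed:
            cleared.update(path)
        elif suspects is None:
            suspects = set(path)
        else:
            suspects &= set(path)
    if suspects is None:
        return set()
    return suspects - cleared
```

For example, if a read to the first expander passes over `("initiator", "link0", "exp0")` and a read to the second fails over `("initiator", "link0", "exp0", "link1", "exp1")`, the suspects narrow to `link1` and `exp1`. An intermittent fault would violate the assumption, since a passing test would then wrongly clear the faulty part.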

An initiator has knowledge of the layout of a SAS fabric, such as which expanders lie between it and any endpoint. It also knows the volume of traffic directed at endpoints, and hence flowing through each expander. If traffic from the initiator does not saturate the initiator's available bandwidth, the initiator can initiate background IO to expanders. This IO would preferentially be directed towards the expanders the greatest number of hops away...
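The target preference described so far can be sketched as a simple ordering. This covers only what the text above states (spare bandwidth is required, and the most distant expanders are preferred); the `background_targets` function, its parameters, and the name-based tie-break are assumptions made for illustration.

```python
from typing import Dict, List


def background_targets(
    hops: Dict[str, int], used_bw: float, available_bw: float
) -> List[str]:
    """Order expanders for background IO, most distant first.

    hops maps each expander name to its hop count from the initiator.
    Returns an empty list when the initiator has no spare bandwidth.
    """
    if used_bw >= available_bw:
        return []
    # Sort by descending hop count; break ties by name for stable output.
    return sorted(hops, key=lambda name: (-hops[name], name))
```

With `hops = {"exp0": 1, "exp1": 2, "exp2": 3}` and spare bandwidth available, the ordering is `exp2`, `exp1`, `exp0`. IO sent to the most distant expander also crosses every expander and link on the way to it.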