Link Error Statics Gathering Mechanism

IP.com Disclosure Number: IPCOM000010985D
Original Publication Date: 2003-Feb-06
Included in the Prior Art Database: 2003-Feb-06
Document File: 2 page(s) / 44K

Publishing Venue

IBM

Abstract

Serial optical communications link errors are typically monitored using interrupts to microprocessors. This works well when a microprocessor is dedicated to the link, but in hardware state machine driven links (InterSystem Channel (ISC) used in zSeries z900 processors and InfiniBand channel adapters and switches), a new approach is required. This invention describes hardware state machines required to gather link error statistics with minimal processor involvement.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 49% of the total text.

Link Error Statics Gathering Mechanism

In serial links such as ESCON and ISC that use the 8B/10B code, the first indication of a bit error is a code violation (CV). If the bit error is in a frame, it is either recognized immediately (as a CRC error in the information field) or detected later as a message timeout condition. In either case, error recovery mechanisms are invoked, and the damaged frame (or operation) is retried. Bit errors in either the idle sequence or a continuous sequence (another type of idle sequence) do not damage frames, but they must still be detected and tracked to determine the overall link error performance (bit error rate).
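
As a rough illustration of how tracked error counts relate to the bit error rate, the sketch below divides an error count by the number of bits transferred in the observation interval. The function name, the 1 Gbit/s line rate and the one hour interval are hypothetical values chosen for the example, and treating each CV as exactly one bit error is a simplifying assumption, not something stated in this disclosure.

```python
def estimate_bit_error_rate(error_count: int, line_rate_bps: float, interval_s: float) -> float:
    """Estimate the bit error rate as errors divided by bits transferred.

    Assumption: each counted code violation stands for one bit error.
    """
    if line_rate_bps <= 0 or interval_s <= 0:
        raise ValueError("line rate and interval must be positive")
    if error_count < 0:
        raise ValueError("error count cannot be negative")
    return error_count / (line_rate_bps * interval_s)


# Hypothetical example: 3 code violations seen in one hour on a 1 Gbit/s link.
ber = estimate_bit_error_rate(error_count=3, line_rate_bps=1.0e9, interval_s=3600.0)
print(f"estimated BER: {ber:.3e}")
```

Because idle sequences carry most of the traffic on a lightly utilized link, leaving their errors out of the count would make the estimate look better than the link really is.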

In many lightly utilized links such as ISC, most bit errors occur in idle sequences. In previous ISC implementations, these harmless bit errors caused interrupts to a dedicated microprocessor, which collected the bit error statistics. In the new ISC implementation (ISC3), there is no local microprocessor dedicated to each link, and a processor (the SAP, Service Assist Processor) is used for all maintenance functions. What is needed is a hardware state machine that can collect bit error information without interrupting the processor each time a bit error is detected. This invention performs that function.

Detailed Implementation

There are two types of bit error events. The first type is the isolated bit error, which is caused by a noise event. The second type is a burst of bit errors, which may indicate that the receiver has lost bit/byte/word synchronism. If synchronism is re-established within 100 milliseconds, a temporary loss of sync (TLOS) event is recognized. If the loss of sync condition persists for more than 100 milliseconds, a link failure condition is recognized, and the processor is interrupted immediately to handle the error. In this failure case, the outage is long enough for the physical connection to have been changed, so the processor must bring up the link from scratch, verifying who is connected to the other end.

The figure illustrates the hardware state machine used to capture bit error information. To configure the hardware, the software executes a memory mapped I/O store instruction called the CV Reporting Command. This command loads the ENTRY DELAY register with a value that represents the amount of time the hardware waits between detecting the first subsequent CV or TLOS event and making an entry in the CV STATISTICS FIFO (or simply FIFO). When the ENTRY DELAY value is set to zero, detection of a CV or TLOS event causes an immediate FIFO entry.
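
The 100 millisecond rule and the ENTRY DELAY behavior described above can be sketched in software as follows. This is a simplified model, not the hardware design: the function names and the abstract time units are hypothetical, the text does not say how a loss of sync lasting exactly 100 milliseconds is treated (it is classed as TLOS here), and the sketch assumes that events arriving while a delay is running are folded into the pending FIFO entry.

```python
LOSS_OF_SYNC_LIMIT_MS = 100  # limit taken from the text


def classify_sync_loss(duration_ms: float) -> str:
    """Classify a loss of sync by how long it lasted.

    Assumption: a duration of exactly 100 ms counts as TLOS.
    """
    return "TLOS" if duration_ms <= LOSS_OF_SYNC_LIMIT_MS else "LINK_FAILURE"


def fifo_entry_times(event_times, entry_delay):
    """Return the times at which FIFO entries would be made.

    The first event starts a delay window and the entry is made when the
    window ends. Events inside a running window add no entry of their own
    (assumption); the next event after the window starts a new one.
    """
    entries = []
    window_end = None
    for t in sorted(event_times):
        if window_end is not None and t < window_end:
            continue  # folded into the entry that is already pending
        window_end = t + entry_delay
        entries.append(window_end)
    return entries


print(classify_sync_loss(40))                    # short outage
print(classify_sync_loss(250))                   # long outage
print(fifo_entry_times([0, 3, 10, 25], 10))      # delayed, batched entries
print(fifo_entry_times([1, 2, 5], 0))            # zero delay: one entry per event
```

A longer delay trades timeliness for fewer FIFO entries during a burst, which is the point of keeping the processor out of the path for every individual error.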
The CV Reporting Command also sets the WRITE POINTER THRESHOLD register. When the WRITE POINTER reaches the value in the WRITE POINTER THRESHOLD register, an interrupt is generated to the processor. Finally, the CV Reporting Command resets the CV COUNT, TLOS COUNT, WRITE POINTER and READ POINTER to zero and enables or disables processor interrupts. To read the FIFO, the processor executes a memor...
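
The register behavior attributed to the CV Reporting Command above can be modeled with a small software sketch. The class and method names are hypothetical, and several details are assumptions because this abbreviated text does not give them: each FIFO entry is taken to be a snapshot of the running CV COUNT and TLOS COUNT, the command is assumed to clear the FIFO contents and any pending interrupt, and the ENTRY DELAY value is stored but its timing is not modeled. Reading the FIFO is left out because the description of the read operation is cut off.

```python
class CvStatisticsModel:
    """Software model of the CV statistics registers (illustrative only)."""

    def __init__(self):
        self.cv_reporting_command(0, 0, False)

    def cv_reporting_command(self, entry_delay, write_pointer_threshold, interrupts_enabled):
        # Values loaded by the command, as described in the text.
        self.entry_delay = entry_delay  # stored only; timing is not modeled
        self.write_pointer_threshold = write_pointer_threshold
        self.interrupts_enabled = interrupts_enabled
        # Counters and pointers reset by the command, as described in the text.
        self.cv_count = 0
        self.tlos_count = 0
        self.write_pointer = 0
        self.read_pointer = 0
        # Assumption: FIFO contents and pending interrupt are also cleared.
        self.fifo = []
        self.interrupt_pending = False

    def record_cv(self):
        self.cv_count += 1

    def record_tlos(self):
        self.tlos_count += 1

    def make_entry(self):
        # Assumption: an entry is a snapshot of the two running counts.
        self.fifo.append((self.cv_count, self.tlos_count))
        self.write_pointer += 1
        # Interrupt when the write pointer reaches the threshold.
        if self.interrupts_enabled and self.write_pointer == self.write_pointer_threshold:
            self.interrupt_pending = True


model = CvStatisticsModel()
model.cv_reporting_command(entry_delay=0, write_pointer_threshold=2, interrupts_enabled=True)
model.record_cv()
model.make_entry()
print(model.interrupt_pending)  # one entry, threshold of two not yet reached
model.record_tlos()
model.make_entry()
print(model.interrupt_pending)  # write pointer has reached the threshold
```

With a threshold greater than one, the processor is interrupted once per batch of entries rather than once per error event.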