
Nonintrusive Selective Communications Trace Function over InfiniBand

IP.com Disclosure Number: IPCOM000010133D
Original Publication Date: 2002-Oct-24
Included in the Prior Art Database: 2002-Oct-24
Document File: 3 page(s) / 42K

Publishing Venue

IBM

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 37% of the total text.



Disclosed is a system for tracing data traffic across a communications channel running over a specific connection (known as a Queue Pair or QP) across an InfiniBand (IB) link. The trace is performed at full speed with no performance impact on the data traffic on the link, and it is noninvasive: the act of tracing does not change the behavior or timings of the system being examined. The trace facility imposes no changes or additional processing requirements on the host communications stack. The trace function also requires no additional debug hardware or preparation when tracing is enabled, because the trace facility is implemented as part of the base product, eliminating the need for an external tracing product. Tracing is selectively enabled on a per-QP basis and records key information about each message along with a timestamp and a programmable amount of the message's data payload. The timestamp enables correlation between the send and receive sides of a given QP and also allows the timing relationship of events occurring on different QPs to be examined.
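The per-QP selectivity, the programmable payload capture length, and the timestamp-based correlation described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the disclosure implements tracing in adapter hardware, and the names `TraceEntry`, `QpTracer`, and `merged_timeline`, as well as the timestamp units, are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One traced message (hypothetical layout, not the adapter's format)."""
    qp: int          # Queue Pair number the message belongs to
    direction: str   # "send" or "recv"
    timestamp: int   # adapter clock ticks (units are an assumption)
    length: int      # full message length in bytes
    payload: bytes   # only the first capture_len bytes are kept


class QpTracer:
    """Software model of tracing that is enabled selectively per QP."""

    def __init__(self):
        self.capture_len = {}  # qp -> number of payload bytes to keep
        self.entries = []

    def enable(self, qp, capture_len):
        self.capture_len[qp] = capture_len

    def disable(self, qp):
        self.capture_len.pop(qp, None)

    def observe(self, qp, direction, timestamp, message):
        """Record a message if its QP is traced; otherwise do nothing."""
        limit = self.capture_len.get(qp)
        if limit is None:
            return None
        entry = TraceEntry(qp, direction, timestamp, len(message), message[:limit])
        self.entries.append(entry)
        return entry


def merged_timeline(tracer):
    """Order events from all traced QPs by timestamp for correlation."""
    return sorted(tracer.entries, key=lambda e: e.timestamp)
```

Sorting on the shared timestamp is what lets events on different QPs be placed on one timeline; in this model the payload limit is chosen per QP when tracing is enabled.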

While this description envisions tracing a LAN connection, the trace facility is general purpose. Tracing does not depend on the connection being a LAN connection, and any type of QP (e.g., a QP running a storage protocol such as SCSI RDMA Protocol (SRP)) can also be traced with no changes required to the trace facility.

The trace system is described in two parts. The first part describes the component pieces necessary for the invention. The second part describes the sequence of events necessary to perform a trace utilizing the component pieces described in the first part. The system is described assuming that it is implemented in an adapter which acts as an IB to PCI-X bridge; however, this is for example purposes only and usage is not limited to only such an adapter.

Part 1 - Components

A. An IB engine (e.g., IB to PCI-X) which uses a wrapping queue of command blocks. The command blocks are built as part of normal hardware processing of IB messages and include information on the command, timestamps (begin, end), the first 4KB of data payload (other implementations might capture a different amount), and complete message headers for unreliable connections. The command blocks are small, which allows a significant number of them per QP without requiring a large adapter memory. There is a separate wrapping queue of command blocks for each QP. The IB engine automatically uses command blocks from the queue as needed using head/tail pointers, with no firmware intervention required. The current contents of the head/tail pointers are readable by firmware. When the IB engine processes a message, the first 4KB of the message is stored into the command block. If the message is longer than this, then the remainder is stored in other transient memory locations in the adapter prior to being trans...
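
The per-QP wrapping queue of command blocks in component A can be modeled in a few lines. This is a minimal sketch, not the adapter's actual design: the class and method names are hypothetical, and the meaning assigned to the pointers (head as the next block the engine fills, tail as the oldest block still holding valid data) is an assumption, since the text only states that head/tail pointers exist and are readable by firmware. The 4KB capture limit is taken from the text.

```python
PAYLOAD_LIMIT = 4096  # from the text: first 4KB of each message is kept


class CommandBlock:
    """Hypothetical command block: command, begin/end timestamps, payload."""

    def __init__(self):
        self.command = None
        self.t_begin = None
        self.t_end = None
        self.payload = b""


class WrappingQueue:
    """One fixed-depth wrapping queue of command blocks for a single QP."""

    def __init__(self, depth):
        self.blocks = [CommandBlock() for _ in range(depth)]
        self.head = 0   # assumed: next block the engine will fill
        self.tail = 0   # assumed: oldest block still holding valid data
        self.count = 0  # number of valid blocks currently in the queue

    def engine_record(self, command, t_begin, t_end, message):
        """Model the engine filling the next block with no firmware help."""
        blk = self.blocks[self.head]
        blk.command, blk.t_begin, blk.t_end = command, t_begin, t_end
        blk.payload = message[:PAYLOAD_LIMIT]
        self.head = (self.head + 1) % len(self.blocks)
        if self.count == len(self.blocks):
            # Queue was full, so the oldest entry was just overwritten.
            self.tail = (self.tail + 1) % len(self.blocks)
        else:
            self.count += 1

    def firmware_snapshot(self):
        """Model firmware reading head/tail and walking blocks oldest first."""
        depth = len(self.blocks)
        return [self.blocks[(self.tail + i) % depth] for i in range(self.count)]
```

Because the queue wraps, the trace always holds the most recent `depth` messages for the QP, which is why small command blocks allow useful history without a large adapter memory.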