Improve RDMA Bandwidth for Unaligned Buffers

IP.com Disclosure Number: IPCOM000242526D
Publication Date: 2015-Jul-22
Document File: 2 page(s) / 69K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a protocol that, in the event of a Remote Direct Memory Access (RDMA) data transfer request, checks the local user data buffer alignment. Proper cache alignment ensures good performance.

This is the abbreviated version, containing approximately 51% of the total text.

Improve RDMA Bandwidth for Unaligned Buffers

Cache alignment is essential to good performance. To accommodate this, one approach was to build a new Parallel Active Message Interface (PAMI) library in which the first Remote Direct Memory Access (RDMA) transfer is a message up to the cache line boundary. All the following data transfers are cache line aligned on the local side (the side initiating the RDMA transfer). Verbs testing suggests that local side alignment is sufficient to achieve good performance, even when the remote side does not have the buffer cache line aligned. This statement applies to RDMA READ and RDMA WRITE.
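
The split described above (a short first transfer up to the cache line boundary, followed by aligned transfers) can be sketched as follows. This is a minimal illustration, not the PAMI implementation: the function name `split_at_cache_line` and the 128-byte line size are assumptions, since the disclosure names neither.

```python
CACHE_LINE = 128  # assumed cache line size in bytes; not stated in the disclosure


def split_at_cache_line(local_addr: int, length: int, line: int = CACHE_LINE):
    """Return (head_len, body_len) for a transfer that starts at local_addr.

    head_len is the number of bytes needed to reach the next cache line
    boundary on the local side; body_len is the remainder, which then
    starts on a cache line boundary.
    """
    if local_addr < 0 or length < 0 or line <= 0:
        raise ValueError("address and length must be >= 0, line must be > 0")
    misalign = local_addr % line
    # An already aligned buffer needs no head transfer. A short message may
    # end before the boundary, in which case the head is the whole message.
    head_len = 0 if misalign == 0 else min(line - misalign, length)
    return head_len, length - head_len
```

With a local buffer at address `0x1004` and a 4096-byte message, the sketch yields a 124-byte head and a 3972-byte aligned body.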

Verbs bandwidth (BW) experiments show that BW reaches its optimal value for buffers that are cache line aligned on the local side, for any alignment on the remote side that is a multiple of four bytes (i.e., the size of a single-precision integer).
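
The condition reported by these experiments can be written as a simple predicate. The sketch below is illustrative only: the function name and the 128-byte default line size are assumptions, and the result reflects the disclosure's reported observation rather than a guarantee for any particular adapter.

```python
def meets_reported_alignment(local_addr: int, remote_addr: int, line: int = 128) -> bool:
    """True when the addresses match the case the experiments found optimal:
    local buffer cache line aligned, remote buffer aligned to four bytes.

    `line` is an assumed cache line size; the disclosure does not state one.
    """
    if line <= 0:
        raise ValueError("line must be > 0")
    local_aligned = local_addr % line == 0
    remote_word_aligned = remote_addr % 4 == 0
    return local_aligned and remote_word_aligned
```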

This issue was previously addressed by taking advantage of the rendezvous messages. The initiating task of a PUT operation issues a control message through first-in first-out (FIFO) to the target task and piggybacks the unaligned data in the control message. Once the target task receives the control message, it copies the piggybacked data to the correct location and then issues an RDMA READ to get the rest of the data. Similarly, the initiating task of a GET operation issues a control message through FIFO to the target task. Once the target task receives the control message, it issues an RDMA WRITE for the aligned data and then sends an extra FIFO message back to the initiating task for the unaligned data.
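
The prior PUT flow can be modeled in a few lines to make the division of work visible. This is a toy, in-memory sketch under stated assumptions: no real FIFO or RDMA calls are made, the control message is a plain dictionary, the name `rendezvous_put` is hypothetical, the line size is assumed to be 128 bytes, the target buffer is assumed to start on a cache line boundary, and the piggybacked length is assumed to be computed against the target's buffer (the side that issues the RDMA READ).

```python
CACHE_LINE = 128  # assumed cache line size in bytes; not stated in the disclosure


def rendezvous_put(src: bytes, target: bytearray, target_off: int,
                   line: int = CACHE_LINE) -> int:
    """Toy model of the prior PUT flow; returns the number of piggybacked bytes."""
    if target_off < 0 or target_off + len(src) > len(target):
        raise ValueError("transfer does not fit in the target buffer")

    # Initiating task: build the control message and piggyback the bytes
    # that precede the next cache line boundary.
    misalign = target_off % line
    head_len = 0 if misalign == 0 else min(line - misalign, len(src))
    control_msg = {
        "offset": target_off,
        "length": len(src),
        "piggyback": src[:head_len],
    }  # stands in for the message sent through the FIFO

    # Target task (CPU involved): copy the piggybacked data into place.
    off = control_msg["offset"]
    head = control_msg["piggyback"]
    target[off:off + len(head)] = head

    # Target task: stand-in for the RDMA READ of the remaining, aligned data.
    rest_off = off + len(head)
    target[rest_off:off + control_msg["length"]] = src[len(head):]
    return len(head)
```

Both steps after the control message run on the target task, which is the CPU involvement the model is meant to highlight.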

The problem with the described prior solution is the requirement for the target task's Central Processing Unit (CPU) to be involved to process the control message. It defeats the purpose of using RDMA to overlap communication and computation (without getti...