Method for efficient TCP processing by segment coalescing

IP.com Disclosure Number: IPCOM000124599D
Original Publication Date: 2005-Apr-29
Included in the Prior Art Database: 2005-Apr-29
Document File: 3 page(s) / 27K

Publishing Venue

IBM

Abstract

A technique is described to improve receive-packet processing efficiency with conventional network adapters, which provide only limited performance enhancements for received TCP/IP traffic. TCP/IP processing is dominated by header processing and data movement. Because the CPU is shared between applications and packet processing, this overhead adversely affects application performance. The header-processing overhead can be reduced by coalescing TCP segments at the driver level. The technique does not require end-to-end support, and it increases CPU availability for applications, thus boosting performance.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.

    Described is a technique to boost the performance of TCP/IP processing and improve application performance on computers. TCP/IP processing consumes many processor cycles because of its long code path, interrupt processing, and data movement. One way to reduce the processor cycles spent on receive traffic is to combine smaller TCP segments at the driver level before stack processing. Because the stack is traversed fewer times, more processor cycles are available for applications. This technique is referred to as segment coalescing. In addition, different techniques, either fixed or adaptive, can be used to determine the number of packets to coalesce.
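The saving from traversing the stack fewer times can be illustrated with simple arithmetic. The following is a minimal sketch, not code from the disclosure: it assumes a fixed coalescing count and a single in-order TCP stream, and the function name `stack_traversals` and the workload of 64 segments are hypothetical.

```python
import math


def stack_traversals(num_segments: int, coalesce_count: int) -> int:
    """Return how many times the TCP/IP stack is entered when up to
    `coalesce_count` received segments are merged before each hand-off.

    Assumption: every segment belongs to one in-order stream, so a full
    group can always be formed except possibly for the last one.
    """
    if num_segments < 0 or coalesce_count < 1:
        raise ValueError("need num_segments >= 0 and coalesce_count >= 1")
    return math.ceil(num_segments / coalesce_count)


# Hypothetical workload: 64 received segments.
baseline = stack_traversals(64, 1)   # no coalescing: one traversal per segment
coalesced = stack_traversals(64, 8)  # fixed policy: 8 segments per traversal
print(baseline, coalesced)           # prints "64 8"
```

An adaptive policy would vary `coalesce_count` at run time; the disclosure mentions that option, but the visible text does not describe how the count would be chosen.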

The advantages of the invention are:
1. A significant performance increase compared with both standard frame handling and Jumbo frames.
2. Unlike Jumbo frames, the performance improvement holds across all block sizes.
3. The design requires changes only at the driver level. No changes are required at the NIC, TCP/IP stack or application level.
4. The invention is transparent in operation and, unlike Jumbo frames, does not require end-to-end support. (Jumbo frames require end-to-end support because they are not part of the IEEE Ethernet standard.) It does not require any special switches or routers.
5. It does not incur additional memory operations for coalescing the packets.
6. No additional hardware cost is incurred at the NIC level because all the enhancements are done at the driver level.

    In this design, a stream bucket is a logical queue consisting of packets belonging to the same TCP connection. A stream TCP/IP buffer is a large buffer that can accommodate a fixed number of 1500-byte TCP/IP packets; its size is determined by the number of segments to be coalesced. The step numbers refer to the figure. The high-level design is as follows:

    - The NIC receives a packet (step 1), DMAs the data into the device driver's buffer (steps 2 and 5), and interrupts the CPU. The "fast" interrupt handler then runs and schedules the "slow" handler (step 3).

    - The slow handler examines the packet and builds a 5-tuple key consisting of the source...
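The stream-bucket idea can be sketched as follows. This is a hedged illustration under stated assumptions, not the disclosed driver code. The key layout is an assumption: the source text is truncated after "source", so the fields of `FlowKey` follow the conventional TCP/IP 5-tuple. The sequence-number contiguity check, the flush-when-full policy, and all names (`FlowKey`, `Packet`, `Coalescer`) are likewise hypothetical. The sketch copies payloads for clarity, whereas the disclosure states that coalescing incurs no additional memory operations.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    # Assumed layout: the conventional 5-tuple. The source text is
    # truncated here, so these fields are an assumption.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int


class Packet(NamedTuple):
    key: FlowKey
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes


class Coalescer:
    """Queues packets in per-connection stream buckets and hands one
    merged segment to the stack once a bucket holds `coalesce_count`
    contiguous segments (a fixed policy, assumed for illustration)."""

    def __init__(self, coalesce_count: int):
        self.coalesce_count = coalesce_count
        self.buckets = defaultdict(list)  # FlowKey -> queued packets

    def receive(self, pkt: Packet) -> list:
        """Queue one packet; return the segments delivered to the stack."""
        delivered = []
        bucket = self.buckets[pkt.key]
        if bucket and pkt.seq != bucket[-1].seq + len(bucket[-1].payload):
            # Not contiguous with the queued data: deliver it first.
            delivered.append(self.flush(pkt.key))
            bucket = self.buckets[pkt.key]
        bucket.append(pkt)
        if len(bucket) >= self.coalesce_count:
            delivered.append(self.flush(pkt.key))
        return delivered

    def flush(self, key: FlowKey) -> Packet:
        """Merge a bucket into one large segment and empty the bucket."""
        bucket = self.buckets.pop(key)
        merged = b"".join(p.payload for p in bucket)  # copy, for clarity only
        return Packet(key, bucket[0].seq, merged)


key = FlowKey("10.0.0.1", 40000, "10.0.0.2", 80, 6)
coalescer = Coalescer(coalesce_count=3)
out = []
for i in range(3):
    out += coalescer.receive(Packet(key, 1000 + i * 4, b"abcd"))
print(len(out), len(out[0].payload))  # prints "1 12"
```

Three 4-byte segments enter the driver, but the stack sees a single 12-byte segment, so it is traversed once instead of three times.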