Method for Optimizing DMA Translation Performance through Multiple I/O Page Sizes within a Single Translation Table
Publication Date: 2015-Apr-09
The IP.com Prior Art Database
Described is a method for optimizing DMA translation performance through multiple I/O page sizes within a single translation table.
Current implementations of I/O translation tables allow the hardware to be configured with one of several supported I/O mapping page sizes, such as 4K, 64K, or 256MB. However, operating systems and device drivers typically perform DMA operations on data of varying sizes, which results in two inefficiencies:
a.) If the I/O mapping page size is small, for example 4K, then larger DMAs require multiple I/O translation table entries to be set. These multiple entries can cause I/O translation cache thrashing and increase the latency of the I/O operation, because the hardware must fetch each I/O translation from platform memory.
b.) If the I/O mapping page size is large, then the operating system loses the protection of the I/O translation table, because the DMA hardware is allowed to DMA to and from platform memory outside the intended DMA buffer.
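The two inefficiencies can be put in numbers with a small software model. The sketch below is illustrative only: the function names entries_needed and exposed_bytes are hypothetical and do not come from any hardware or firmware interface. It assumes a table with a single fixed page size.

```python
def entries_needed(io_addr: int, length: int, page_size: int) -> int:
    """Number of fixed-size translation entries a DMA buffer touches."""
    first_page = io_addr // page_size
    last_page = (io_addr + length - 1) // page_size
    return last_page - first_page + 1


def exposed_bytes(io_addr: int, length: int, page_size: int) -> int:
    """Bytes that are mapped for DMA but lie outside the intended buffer."""
    return entries_needed(io_addr, length, page_size) * page_size - length


KIB = 1024
MIB = 1024 * KIB

# Case (a): a 1 MiB DMA with 4K pages needs many entries.
small_page_entries = entries_needed(0, 1 * MIB, 4 * KIB)

# Case (b): an 8 KiB DMA with a 64K page leaves most of the page unprotected.
large_page_exposure = exposed_bytes(0, 8 * KIB, 64 * KIB)
```

Under these assumptions, the 1 MiB transfer occupies 256 entries at 4K, while the 8 KiB transfer mapped with one 64K entry exposes 56 KiB of memory that the device was never meant to reach.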
This invention addresses these inefficiencies by describing a hardware and firmware implementation which allows for a single I/O translation table to support multiple I/O page mapping sizes.
Each I/O translation table entry would contain a field that describes the I/O page mapping size. The I/O page mapping size would also be included in the information stored in the I/O translation cache located within the hardware. The DMA hardware would interrogate this field within cached translation entries to determine whether an I/O address falls within the memory range mapped by that translation entry.
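The range check described above can be modeled in software as follows. This is a minimal sketch, not the hardware design: the names TranslationEntry, covers, translate, and cache_lookup are hypothetical, the field layout is an assumption, and a real translation cache would not search linearly.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TranslationEntry:
    io_base: int    # I/O address where the mapped page starts (assumed aligned)
    phys_base: int  # platform memory address the page maps to
    page_size: int  # per-entry I/O page mapping size, in bytes

    def covers(self, io_addr: int) -> bool:
        """True if io_addr falls inside the range mapped by this entry."""
        return self.io_base <= io_addr < self.io_base + self.page_size

    def translate(self, io_addr: int) -> int:
        """Platform memory address for an I/O address this entry covers."""
        return self.phys_base + (io_addr - self.io_base)


def cache_lookup(cache: Iterable[TranslationEntry], io_addr: int) -> Optional[int]:
    """Return the translated address on a cache hit, or None on a miss."""
    for entry in cache:
        if entry.covers(io_addr):
            return entry.translate(io_addr)
    return None  # miss: the hardware would fetch the entry from platform memory


# One 64K entry and one 4K entry held in the same cache.
cache = [
    TranslationEntry(io_base=0x10000, phys_base=0x8000_0000, page_size=0x10000),
    TranslationEntry(io_base=0x20000, phys_base=0x9000_0000, page_size=0x1000),
]
```

Because the size travels with each cached entry, the same lookup handles both the 64K and the 4K mapping. An address just past the end of the 4K page, such as 0x21000, misses instead of being translated.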
This invention allows I/O translation table entries to more closely fit the size of a DMA operation. Overall I/O performance is improved because the hardware must fetch fewer translation entries from platform memory. By retaining the use of smaller I/O mapping page sizes, address protection of memory not involved with the DMA operation is preserved.
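One way firmware might fit entries to a DMA buffer is a greedy split that uses the largest page size that is aligned at the current position and does not run past the end of the buffer. The disclosure does not specify a selection policy, so this is an assumed one. The name plan_mappings and the set of supported sizes are hypothetical. The sketch assumes power-of-two page sizes and a length greater than zero.

```python
KIB = 1024

# Assumption: the hardware supports 4K and 64K entries in one table.
SUPPORTED_PAGE_SIZES = (4 * KIB, 64 * KIB)


def plan_mappings(io_addr, length, sizes=SUPPORTED_PAGE_SIZES):
    """Split a DMA buffer into (io_base, page_size) entries, largest fit first."""
    smallest = min(sizes)
    # Round the buffer outward to the smallest supported page size.
    pos = io_addr - (io_addr % smallest)
    end = -(-(io_addr + length) // smallest) * smallest
    plan = []
    while pos < end:
        # Largest size that is aligned here and stays inside the buffer.
        size = max(s for s in sizes if pos % s == 0 and pos + s <= end)
        plan.append((pos, size))
        pos += size
    return plan


# A 68 KiB buffer at I/O address 0: one 64K entry plus one 4K entry.
mixed_plan = plan_mappings(0, 68 * KIB)
```

With a fixed 4K page size the same 68 KiB buffer would need 17 entries. The mixed plan needs two and maps no memory beyond the buffer itself.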
Current hardware implementations of I/O translation tables have a Base Address Register (BAR) associated with a specific endpoint which is programmed by firmware with the following information:
1.) Platform memory address of the start of the I/O translation table
2.) Size of the I/O translation table
3.) I/O mapping page size of each entry in the I/O translation tabl...