Method for a fast decoder architecture for H.264 video

IP.com Disclosure Number: IPCOM000033844D
Publication Date: 2004-Dec-29
Document File: 5 page(s) / 51K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a fast decoder architecture for H.264 video. Benefits include improved functionality and improved performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Background

Conventionally, an H.264 video decoder is implemented with macroblock (MB) entropy decoding/MB reconstruction and MB deblock filtering, as shown in Figure 1.

When deblock filtering is delayed until the entire frame is reconstructed, data cache misses can occur because the entire frame must be loaded before filtering. This involves a large amount of data, such as 112.5 KB for a quarter video graphics array (QVGA) picture. The cache misses can be avoided by filtering each MB immediately after it is reconstructed, while its decoded coefficients (384 bytes per MB) are still in the cache (see Figure 2).
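
The two sizes quoted above can be checked with a short calculation. The sketch below assumes 8-bit samples in 4:2:0 format and a QVGA resolution of 320x240; neither is stated in the disclosure, but both reproduce its figures. The function names are illustrative only.

```python
# Working-set arithmetic, assuming 8-bit 4:2:0 video (an assumption, not
# stated in the disclosure) and a 16x16 macroblock.
MB_SIZE = 16


def frame_bytes(width, height):
    """Bytes in one frame: a full-size luma plane plus two chroma planes
    subsampled by 2 in each direction."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def macroblock_bytes():
    """Bytes in one macroblock: 16x16 luma plus two 8x8 chroma blocks."""
    luma = MB_SIZE * MB_SIZE
    chroma = 2 * (MB_SIZE // 2) ** 2
    return luma + chroma


qvga = frame_bytes(320, 240)
print(qvga / 1024)                 # 112.5 (KB per QVGA frame)
print(macroblock_bytes())          # 384 (bytes per MB)
print(qvga // macroblock_bytes())  # 300 (MBs per QVGA frame)
```

Under these assumptions, a whole frame is 300 times larger than the single MB that is still in the cache right after reconstruction.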

General description

The disclosed method is a decoding architecture for H.264 video. The method separates residual entropy decoding and MB reconstruction into two loops. Additionally, MB reconstruction is merged with deblock filtering into one loop.
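
The loop structure can be sketched as two schedules of work. This is an illustration only: the stage names are placeholders, the conventional schedule is one reading of the Background section, and whether the first loop covers a slice or a whole frame is not stated in this abbreviated text, so `num_mbs` simply stands for the unit being decoded.

```python
def conventional_schedule(num_mbs):
    """Entropy decoding and reconstruction per MB, then deblock filtering
    only after every MB has been reconstructed."""
    steps = []
    for mb in range(num_mbs):
        steps.append(("entropy_decode", mb))
        steps.append(("reconstruct", mb))
    for mb in range(num_mbs):
        steps.append(("deblock", mb))
    return steps


def disclosed_schedule(num_mbs):
    """Two loops: entropy decoding alone, then reconstruction merged with
    deblock filtering."""
    steps = []
    # Loop 1: residual entropy decoding only (the loop that uses the LUTs).
    for mb in range(num_mbs):
        steps.append(("entropy_decode", mb))
    # Loop 2: each MB is filtered immediately after it is reconstructed.
    for mb in range(num_mbs):
        steps.append(("reconstruct", mb))
        steps.append(("deblock", mb))
    return steps


def steps_until_deblock(steps, mb):
    """Number of schedule steps between reconstructing an MB and filtering it."""
    return steps.index(("deblock", mb)) - steps.index(("reconstruct", mb))


print(steps_until_deblock(conventional_schedule(4), 0))  # 7
print(steps_until_deblock(disclosed_schedule(4), 0))     # 1
```

In the second schedule, the gap between reconstructing and filtering an MB no longer grows with the number of MBs, and the entropy decoding tables are only used in a loop that does not touch frame data.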

The disclosed method reduces the cache competition between context-adaptive variable length coding (CAVLC) decoding look-up tables (LUTs) and reference frames. The method also improves cache performance by filtering each MB immediately after it is reconstructed. For H.264 features such as flexible macroblock ordering and arbitrary slice order, MBs are decoded in exact raster scan order even if they are not compressed and received in this order. This approach improves memory performance over random decoding. As a result, the method is more efficient on platforms for which memory performance is critical to whole-application performance. Measurements indicate a 7% to approximately 15% performance improvement to the H.264 video decoder on application processors for cellular and hand-held devices.
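
One way to obtain raster scan order when MBs arrive out of order is to store the entropy-decoded data in a table indexed by MB address and let the reconstruction loop walk that table in ascending order. The disclosure does not say how the reordering is implemented, so the sketch below is an assumption; `to_raster_order`, `mb_position`, and the residual labels are hypothetical.

```python
def to_raster_order(received, mbs_in_picture):
    """Place entropy-decoded MBs, given as (mb_address, residual) pairs in
    arrival order, into a table indexed by MB address."""
    table = [None] * mbs_in_picture
    for mb_addr, residual in received:
        table[mb_addr] = residual
    return table


def mb_position(mb_addr, width_in_mbs):
    """(row, column) of an MB in the picture for a raster-scan address."""
    return divmod(mb_addr, width_in_mbs)


# Arrival order as it might look with arbitrary slice order (made-up data).
received = [(2, "res2"), (3, "res3"), (0, "res0"), (1, "res1")]

# The reconstruction loop then visits addresses 0, 1, 2, ... in order.
for mb_addr, residual in enumerate(to_raster_order(received, 4)):
    print(mb_addr, mb_position(mb_addr, 2), residual)
```

Walking the table in address order means frame memory is touched left to right and top to bottom, regardless of the order in which the bitstream delivered the MBs.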

 


Advantages

Some implementations of the disclosed structure and method provide one or more of the following advantages:

• Improved functionality due to providing a fast decoder architecture for H.264 video

• Improved performance due to improving memory performance by decoding MBs in exact raster scan order

• Improved performance due to reducing the cache competition between CAVLC decoding LUTs and reference frames

Detailed description

The disclosed method addresses performance issues arising from data cache contention and intra prediction. Data cache contention occurs between the variable length coding (VLC) LUTs used for entropy decoding and the frame data used for MB reconstruction (reference and reconstructed frames). Intra prediction requires unfiltered coefficients for reference, so cycles are added to store the rightmost column and bottom row coefficients of each MB. This overhead can counteract the cache-miss benefit gained from prompt deblock filtering. As a result, the disclosed method separates MB entropy decoding and MB reconstruction which keeps deb...