Interlace Detection During Hardware Decoding of Compressed Video

IP.com Disclosure Number: IPCOM000174776D
Original Publication Date: 2008-Sep-23
Included in the Prior Art Database: 2008-Sep-23
Document File: 6 page(s) / 96K

Publishing Venue

Microsoft

Related People

Shyam Sadhwani: INVENTOR [+3]

Abstract

Compressed video data used as input to a decoder may contain coded interlace flags that mark a specific frame as interlaced or progressive. These flags, however, may be inaccurate and may, for example, mark a frame as interlaced when it is actually progressive, because the flags are typically set by humans and are therefore subject to user input errors. The decoder may analyze a compressed video frame to make its own determination as to whether the frame is interlaced, and thus whether a deinterlacing process should be performed. Three methods may be used to analyze a frame: examining macroblock encoding, evaluating motion vector types, and determining the frequency of discrete cosine transform (DCT) coefficients. When used in combination, these methods provide a reliable indication as to whether deinterlacing is required.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 29% of the total text.

In some instances, decoders contain components that can analyze incoming compressed video data to determine whether the video frames are interlaced or progressive, so that the decoder or another device can determine whether the frames require deinterlacing.  Deinterlacing is a process that converts interlaced video into a non-interlaced form for a better viewing experience, especially on progressive displays such as those of many personal computers.  Some decoders, however, cannot perform the analysis on decoded, uncompressed frames that is most commonly used to determine whether video frames are interlaced or progressive.  An example is hardware-accelerated decoding and playback on personal computers.  These decoders rely on metadata that, in many cases, has been entered by a user and is thus prone to input errors.  As such, there is a need for a method of analysis that lets these decoders determine whether incoming video frames are interlaced or progressive without relying on information entered by a user.

Generally, interlaced scanning is a display technique designed to reduce flicker and distortion in television transmissions.  An interlaced frame is divided into two fields, each containing every other horizontal line of the frame.  Each field represents video captured at a different time instant.  The fields may be referred to as the top field, which contains the odd-numbered lines (counting from one, and thus including the topmost scan line of the frame), and the bottom field, which contains the even-numbered lines.  More specifically, in interlaced scanning, the television or monitor refreshes alternate sets of scan lines in successive top-to-bottom sweeps, refreshing all even lines on one pass and all odd lines on the other.  Interlaced scanning may be compared to progressive scanning, a display or capture technique (e.g., used by video cameras to capture moving objects) in which the image is created, line by line, in a single top-to-bottom sweep.  This may result in higher quality than interlaced scanning in certain circumstances.  Progressive scanning, however, requires twice the signal bandwidth of interlaced scanning at the same resolution and refresh rate.
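
The relationship between a frame and its two fields can be illustrated with a minimal Python sketch.  This is an illustration only and not part of the disclosed method; the function names split_fields and weave are hypothetical, and a frame is assumed to be a plain list of scan lines ordered top to bottom.

```python
def split_fields(frame):
    """Split a frame (list of scan lines, top to bottom) into (top, bottom) fields."""
    top = frame[0::2]     # lines 1, 3, 5, ... counting from one; includes the topmost line
    bottom = frame[1::2]  # lines 2, 4, 6, ... counting from one
    return top, bottom


def weave(top, bottom):
    """Interleave a top field and a bottom field back into a single frame."""
    frame = []
    for i, line in enumerate(top):
        frame.append(line)
        if i < len(bottom):  # the bottom field is one line shorter when the height is odd
            frame.append(bottom[i])
    return frame


if __name__ == "__main__":
    frame = ["line1", "line2", "line3", "line4", "line5", "line6"]
    top, bottom = split_fields(frame)
    print(top)     # ['line1', 'line3', 'line5']
    print(bottom)  # ['line2', 'line4', 'line6']

    # Weaving fields captured at two different instants (t0 and t1) yields a
    # frame whose adjacent lines alternate between the two capture times.
    mixed = weave(["t0"] * 3, ["t1"] * 3)
    print(mixed)   # ['t0', 't1', 't0', 't1', 't0', 't1']
```

In the last example, every pair of adjacent lines in the woven frame comes from a different capture time, which is why moving objects in an interlaced frame can look jagged when the frame is shown on a progressive display without deinterlacing.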

Video compression is typically performed on a frame-by-frame basis.  For example, NTSC, a broadcasting standard, has a typical capture rate of 29.97 frames per second, with most of the frames being interlaced to produce a better-quality display on a television.  As described above, interlaced video frames contain fields captured at two different time instants.  FIG. 1 below illustrates the result of an encoder combining two separate fields, each captured at a different time inst...