Methods and apparatus for implementing dot products and performing data reformatting in a SIMD vector-media unit
Original Publication Date: 2003-Dec-18
Included in the Prior Art Database: 2003-Dec-18
According to the present invention, there is provided one or more of the following:
(1) an architecture that implements a dot-product function without limiting or otherwise impacting the main data flow, while sharing most of the data path of the SIMD unit, so that a final adder is the only additional hardware needed. By contrast, traditional reduction units are arranged as separate units and thus require additional chip area;
(2) an architecture for the scalar result file that allows the file to be accessed either as single scalar values or as an instruction-specified n-element SIMD word, combining data as needed to pack it in a specific format. This register file offers several benefits: it allows dynamic data reformatting (in particular, but not limited to, packing), and it decouples the critical path for updating the SIMD register file from the worst-case pipeline delay associated with dot-product execution;
(3) a scalar access file that can be used to implement efficient data reformatting, in addition to its previously described use as a target and staging area for dot-product or other computational operations.
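
The arrangement described above can be illustrated with a small behavioral model in C. This is only a sketch under stated assumptions, not the disclosed hardware: the lane count, the file depth, the 32-bit integer element type, and all names (simd_word, scalar_result_file, dot_to_srf, srf_read_scalar, srf_read_word) are hypothetical, and the n-element SIMD access is assumed to read consecutive entries starting at an instruction-specified index.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4        /* assumed SIMD width n (hypothetical) */
#define SRF_ENTRIES 8  /* assumed scalar result file depth (hypothetical) */

typedef struct { int32_t lane[LANES]; } simd_word;
typedef struct { int32_t entry[SRF_ENTRIES]; } scalar_result_file;

/* Element-wise multiply: models the lane multipliers that the main
 * SIMD data path already provides. */
static simd_word simd_mul(simd_word a, simd_word b) {
    simd_word p;
    for (int i = 0; i < LANES; i++) {
        p.lane[i] = a.lane[i] * b.lane[i];
    }
    return p;
}

/* Final adder: models the only additional hardware, which reduces the
 * lane products to a single scalar. */
static int32_t final_add(simd_word p) {
    int32_t sum = 0;
    for (int i = 0; i < LANES; i++) {
        sum += p.lane[i];
    }
    return sum;
}

/* Dot product whose scalar result is staged in the scalar result file
 * rather than written directly to the SIMD register file. */
static void dot_to_srf(scalar_result_file *srf, int dst,
                       simd_word a, simd_word b) {
    assert(dst >= 0 && dst < SRF_ENTRIES);
    srf->entry[dst] = final_add(simd_mul(a, b));
}

/* Access mode 1: read a single scalar value. */
static int32_t srf_read_scalar(const scalar_result_file *srf, int idx) {
    assert(idx >= 0 && idx < SRF_ENTRIES);
    return srf->entry[idx];
}

/* Access mode 2: read LANES entries as one SIMD word, which packs
 * separately produced scalars into a single vector. Consecutive
 * entries are an assumption of this sketch. */
static simd_word srf_read_word(const scalar_result_file *srf, int start) {
    simd_word w;
    assert(start >= 0 && start + LANES <= SRF_ENTRIES);
    for (int i = 0; i < LANES; i++) {
        w.lane[i] = srf->entry[start + i];
    }
    return w;
}
```

In this model, a matrix-vector product could issue one dot_to_srf call per matrix row and then collect the results with a single srf_read_word call, so the packing step does not require a separate pass through the SIMD register file.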