
Flexible Multiple Format MAC Circuit

IP.com Disclosure Number: IPCOM000245262D
Publication Date: 2016-Feb-23

Publishing Venue

The IP.com Prior Art Database

Abstract

With increasing data rates of wireless communication protocols (LTE, WLAN), more and more digital signal processing is being shifted from DSP processors to hardware accelerators. High data rates also produce heavy traffic on the SoC internal bus fabrics, creating a need for low-bit-width data formats wherever possible. Different signal processing algorithms and wireless protocols also require different data formats: {16I,16Q}, {8I,8Q}, {16I}, {8I}. Existing DSP processors have fixed-length (8-bit / 16-bit / 32-bit) data paths, leading to memory wastage, extra cycles spent on data conversion, and low utilization of AU hardware.

This is the abbreviated version, containing approximately 50% of the total text.

Problem Description

There is a need for a hardware accelerator that supports:

  Multiplication of mixed data formats (both complex and real)
  High- and low-precision output modes to provide output data in any data format
  A simple trigger-based interface for the start and end of multiply operations

Proposed Solution

We propose a scalable MAC accelerator architecture in which a multiplicand in any supported data format can be multiplied by a multiplier in any supported data format, with the result output in any supported data format.

  Support 16 multiplication modes: {{Ni,Nq}, {N/2i,N/2q}, {Ni}, {N/2i}} x {{Ni,Nq}, {N/2i,N/2q}, {Ni}, {N/2i}} [i = real part, q = imaginary part]
  Support 4 output modes (low/high precision, complex/real): {{Ni,Nq}, {N/2i,N/2q}, {Ni}, {N/2i}}
  Simple handshake mechanism: input valid and output valid signals
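The count of 16 multiplication modes follows from pairing each of the four multiplicand formats with each of the four multiplier formats. A minimal sketch of that enumeration for the illustrative case N = 16 is given here; the names FORMATS and multiplication_modes are hypothetical and do not come from the disclosure.

```python
from itertools import product

# Format labels for the illustrative case N = 16 (so N/2 = 8).
# These strings are labels only; they are not taken from any real API.
FORMATS = ["16I,16Q", "8I,8Q", "16I", "8I"]


def multiplication_modes(formats=FORMATS):
    """Return every (multiplicand format, multiplier format) pair."""
    return list(product(formats, repeat=2))


modes = multiplication_modes()
print(len(modes))    # number of multiplication modes (4 x 4)
print(len(FORMATS))  # number of output modes
```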

The circuit contains a memory acting as an input buffer of depth 'D' and width 'K', and a memory acting as an output buffer of depth 'DO' and width 'K', where K is a multiple of N.
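Because the buffer width K is a multiple of N, the number of samples held in one buffer word depends on the data format. The sketch below works this out under the assumption K = 32 and N = 16; the disclosure does not state a value for K, and the function names are hypothetical.

```python
from fractions import Fraction


def sample_bits(fmt, n=16):
    """Bits occupied by one sample in each of the four formats."""
    widths = {
        "Ni,Nq": 2 * n,      # complex, full precision
        "N/2i,N/2q": n,      # complex, half precision
        "Ni": n,             # real, full precision
        "N/2i": n // 2,      # real, half precision
    }
    return widths[fmt]


def samples_per_word(fmt, k=32, n=16):
    """Samples held by one K-bit buffer word (K = 32 is an assumption)."""
    if k % n != 0:
        raise ValueError("K must be a multiple of N")
    return Fraction(k, sample_bits(fmt, n))


for fmt in ("Ni,Nq", "N/2i,N/2q", "Ni", "N/2i"):
    print(fmt, samples_per_word(fmt))
```

With these assumed widths, one word holds 1, 2, 2 and 4 samples respectively. A narrower format therefore needs fewer buffer reads per sample, which may be what the fractional fetch rates in the tables reflect, although the disclosure does not say so explicitly.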

The circuit is explained using the block diagrams shown in Figure 1 to Figure 3. For illustration purposes, N = 16 is assumed.

Figure 1

Figure 2

Figure 3

Table 1 shows the possible combinations and the multiplications per cycle that can be performed when the A-format is {16I,16Q}.

A-format   B-format   Out-format   Output Store        Input Fetch
--------   --------   ----------   -----------------   --------------------
16I,16Q    16I,16Q    16I,16Q      1 sample / cycle
16I,16Q    8I,8Q      16I,16Q      1 sample / cycle    B fetch = 1/2 cycles
16I,16Q    16I        16I,16Q      1 sample / cycle    B fetch = 1/2 cycles
16I,16Q    8I         16I,16Q      1 sample / cycle    B fetch = 1/4 cycles
16I,16Q    16I,16Q    8I,8Q        1 sample / cycle
16I,16Q    8I,8Q      8I,8Q        2 samples / cycle   B fetch = 1/2 cycles
16I,16Q    16I        8I,8Q        2 samples / cycle   B fetch = 1/2 cycles
16I,16Q    8I         8I,8Q        2 samples / cycle   B fetch = 1/4 cycles
16I,16Q    16I        16I          NA
16I,16Q    8I         16I          NA
16I,16Q    16I        8I           NA
16I,16Q    8I         8I           NA

Table 1
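The complex-output rows of Table 1 can be illustrated with a small behavioural model of one mixed-format multiplication. This is only a sketch under stated assumptions: the disclosure does not describe how products are scaled or rounded, so the arithmetic right shift (the hypothetical parameter shift) and the saturation step are illustrative choices, not the circuit's documented behaviour. A real-format B operand is modelled by setting its imaginary part to 0.

```python
def saturate(value, bits):
    """Clamp a signed integer to the two's-complement range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def multiply(a, b, out_bits, shift):
    """Multiply A = (ai, aq) by B = (bi, bq) and narrow each output part.

    `shift` (an assumption, not from the disclosure) drops low-order
    product bits before each part is saturated to `out_bits` bits.
    """
    ai, aq = a
    bi, bq = b
    real = ai * bi - aq * bq   # real part of the complex product
    imag = ai * bq + aq * bi   # imaginary part of the complex product
    return (saturate(real >> shift, out_bits),
            saturate(imag >> shift, out_bits))


a = (1000, -2000)   # A in {16I,16Q}
b = (3, 0)          # B in {8I}: real operand, imaginary part set to 0

print(multiply(a, b, out_bits=16, shift=0))  # high precision: (3000, -6000)
print(multiply(a, b, out_bits=8, shift=8))   # low precision:  (11, -24)
```

In the low-precision case the right shift floors toward negative infinity, so -6000 becomes -24 rather than -23; real hardware might round differently.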




Table 2 below shows the possible combinations and the multiplications per cycle that can be performed when the A-format is {8I,8Q}.

A-format   B-format   Out-format   Output Store       Input Fetch
--------   --------   ----------   ----------------   -----------------------
8I,8Q      16I,16Q    16I,16Q      1 sample / cycle   A fetch = 1/2 cycles
8I,8Q      8I,8Q      16I,16Q      1 sample / cycle   A,B fetch = 1/2 cycles
8I,8Q      16I        16I,16Q      1 sample / cycle   A,B fetch = 1/2...