The input to the FFT consists of a control word of three bits and four data blocks of 1024 samples each. The least significant two bits of the control word, Sz[1:0], indicate the FFT size, that is, N = 4096/2^Sz. Bit 2 of the control word specifies whether an FFT or an IFFT is to be performed on the data. The following figure shows the definition of the data blocks for the various FFT sizes.
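As a minimal sketch of how the control word described above could be decoded, the following C++ fragment extracts Sz[1:0] and bit 2 and derives N = 4096/2^Sz. The type FftControl and the function decode_control are hypothetical names used only for illustration, and the polarity of bit 2 (1 = IFFT) is an assumption that the text does not state.

```cpp
#include <cstdint>

// Hypothetical helper type; not part of the reference design.
struct FftControl {
    unsigned size;   // FFT size N = 4096 / 2^Sz
    bool inverse;    // true when bit 2 of the control word is set
};

// Decode a 3-bit control word.
// Assumption: bit 2 == 1 selects the IFFT; the polarity is not given in the text.
inline FftControl decode_control(std::uint8_t ctrl) {
    const unsigned sz = ctrl & 0x3u;                 // Sz[1:0]
    const bool inverse = ((ctrl >> 2) & 0x1u) != 0;  // bit 2
    return FftControl{4096u >> sz, inverse};         // 4096 / 2^Sz
}
```

With this sketch, Sz values 0 through 3 map to FFT sizes 4096, 2048, 1024, and 512.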
The control word and Data Blocks 0 and 1 are multiplexed onto one AXI packet stream of 1 GSPS, and Data Blocks 2 and 3 are multiplexed onto the other packet stream. Packet IDs are allocated by the Xilinx tools during compilation and reported in a JSON file for easy post-processing. For details on AI Engine array packet switching, see the Versal ACAP AI Engine Programming Environment User Guide (UG1076). The following figure shows a simplified timing diagram of the input packet streams of the FFT design.
The output of the FFT is two streams of 1 GSPS samples, and the data format depends on the FFT size. For some applications a different output format might simplify programmable logic design; the C function of FFTz can be modified accordingly.
The linear phase rotation between the 1024-point and 4-point FFTs is implemented in the FFTb, FFTc, and FFTd kernels using the sincos() function built into the AI Engine scalar unit instead of a large ROM. This approach saves three AI Engines x 1024 x 4 bytes/sample = 12,288 bytes of memory, equivalent to 1.5 memory banks. More specifically, a vector of eight twiddle factors is computed in parallel as follows.
On the right-hand side of the equation, the first term is computed by the sincos() function in the AI Engine scalar unit, and the second term is a vector pre-computed before the loop, also using sincos().
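The idea of forming eight twiddle factors from one scalar term and one pre-computed vector can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not the kernel code: it assumes the twiddle for sample index n and branch k is exp(-j*2*pi*k*n/4096), uses single-precision floats rather than the fixed-point samples of the design, and replaces the AI Engine sincos() intrinsic with a hypothetical helper unit_phasor() built on the standard library. All function names are invented for this example.

```cpp
#include <array>
#include <cmath>
#include <complex>

using cfloat = std::complex<float>;
constexpr int kLanes = 8;  // eight twiddle factors per vector

// Hypothetical stand-in for the scalar sincos():
// returns exp(-j * 2*pi * idx / n_total). The sign convention is an assumption.
inline cfloat unit_phasor(int idx, int n_total) {
    const double two_pi = 6.283185307179586;
    const double angle = -two_pi * idx / n_total;
    return cfloat(static_cast<float>(std::cos(angle)),
                  static_cast<float>(std::sin(angle)));
}

// Second term: computed once before the loop, step[i] = W^(k*i) for i = 0..7.
inline std::array<cfloat, kLanes> make_step_vector(int k, int n_total) {
    std::array<cfloat, kLanes> step;
    for (int i = 0; i < kLanes; ++i) {
        step[i] = unit_phasor(k * i, n_total);
    }
    return step;
}

// Inside the loop: one scalar phasor W^(k*n) (first term) multiplied by the
// step vector yields the twiddles for samples n, n+1, ..., n+7.
inline std::array<cfloat, kLanes> twiddles_at(
        int n, int k, int n_total, const std::array<cfloat, kLanes>& step) {
    const cfloat base = unit_phasor(k * n, n_total);
    std::array<cfloat, kLanes> w;
    for (int i = 0; i < kLanes; ++i) {
        w[i] = base * step[i];
    }
    return w;
}
```

In this sketch only one sine/cosine evaluation is needed per group of eight samples inside the loop, which is why no 1024-entry twiddle table has to be stored.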
Location pinning for AI Engines and buffers is used to pack two FFT modules into a 5x2 AI Engine array with minimal memory conflicts. As shown in Figure 4, the design uses all memory banks and AI Engines inside the 5x2 AI Engine array to maximize throughput. In Figure 4, the shaded AI Engines form one set of five AI Engines and the unshaded AI Engines form the other set.