In the case of slow sample rate and small number of coefficients, the single MACC FIR filter is well suited. In the case of high sample rate and/or large number of coefficients, consider using a semi-parallel or parallel FIR filter. As for coefficients, if the number is large and/or the width is high, dual port block RAM is the preferred choice for the memory buffer. A high level implementation example of this design is provided in the following figure. If the number of coefficients and/or their width is small 1 , distributed memory (LUTRAM) can be used as coefficient buffer instead of block RAM. If the data width is small 1 , SRL16 can be used as data buffer instead of block RAM.
- Based on the size, synthesis tools in Vivado Design Suite automatically maps to block RAM or SRL16/LUTRAM. To choose between SRL16/LUTRAM and block RAM, users must compare the timing and resource utilization in both cases to find the optimal solution.
.
For block RAM implementation of the data buffer, the cyclic RAM buffer is used. For small-sized FIR filters (typically those under 32 taps), block RAM can be underutilized as a means to store filter input samples and coefficients. Block RAMs are not as abundant as the smaller distributed RAMs found in a nearby DSP58, making them an excellent option for smaller FIR filters. The following figure illustrates the actual single-multiplier MACC FIR filter implementation using distributed RAM for the coefficient bank and an SRL16 for the data buffer.