This application note proposes a method for implementing wideband beamforming on the AI Engine. The method has the following features:
- A generic framework for matrix multiplication that covers a wide range of matrix sizes and throughput requirements.
- A scalable architecture that requires only a small number of kernels to be developed.
- An example submatrix multiplication kernel design that fits into one AI Engine tile and achieves 85% overall efficiency with low latency.
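
To make the idea behind the features above concrete, the following is a minimal sketch in plain, host-side C++ of how a large complex matrix multiplication can be decomposed into fixed-size submatrix multiply-accumulate steps. It is not the AI Engine kernel from this note and does not use any AI Engine API. The `Matrix` type, the function names `submatrix_mac` and `blocked_matmul`, the tile sizes, and the mapping of beamforming weights to `a` and antenna samples to `b` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Row-major dense matrix. Hypothetical helper type for this sketch only.
struct Matrix {
    std::size_t rows, cols;
    std::vector<cfloat> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
    cfloat& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
    const cfloat& at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// One submatrix step: C[i0.., j0..] += A[i0.., k0..] * B[k0.., j0..].
// In a tiled design, this is the unit of work one kernel would perform (assumption).
void submatrix_mac(const Matrix& a, const Matrix& b, Matrix& c,
                   std::size_t i0, std::size_t j0, std::size_t k0,
                   std::size_t tm, std::size_t tn, std::size_t tk) {
    // Clamp the tile to the matrix edges so sizes need not divide evenly.
    const std::size_t i1 = std::min(i0 + tm, a.rows);
    const std::size_t j1 = std::min(j0 + tn, b.cols);
    const std::size_t k1 = std::min(k0 + tk, a.cols);
    for (std::size_t i = i0; i < i1; ++i)
        for (std::size_t j = j0; j < j1; ++j) {
            cfloat acc = c.at(i, j);
            for (std::size_t k = k0; k < k1; ++k)
                acc += a.at(i, k) * b.at(k, j);
            c.at(i, j) = acc;
        }
}

// Full product built only from submatrix steps. Here `a` could hold the
// beamforming weights and `b` the antenna samples (illustrative mapping).
Matrix blocked_matmul(const Matrix& a, const Matrix& b,
                      std::size_t tm, std::size_t tn, std::size_t tk) {
    assert(a.cols == b.rows);
    assert(tm > 0 && tn > 0 && tk > 0);
    Matrix c(a.rows, b.cols);  // elements are value-initialized to zero
    for (std::size_t i0 = 0; i0 < a.rows; i0 += tm)
        for (std::size_t j0 = 0; j0 < b.cols; j0 += tn)
            for (std::size_t k0 = 0; k0 < a.cols; k0 += tk)
                submatrix_mac(a, b, c, i0, j0, k0, tm, tn, tk);
    return c;
}
```

Because every output tile is produced by the same small routine, changing the matrix dimensions only changes how many times the routine is invoked, which is one way to read the claim that few distinct kernels are needed.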