This section presents an automatic speech recognition (ASR) solution based on the RNN-T (Recurrent Neural Network Transducer) model, running on the Xilinx® Versal® VCK5000 card. Versal is the first adaptive compute acceleration platform (ACAP). It is a fully software-programmable, heterogeneous compute platform that combines Scalar Engines, Adaptable Engines, and Intelligent Engines to achieve dramatic performance improvements over FPGA and CPU implementations across a range of applications. For more information, see the Xilinx Versal website.

RNN-T is a sequence-to-sequence model that continuously processes input samples and streams output symbols. The speech recognition model used here is a modified RNN-T model that is part of the MLPerf Inference benchmark suite. For more information, see the MLCommons inference repository.
The hardware kernel is a 40-AIE-core design. INT8 matrix-matrix multiplications run on the AIE cores, and all other functions are implemented in programmable logic (PL) with INT16 precision. The following table shows the total resource utilization for the kernel and platform. UltraRAM resources are mainly used as weight buffers and are shared if multiple kernels are instantiated.

| | CLB LUTs | Registers | Block RAM | UltraRAM | DSP Slices | AIE Cores |
|---|---|---|---|---|---|---|
| Available | 899712 | 1799424 | 967 | 463 | 1968 | 400 |
| Utilized | 169163 (18.8%) | 241657 (13.43%) | 197 (20.37%) | 332 (71.71%) | 82 (4.17%) | 40 (10%) |
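
The percentages in the table are the utilized count divided by the available count for each resource. As a minimal sketch, the following Python snippet recomputes them from the figures above. The names `AVAILABLE`, `UTILIZED`, `utilization_percent`, and `utilization_table` are illustrative only and are not part of any Xilinx tool or API.

```python
# Resource counts copied from the utilization table above.
AVAILABLE = {
    "CLB LUTs": 899712,
    "Registers": 1799424,
    "Block RAM": 967,
    "UltraRAM": 463,
    "DSP Slices": 1968,
    "AIE Cores": 400,
}

UTILIZED = {
    "CLB LUTs": 169163,
    "Registers": 241657,
    "Block RAM": 197,
    "UltraRAM": 332,
    "DSP Slices": 82,
    "AIE Cores": 40,
}


def utilization_percent(used, available):
    """Return used/available as a percentage, rounded to two decimals."""
    return round(100.0 * used / available, 2)


def utilization_table():
    """Map each resource name to its utilization percentage."""
    return {
        name: utilization_percent(UTILIZED[name], AVAILABLE[name])
        for name in AVAILABLE
    }


if __name__ == "__main__":
    for name, pct in utilization_table().items():
        print(f"{name:<10} {pct:6.2f}%")
```

UltraRAM is the most heavily used resource at roughly 72%, which is consistent with its role as the weight buffer.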