Run RNN-T Demo on Versal - 2.0 English

Vitis AI RNN User Guide (UG1563)

Document ID
UG1563
Release Date
2022-01-20
Version
2.0 English

This section describes an RNN-T-based automatic speech recognition (ASR) solution on the Xilinx® Versal® VCK5000 device. Versal is the first adaptive compute acceleration platform (ACAP). It is a fully software-programmable heterogeneous compute platform that combines Scalar Engines, Adaptable Engines, and Intelligent Engines to achieve dramatic performance improvements over FPGA and CPU implementations across a range of applications. For more information, see the Xilinx Versal website. RNN-T is a sequence-to-sequence model that continuously processes input samples and streams output symbols. The speech recognition model used here is a modified RNN-T model from the MLPerf Inference benchmark suite. For more information, see the mlcommons inference repository.
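To illustrate how an RNN-T model streams output symbols as it consumes input, the following sketch shows the standard greedy decoding loop: for each encoder frame, the joint network is queried repeatedly and non-blank symbols are emitted until it predicts blank, which advances decoding to the next frame. The `toy_predict` and `toy_join` functions are simplified stand-ins invented for this example, not the networks used in the demo.

```python
import numpy as np

BLANK = 0  # index of the blank symbol

def rnnt_greedy_decode(enc_frames, predict, join, max_symbols_per_frame=10):
    """Greedy RNN-T decoding.

    For each encoder frame, repeatedly query the joint network and emit
    symbols until it outputs blank, then advance to the next frame.
    The prediction-network state updates only on non-blank emissions.
    """
    hyp = []                          # emitted (non-blank) symbol ids
    g = predict(None)                 # predictor output for empty history
    for f in enc_frames:              # stream over acoustic frames
        for _ in range(max_symbols_per_frame):
            logits = join(f, g)       # combine encoder and predictor branches
            k = int(np.argmax(logits))
            if k == BLANK:
                break                 # blank: consume the next frame
            hyp.append(k)             # non-blank: emit and update predictor
            g = predict(k)
    return hyp

# Toy stand-ins for the prediction and joint networks (illustration only):
V = 3  # toy vocabulary: 0 = blank, 1..2 = labels

def toy_predict(y):
    # After any emission, strongly bias toward blank so each frame
    # yields at most one symbol in this toy setup.
    v = np.zeros(V)
    if y is not None:
        v[BLANK] = 10.0
    return v

def toy_join(f, g):
    return f + g  # a real joint network is a small feed-forward network

print(rnnt_greedy_decode([np.array([0.0, 3.0, 1.0])], toy_predict, toy_join))  # [1]
```

In the real model the encoder, prediction network, and joint network are LSTM- and linear-based modules; the loop structure above is what makes RNN-T naturally streaming.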

The hardware kernel is a 40-AIE-core design. INT8 matrix-matrix multiplications are performed on the AIE cores, and the remaining functions are implemented in programmable logic (PL) at INT16 precision. The following table shows the total resource utilization for the kernel and platform. UltraRAM resources are used mainly as weight buffers and are shared when multiple kernels are instantiated.
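As a rough illustration of the INT8 arithmetic mapped onto the AIE cores (this is a numerical sketch, not the actual kernel code), the following example shows symmetric INT8 quantization of two matrices, an INT8 × INT8 multiply with INT32 accumulation, and dequantization back to floating point. The function names and scale choices are assumptions for this example.

```python
import numpy as np

def quantize_int8(x, scale):
    # Symmetric per-tensor quantization to INT8.
    q = np.round(x / scale)
    return np.clip(q, -128, 127).astype(np.int8)

def int8_matmul(a_q, b_q, scale_a, scale_b):
    # INT8 x INT8 multiply with INT32 accumulation (as on the AIE cores),
    # then dequantize the accumulator back to float.
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * (scale_a * scale_b)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)
b = rng.standard_normal((8, 4)).astype(np.float32)
sa, sb = np.abs(a).max() / 127, np.abs(b).max() / 127
approx = int8_matmul(quantize_int8(a, sa), quantize_int8(b, sb), sa, sb)
exact = a @ b
err = np.max(np.abs(approx - exact))  # small quantization error
```

The dequantized result closely tracks the floating-point product, which is why the matrix multiplications can run at INT8 while the PL-side functions keep INT16 precision.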

Table 1. Resources

            CLB LUTs          Registers          Block RAM      UltraRAM       DSP Slices    AIE Cores
Available   899,712           1,799,424          967            463            1,968         400
Utilized    169,163 (18.8%)   241,657 (13.43%)   197 (20.37%)   332 (71.71%)   82 (4.17%)    40 (10%)
Note: For the RNN-T model, the quantizer and compiler are not yet available. The mixed-precision quantization and instruction generation were performed manually and the results are provided for this demo. For more information, see DPU-for-RNN.