PL Master Kernels - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID
XD100
Release Date
2024-03-05
Version
2023.2 English

The PL master kernels are the dlbf_data, dlbf_coeff, ulbf_data, and ulbf_coeff kernels. The dlbf_data PL kernel stores the reference input data matrices for the downlink subgraph in the AI Engine graph. The dlbf_coeff PL kernel stores the reference input coefficients for the downlink subgraph. The ulbf_data PL kernel stores the input data for the uplink subgraph. The ulbf_coeff PL kernel stores the input coefficient data for the uplink subgraph.

Open the Vivado projects for these PL kernels and review their source code. They are all composed of the same modules: an AXI BRAM Controller IP, a control status register (CSR) module, a clock domain crossing (CDC) module, and multiple data master modules. The data master modules are initialized with the reference input data and input coefficients from the *_hex.mem files in the data/ folder.

PL Master Kernel Vivado Screenshot

The *_hex.mem files were generated by a Python script that converted the decimal data in the *.txt files to hexadecimal data. An example conversion is shown below:

#Decimal data in dlbf_cin00.txt
-1893 3687 -6157 -1324

#Hexadecimal conversion in dlbf_cin00_hex.mem
fad4e7f30e67f89b

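Each decimal value becomes one 16-bit two's-complement hexadecimal word, and the word order is reversed: the leftmost decimal value (-1893) is converted to the rightmost hexadecimal word (f89b).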
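The conversion can be reproduced with a few lines of Python. This is a minimal sketch, not the script that generated the tutorial's files: the function name pack_line is hypothetical, and it assumes each value is a signed 16-bit integer and that the first value on a line lands in the least significant 16 bits of the 64-bit word, which is what the example above shows.

```python
def pack_line(decimals, width_bits=16):
    """Pack signed integers into one hex string, first value in the lowest bits."""
    mask = (1 << width_bits) - 1
    digits = width_bits // 4
    # Masking a negative Python int yields its two's-complement bit pattern.
    words = [format(value & mask, "0{}x".format(digits)) for value in decimals]
    # Reverse so the first decimal value becomes the rightmost hex word.
    return "".join(reversed(words))


line = "-1893 3687 -6157 -1324"  # sample line from dlbf_cin00.txt
hex_word = pack_line([int(token) for token in line.split()])
print(hex_word)  # fad4e7f30e67f89b
```

Running this on the sample line reproduces the entry shown for dlbf_cin00_hex.mem.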

Below is a block diagram of how data in the PL Master kernels is requested by CIPS and sent out to the AI Engine.

PL Master Kernel Block Diagram

Each PL master kernel connects to one of the 16 AXI4-Lite PL interfaces on the custom platform built in Module 01 (Creating a Custom Platform). Through this connection, the CIPS block sends AXI control signals to the data master modules and receives AXI status signals from them.

  • AXI BRAM Controller: The AXI BRAM controller writes the control signals to the CSR module and reads the status signals from the CSR module at 100 MHz.

  • Control Status Register (CSR) Module: The CSR module is the register interface through which the AXI BRAM controller reaches the data master modules. Below is the control and status register map for one data master module.

Control and Status Register Address Map

Register Space Offset | Bits and Name | R/W | Description
0x0 | [31:0] ID | R | 32-bit ID register.
0x4 | [0] RESET | W | 1: assert, 0: deassert. Also assigned to the m_axis_rst_bram input in the CSR module.
0x4 | [4] GO | W | 1: start PL traffic, 0: stop PL traffic. Also assigned to the go_bram input in the CSR module.
0x8 | [11:0] BLOCK_SIZE | W | Sets the block size of the stream frame. The block size is the number of 64-bit TDATA packets to send to the AI Engine. TLAST is asserted once per block. Also assigned to the block_size_bram input in the CSR module.
0xC | [11:0] NITER | W | Sets the number of iterations of the data to go through. The number of iterations is the number of data chunks to send to the AI Engine. If this is set to 0, data is transmitted to the AI Engine forever. Also assigned to the niter_bram input in the CSR module.
0x10 | [15:0] ROLLOVER_ADDR | W | When the BRAM address reaches this rollover address, it resets to address 0. In this design, the rollover address is set to the address of four chunks of data (that is, 4*). Also assigned to the rollover_addr_bram input in the CSR module.
0x20 | [0] MASTER_DONE | R | When this status register becomes 1'b1, the data master is done sending data to the AI Engine. Also assigned to the m0_done_bram input in the CSR module.
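
The register map can be captured as constants and turned into an ordered list of register writes. The sketch below is illustrative only: the offsets, bit positions, and field widths come from the map above, but the helper names (field, config_writes, is_done) and the write order (reset, configure, then GO) are assumptions, and the offsets are relative to a data master base address that is not shown here.

```python
# Offsets and bit positions taken from the register map above.
REG_ID = 0x0
REG_CTRL = 0x4            # holds both RESET (bit 0) and GO (bit 4)
REG_BLOCK_SIZE = 0x8
REG_NITER = 0xC
REG_ROLLOVER_ADDR = 0x10
REG_MASTER_DONE = 0x20

RESET_BIT = 0
GO_BIT = 4


def field(value, width):
    """Check that value fits in an unsigned field of the given bit width."""
    if not 0 <= value < (1 << width):
        raise ValueError("value {} does not fit in {} bits".format(value, width))
    return value


def config_writes(block_size, niter, rollover_addr):
    """Return (offset, value) pairs for one data master.

    The ordering here is an assumed sequence, not one documented above.
    """
    return [
        (REG_CTRL, 1 << RESET_BIT),                     # assert reset
        (REG_CTRL, 0),                                  # deassert reset
        (REG_BLOCK_SIZE, field(block_size, 12)),        # [11:0]
        (REG_NITER, field(niter, 12)),                  # [11:0], 0 = run forever
        (REG_ROLLOVER_ADDR, field(rollover_addr, 16)),  # [15:0]
        (REG_CTRL, 1 << GO_BIT),                        # start PL traffic
    ]


def is_done(status_word):
    """Decode bit 0 of a value read from the MASTER_DONE register."""
    return bool(status_word & 1)


for offset, value in config_writes(block_size=8, niter=2, rollover_addr=32):
    print("write 0x{:02X} <- 0x{:X}".format(offset, value))
```

The argument values in the usage loop are placeholders chosen for readability, not the values used by the design.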

The CSR Module RTL definitions are located here:

dlbf_data/hdl/ulbf_data_csr_cntrl.v
dlbf_coeffs/hdl/dlbf_coeffs_csr_cntrl.v
ulbf_data/hdl/ulbf_data_csr_cntrl.v
ulbf_coeffs/hdl/ulbf_coeffs_csr_cntrl.v

  • Clock Domain Crossing (CDC) Module: The CDC module synchronizes the control and status signals between the CSR module and the data master modules. The data master modules operate at 400 MHz, so the CDC module moves the 100 MHz control signals from CIPS into the 400 MHz domain and moves the 400 MHz status signals from the data master modules back into the 100 MHz domain for CIPS.

  • Data Master Modules: These modules contain BRAM instances that store the input data that is sent to the AI Engine. They are initialized by data/*_hex.mem files with input data. There are four data master modules in the dlbf_data and dlbf_coeffs PL kernels. There are eight data master modules in the ulbf_data and ulbf_coeffs PL kernels.
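
How BLOCK_SIZE, NITER, and ROLLOVER_ADDR interact can be illustrated with a small behavioral model of one data master. This is a sketch under stated assumptions, not a description of the RTL: the function name data_master_beats is hypothetical, the BRAM address is assumed to advance by one per 64-bit beat, and the address is assumed to wrap to 0 as soon as it equals the rollover address.

```python
from itertools import islice


def data_master_beats(block_size, niter, rollover_addr):
    """Yield (bram_address, tlast) for each beat sent to the AI Engine.

    niter == 0 models the "transmit forever" setting.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    addr = 0
    block = 0
    while niter == 0 or block < niter:
        for beat in range(block_size):
            # TLAST marks the final beat of each block.
            yield addr, beat == block_size - 1
            addr += 1
            if addr == rollover_addr:  # assumed wrap behavior
                addr = 0
        block += 1


# Three blocks of two beats each, wrapping after four addresses.
finite = list(data_master_beats(block_size=2, niter=3, rollover_addr=4))
print(finite)

# NITER = 0 never terminates, so take only the first five beats.
endless = list(islice(data_master_beats(2, 0, 4), 5))
print(endless)
```

In the finite case the address sequence is 0, 1, 2, 3, 0, 1, with TLAST set on every second beat.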