The PL master kernels are the dlbf_data, dlbf_coeff, ulbf_data, and ulbf_coeff kernels. The dlbf_data PL kernel stores the reference input data matrices for the downlink subgraph in the AI Engine graph. The dlbf_coeff PL kernel stores the reference input coefficients for the downlink subgraph. The ulbf_data PL kernel stores the input data for the uplink subgraph. The ulbf_coeff PL kernel stores the input coefficient data for the uplink subgraph.
Open the Vivado projects for these PL kernels and review their source code. They are all composed of the same modules: an AXI BRAM Controller IP, a control status register (CSR) module, a clock domain crossing (CDC) module, and multiple data master modules. The data master modules are initialized with the reference input data and input coefficients from
*_hex.mem files in the
*_hex.mem files were generated by a Python script that converted the decimal data in the
*.txt files to hexadecimal data. An example conversion is shown below:
#Decimal data in dlbf_cin00.txt
-1893 3687 -6157 -1324
#Hexadecimal conversion in dlbf_cin00_hex.mem
fad4e7f30e67f89b
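Here the leftmost decimal value (-1893) is converted to the rightmost hexadecimal word (f89b). Each decimal value becomes one 16-bit two's complement word, and the words are packed in reverse order, so the last decimal value (-1324) appears first as fad4.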
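The conversion above can be reproduced with a few lines of Python. This is a minimal sketch, not the script that generated the shipped files: the function name `decimals_to_hex_word` is hypothetical, and it assumes each value is a signed 16-bit sample and that one line of the text file maps to one hexadecimal word.

```python
def decimals_to_hex_word(values, bits=16):
    """Pack signed integers into one hex string.

    Assumption: each value is a signed two's complement sample of `bits`
    bits, and the first value lands in the least-significant (rightmost)
    position of the resulting word.
    """
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"{v} does not fit in {bits} signed bits")
    mask = (1 << bits) - 1   # masking a negative int yields its two's complement
    digits = bits // 4       # hex digits per value
    # Reverse so the first decimal value ends up rightmost in the hex string.
    return "".join(f"{v & mask:0{digits}x}" for v in reversed(values))


line = "-1893 3687 -6157 -1324"  # the line from dlbf_cin00.txt shown above
word = decimals_to_hex_word([int(tok) for tok in line.split()])
print(word)  # fad4e7f30e67f89b
```

Four 16-bit values give a 64-bit word, which matches the 64-bit TDATA width that the data masters send to the AI Engine.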
Below is a block diagram of how data in the PL master kernels is requested by CIPS and sent out to the AI Engine.
Each PL master kernel connects to one of the 16 AXI4-Lite PL interfaces on the custom platform built in Module 01 (Creating a Custom Platform). Through this connection, the CIPS block can send AXI control signals to the data master modules and receive AXI status signals from them.
AXI BRAM Controller: The AXI BRAM controller writes the control signals to the CSR module and reads the status signals from the CSR module at 100 MHz.
Control Status Register (CSR) Module: The CSR module is a register interface through which the AXI BRAM controller controls and monitors the data masters. Below is the control and status register map for one data master module.
Control and Status Register Address Map
|Register Space Offset||Bits and Name||R/W?||Description|
|0x0||[31:0] ID||R||32-bit ID register.|
|0x4|| RESET||W||1: assert, 0: deassert. Also assigned to the m_axis_rst_bram input in the CSR module.|
|0x4|| GO||W||1: start PL traffic, 0: stop PL traffic. Also assigned to the go_bram input in the CSR module.|
|0x8||[11:0] BLOCK_SIZE||W||Sets the block size of stream frame. Block size is the number of 64-bit TDATA packets to send to the AI Engine. TLAST is asserted for every
|0xC||[11:0] NITER||W||Sets the number of iterations of the data to go through. The number of iterations is the number of
|0x10||[15:0] ROLLOVER_ADDR||W||When BRAM addresses reach this rollover address, it will reset to address 0. In this design, the rollover address is set to the address of four
|0x20|| MASTER_DONE||R||When this status register becomes 1'b1, the data master is done sending data to the AI Engine. Also assigned to the
The CSR Module RTL definitions are located here:
dlbf_data/hdl/dlbf_data_csr_cntrl.v
dlbf_coeffs/hdl/dlbf_coeffs_csr_cntrl.v
ulbf_data/hdl/ulbf_data_csr_cntrl.v
ulbf_coeffs/hdl/ulbf_coeffs_csr_cntrl.v
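To make the register map concrete, the following Python sketch models how software might program one data master. It is illustrative only. The offsets and field widths come from the table above; everything else is an assumption: the `FakeCsr` class is a hypothetical in-memory stand-in for the real AXI4-Lite access path, the bit positions of RESET and GO inside register 0x4 are placeholders because the table does not list them, the reset/configure/go/poll ordering is one plausible sequence, and the numeric arguments are arbitrary.

```python
# Offsets taken from the control and status register map above.
CTRL_OFFSET = 0x4            # RESET and GO share this register
BLOCK_SIZE_OFFSET = 0x8      # [11:0]
NITER_OFFSET = 0xC           # [11:0]
ROLLOVER_ADDR_OFFSET = 0x10  # [15:0]
MASTER_DONE_OFFSET = 0x20

# ASSUMPTION: placeholder bit positions; check the CSR RTL for the real ones.
RESET_BIT = 0
GO_BIT = 4


class FakeCsr:
    """Hypothetical in-memory model of one data master's CSR block."""

    def __init__(self):
        self.regs = {}
        self.log = []  # every (offset, value) write, in order

    def write(self, offset, value):
        self.log.append((offset, value))
        self.regs[offset] = value
        # Toy behaviour: report done as soon as GO is set.
        if offset == CTRL_OFFSET and value & (1 << GO_BIT):
            self.regs[MASTER_DONE_OFFSET] = 1

    def read(self, offset):
        return self.regs.get(offset, 0)


def run_data_master(csr, block_size, niter, rollover_addr, max_polls=1000):
    """Reset, configure, start, then poll MASTER_DONE. Returns True when done."""
    if not 0 <= block_size < (1 << 12) or not 0 <= niter < (1 << 12):
        raise ValueError("BLOCK_SIZE and NITER are 12-bit fields")
    if not 0 <= rollover_addr < (1 << 16):
        raise ValueError("ROLLOVER_ADDR is a 16-bit field")
    csr.write(CTRL_OFFSET, 1 << RESET_BIT)  # assert reset
    csr.write(CTRL_OFFSET, 0)               # deassert reset
    csr.write(BLOCK_SIZE_OFFSET, block_size)
    csr.write(NITER_OFFSET, niter)
    csr.write(ROLLOVER_ADDR_OFFSET, rollover_addr)
    csr.write(CTRL_OFFSET, 1 << GO_BIT)     # start PL traffic
    for _ in range(max_polls):
        if csr.read(MASTER_DONE_OFFSET) & 1:
            return True
    return False


csr = FakeCsr()
done = run_data_master(csr, block_size=384, niter=4, rollover_addr=1536)
print(done, [hex(offset) for offset, _ in csr.log])
```

On hardware, the `write` and `read` methods would be replaced by accesses through the AXI4-Lite interface that CIPS uses to reach the kernel; the sequence of register accesses is the part this sketch is meant to show.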
Clock Domain Crossing (CDC) Module: The data master modules operate at 400 MHz, while the control and status signals from CIPS run at 100 MHz. The CDC module synchronizes the two domains: it converts the 100 MHz control signals from CIPS into 400 MHz signals for the data master modules, and converts the 400 MHz status signals from the data master modules into 100 MHz signals for CIPS.
Data Master Modules: These modules contain BRAM instances that store the input data that is sent to the AI Engine. They are initialized by
data/*_hex.mem files with input data. There are four data master modules in the
dlbf_coeffs PL kernels. There are eight data master modules in the