Bare-Metal Source Code - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID
XD100
Release Date
2024-03-05
Version
2023.2 English

This section dives into the baremetal_src code and describes the purpose of each file. Open the baremetal_src/*.cpp files to get a sense of what the source code does. A summary of the files is given below.

  • dlbf_din.cpp

  • ulbf_din.cpp

  • dlbf_cin.cpp

  • ulbf_cin.cpp

These files contain the data input for the dlbf_data, ulbf_data, dlbf_coeffs, and ulbf_coeffs PL kernels. These PL kernels were already initialized with the input data from the *_hex.mem data files. The PS host application checks that the PL kernels were initialized correctly by comparing the BRAM content in the PL kernels with the data in these files.
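The initialization check described above can be sketched as a word-by-word comparison. This is a minimal illustration, not the tutorial's own routine: the function name countMismatches, the 32-bit word width, and the pointer-based access are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the tutorial sources): counts the words that
// differ between a BRAM region and the expected input data, for example an
// array such as the one defined in dlbf_din.cpp.
// 'bram' is volatile because on hardware it would point at PL memory.
// Assumption: the data is compared as 32-bit words.
std::size_t countMismatches(const volatile std::uint32_t* bram,
                            const std::uint32_t* expected,
                            std::size_t numWords)
{
    assert(bram != nullptr && expected != nullptr);
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < numWords; ++i) {
        if (bram[i] != expected[i]) {
            ++mismatches;  // keep counting so the caller sees the total
        }
    }
    return mismatches;
}
```

A return value of 0 would indicate that the PL kernel memory matches the reference data.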

The dlbf_gold0.cpp and ulbf_gold0.cpp files contain the golden output data expected from the AI Engine. The AI Engine generates the output data and stores it in the dlbf_slave and ulbf_slave PL kernels. The PS host applications read the URAM content in these PL kernels and compare it to the expected golden output in these files.

The dlbf_utils.cpp, ulbf_utils.cpp, dlbf_utils.h, and ulbf_utils.h files contain utility functions that help the main PS host application check the input data, verify the output data, reset the PL IPs, and more.

Open the dlbf_utils.cpp file. There are three global arrays: dlbfDataAddr, dlbfCoeffsAddr, and dlbfSlaveAddr. Their elements are macros whose names start with XPAR_. These macros are defined in the build/vck190_baremetal/psv_cortexa72_0/standalone_domain/bsp/psv_cortexa72_0/include/xparameters.h file. The xparameters are the base addresses the PS host application uses to access the control and status registers of the PL kernels.

The utils.h and utils.cpp files contain functions that are common to the dlbf and ulbf operations. For now, they contain one function, extractIQ, which returns the imaginary and real parts of a given integer.
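As an illustration of what a function like extractIQ might do, the sketch below splits a 32-bit word into two signed 16-bit halves. The bit layout (real part in bits [15:0], imaginary part in bits [31:16]), the struct, and the name extractIQSketch are assumptions made for the example; check utils.cpp for the actual signature and packing.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch.
struct IQ {
    std::int16_t real;  // in-phase component
    std::int16_t imag;  // quadrature component
};

// Assumed packing (not confirmed by the tutorial text):
//   bits [15:0]  = real part, bits [31:16] = imaginary part,
// each a 16-bit two's-complement value.
IQ extractIQSketch(std::uint32_t word)
{
    IQ out;
    out.real = static_cast<std::int16_t>(static_cast<std::uint16_t>(word & 0xFFFFu));
    out.imag = static_cast<std::int16_t>(static_cast<std::uint16_t>(word >> 16));
    return out;
}
```

Under this assumed packing, the word 0x0002FFFF would decode to a real part of -1 and an imaginary part of 2.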

The params.h file contains all the global variables that are used in the PS host application. Note that each PL kernel has its own set of global defines in this file.

In Module 03, you learned that the PL kernels contain a control and status register (CSR) module. The PL kernels have been designed so that the CIPS block can access the registers in the CSR block to control the data masters or the RAM slave in the PL kernels. The CIPS block contains the A72 processor on which the PS host application runs. The global defines in params.h are the addresses the host application uses to directly access the CSR registers in the PL kernels. The following table lists the dlbf_data CSR registers.

Control and Status Register Address Map

Register Space Offset | Bits and Name | R/W | Global Defines (params.h) | Description
0x0 | [31:0] ID | R | DLBF_DATA_REG_OFFSET_ID | 32-bit ID register.
0x4 | [0] RESET | W | DLBF_DATA_REG_OFFSET_RESET | 1: assert, 0: deassert. Also assigned to the m_axis_rst_bram input in the CSR module.
0x4 | [4] GO | W | DLBF_DATA_REG_OFFSET_START | 1: start PL traffic, 0: stop PL traffic. Also assigned to the go_bram input in the CSR module.
0x8 | [11:0] BLOCK_SIZE | W | DLBF_DATA_REG_OFFSET_BLOCK_SIZE | Sets the block size of the stream frame. TLAST is asserted once every BLOCK_SIZE cycles. Also assigned to the block_size_bram input in the CSR module.
0xC | [11:0] NITER | W | DLBF_DATA_REG_OFFSET_NITER | Sets the number of iterations of the data to go through. If this is set to 0, data is transmitted to the AI Engine forever. Also assigned to the niter_bram input in the CSR module. The bare-metal host applications set this register to 4.
0x10 | [15:0] ROLLOVER_ADDR | W | DLBF_DATA_REG_OFFSET_ROLLOVER | When the BRAM address reaches the rollover address, it resets to 0. Also assigned to the rollover_addr_bram input in the CSR module.
0x20 | [0] MASTER_DONE | R | DLBF_DATA_REG_OFFSET_DONE | When this status register is 1, the data master is done sending data to the AI Engine. Also assigned to the m0_done_bram input in the CSR module.
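To make the register map concrete, here is a minimal sketch of how a host could drive one data master through its CSR block. The offsets and bit positions come from the table above. The helper names (writeReg, readReg, startMaster, masterDone), the pointer-based register access, and the order of the writes are assumptions for illustration and are not taken from the tutorial sources.

```cpp
#include <cassert>
#include <cstdint>

// Offsets from the table above; the names mirror the params.h defines.
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_RESET      = 0x4;
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_START      = 0x4;  // same register as RESET
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_BLOCK_SIZE = 0x8;
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_NITER      = 0xC;
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_ROLLOVER   = 0x10;
constexpr std::uint32_t DLBF_DATA_REG_OFFSET_DONE       = 0x20;

// Hypothetical bit masks derived from the "Bits" column.
constexpr std::uint32_t RESET_BIT = 1u << 0;
constexpr std::uint32_t GO_BIT    = 1u << 4;

// 'csr' points at the start of one data master's CSR block.
inline void writeReg(volatile std::uint32_t* csr, std::uint32_t offset, std::uint32_t value)
{
    csr[offset / 4] = value;  // byte offset to 32-bit word index
}

inline std::uint32_t readReg(const volatile std::uint32_t* csr, std::uint32_t offset)
{
    return csr[offset / 4];
}

// One plausible start sequence (assumed order): pulse reset, configure, then go.
void startMaster(volatile std::uint32_t* csr, std::uint32_t blockSize,
                 std::uint32_t niter, std::uint32_t rolloverAddr)
{
    assert(csr != nullptr);
    writeReg(csr, DLBF_DATA_REG_OFFSET_RESET, RESET_BIT);  // assert reset
    writeReg(csr, DLBF_DATA_REG_OFFSET_RESET, 0);          // deassert reset
    writeReg(csr, DLBF_DATA_REG_OFFSET_BLOCK_SIZE, blockSize & 0xFFFu);   // [11:0]
    writeReg(csr, DLBF_DATA_REG_OFFSET_NITER, niter & 0xFFFu);            // [11:0]
    writeReg(csr, DLBF_DATA_REG_OFFSET_ROLLOVER, rolloverAddr & 0xFFFFu); // [15:0]
    writeReg(csr, DLBF_DATA_REG_OFFSET_START, GO_BIT);     // GO=1, RESET stays 0
}

// True when MASTER_DONE (bit 0 of the register at 0x20) is set.
bool masterDone(const volatile std::uint32_t* csr)
{
    return (readReg(csr, DLBF_DATA_REG_OFFSET_DONE) & 1u) != 0;
}
```

Because RESET and GO share the register at offset 0x4, the final write sets GO while leaving RESET deasserted. On hardware, csr would be the address computed from the kernel's xparameter base address rather than an ordinary array.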

All the PL master kernels (dlbf_data, dlbf_coeffs, ulbf_data, and ulbf_coeffs) contain multiple PL data masters (BRAMs). The dlbf_data and dlbf_coeffs kernels have four data masters each. The ulbf_data and ulbf_coeffs kernels have eight data masters each. Each data master has its own set of CSR registers. The PS host application can access a register of a dlbf_data data master by summing the dlbf_data xparameter (base address), the data master's offset, the CSR offset, and the CSR register offset.

The following table lists the offsets of the dlbf_data data masters and the CSR offset:

Register Address Map

Register Space Offset | Bits and Name | R/W | Global Defines (params.h) | Description
0x0000_0000 | -- | R | DLBF_DATA_RAM0_OFFSET | Master 0 data offset.
0x0010_0000 | -- | R | DLBF_DATA_RAM1_OFFSET | Master 1 data offset.
0x0020_0000 | -- | R | DLBF_DATA_RAM2_OFFSET | Master 2 data offset.
0x0030_0000 | -- | R | DLBF_DATA_RAM3_OFFSET | Master 3 data offset.
0x0008_0000 | -- | R | DLBF_DATA_CSR_OFFSET | CSR offset.

For example, if the PS host application wants to write to the RESET register of data master 0 in the dlbf_data_00 PL kernel, it must write to the following address:

RESET0_ADDR = XPAR_DLBF_DATA_00 + DLBF_DATA_RAM0_OFFSET + DLBF_DATA_CSR_OFFSET + DLBF_DATA_REG_OFFSET_RESET
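The address arithmetic above can be sketched as a small helper. The offset values come from the tables in this section. The array DLBF_DATA_RAM_OFFSETS (the four DLBF_DATA_RAMn_OFFSET values gathered into one array) and the function dlbfDataRegAddr are hypothetical names introduced for the example, and the kernel base address must come from xparameters.h.

```cpp
#include <cassert>
#include <cstdint>

// Data master offsets (DLBF_DATA_RAM0_OFFSET .. DLBF_DATA_RAM3_OFFSET),
// collected into a hypothetical array for indexing by master number.
constexpr std::uint64_t DLBF_DATA_RAM_OFFSETS[4] = {
    0x00000000, 0x00100000, 0x00200000, 0x00300000
};
constexpr std::uint64_t DLBF_DATA_CSR_OFFSET       = 0x00080000;
constexpr std::uint64_t DLBF_DATA_REG_OFFSET_RESET = 0x4;

// Computes: kernel base + data master offset + CSR offset + register offset.
// 'kernelBase' stands in for an XPAR_* base address from xparameters.h.
inline std::uint64_t dlbfDataRegAddr(std::uint64_t kernelBase,
                                     unsigned master,
                                     std::uint64_t regOffset)
{
    assert(master < 4);  // dlbf_data has four data masters
    return kernelBase + DLBF_DATA_RAM_OFFSETS[master] + DLBF_DATA_CSR_OFFSET + regOffset;
}
```

For example, with a placeholder base address of 0xA4000000 (chosen only for illustration), the RESET register of data master 0 would be at 0xA4080004 and that of data master 3 at 0xA4380004.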

The other PL master kernels (dlbf_coeffs, ulbf_data, and ulbf_coeffs) have similar register address mappings.

The control and status registers of the dlbf_slave PL kernel are shown in the following table.

Register Address Map

Register Space Offset | Bits and Name | R/W | Global Defines (params.h) | Description
0x0 | [31:0] ID | R | DLBF_SLAVE_REG_OFFSET_ID | 32-bit ID register.
0x4 | [0] RESET | W | DLBF_SLAVE_REG_OFFSET_RESET | 1: assert, 0: deassert. Also assigned to the slave_rst_bram input in the CSR module.
0xC | [11:0] NITER | W | DLBF_SLAVE_REG_OFFSET_NITER | Sets the number of iterations of the data to go through. If this is set to 0, data will be transmitted to the AI Engine forever. Also assigned to the niter_bram input in the CSR module. The main_partial.cpp application sets this to 4. The value set by main_full.cpp is TODO.
0x20 | [0] SLAVE_DONE | R | DLBF_SLAVE_REG_OFFSET_DONE | When this status register is 1, the RAM slave is done receiving data from the AI Engine. Also assigned to the rxdone_bram input in the CSR module.

Each data slave PL kernel (dlbf_slave and ulbf_slave) contains only one RAM slave (URAM). The PS host application can access the registers of a RAM slave by adding the CSR offset (0x0008_0000) to the CSR register offset. For example, to access the NITER register, write to the following address:

NITER_ADDR = DLBF_SLAVE_CSR_OFFSET + DLBF_SLAVE_REG_OFFSET_NITER
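As a sketch of how the slave registers might be used, the example below computes NITER_ADDR from the offsets given in this section and polls SLAVE_DONE a bounded number of times. The function waitSlaveDone and its polling limit are hypothetical, and the pointer is assumed to address the start of the slave's CSR block (that is, the kernel base address plus DLBF_SLAVE_CSR_OFFSET).

```cpp
#include <cassert>
#include <cstdint>

// Offsets from the text and table above; names mirror the params.h defines.
constexpr std::uint32_t DLBF_SLAVE_CSR_OFFSET       = 0x00080000;
constexpr std::uint32_t DLBF_SLAVE_REG_OFFSET_NITER = 0xC;
constexpr std::uint32_t DLBF_SLAVE_REG_OFFSET_DONE  = 0x20;

// Offset of the NITER register relative to the slave kernel.
constexpr std::uint32_t NITER_ADDR = DLBF_SLAVE_CSR_OFFSET + DLBF_SLAVE_REG_OFFSET_NITER;
static_assert(NITER_ADDR == 0x0008000C, "CSR offset plus NITER register offset");

// Hypothetical helper: polls SLAVE_DONE (bit 0 at offset 0x20) up to
// 'maxPolls' times. Returns true as soon as the bit is set, false on timeout.
// 'csr' points at the start of the slave's CSR block.
bool waitSlaveDone(const volatile std::uint32_t* csr, unsigned maxPolls)
{
    assert(csr != nullptr);
    for (unsigned i = 0; i < maxPolls; ++i) {
        if ((csr[DLBF_SLAVE_REG_OFFSET_DONE / 4] & 1u) != 0) {
            return true;
        }
    }
    return false;
}
```

Bounding the poll count is a design choice for the sketch: it lets the caller report a timeout instead of hanging if the RAM slave never finishes.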

The ulbf_slave PL kernel has the same register address mapping, and its CSR registers are accessed in the same way.