Step 3 - Asynchronous GMIO Transfer and Hardware Flow

Vitis Tutorials: AI Engine Development

Document ID: XD100
Release Date: 2022-12-01
Version: 2022.2 English

In this step, we will see how to transfer output data asynchronously with the non-blocking GMIO API and how to use GMIO::wait to synchronize the data. We will also see how to run the AI Engine program with GMIO in hardware.

Change the working directory to single_aie_gmio/step3 and examine aie/graph.cpp. The main difference in the code is as follows:

gr.gmioIn.gm2aie_nb(dinArray,BLOCK_SIZE_in_Bytes); // Transfer all blocks of input data at once
gr.run(ITERATION);
gr.gmioOut.aie2gm_nb(doutArray,BLOCK_SIZE_in_Bytes); // Transfer all blocks of output data at once
// The PS can do other tasks here while the data is being transferred
gr.gmioOut.wait();

Note: gr.gmioOut.aie2gm_nb() returns immediately after it is called, without waiting for the data transfer to complete. The PS can perform other tasks after the non-blocking API call while the data is being transferred. It must then call gr.gmioOut.wait() to synchronize the data. After GMIO::wait returns, the output data is in memory and can be processed by the host application.
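The ordering described in the note (issue the transfer, do other work, wait, then read) can be tried without the AI Engine tools. The following is a minimal host-only sketch, not the ADF API: MockGmioOut, transfer_nb, and run_demo are hypothetical names introduced for illustration. The mock is single-threaded and defers the copy until wait() is called, so the stale state before synchronization shows up deterministically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an output GMIO port. This is NOT the ADF API;
// it only models the contract "the non-blocking call returns before the
// data has landed, and wait() returns after it has".
class MockGmioOut {
public:
    // Records the transfer and returns immediately without copying anything.
    void transfer_nb(std::int32_t* dst, const std::int32_t* src, std::size_t bytes) {
        assert(dst != nullptr && src != nullptr);
        pending_.push_back([dst, src, bytes] { std::memcpy(dst, src, bytes); });
    }

    // Completes every outstanding transfer, then returns.
    void wait() {
        for (auto& transfer : pending_) transfer();
        pending_.clear();
    }

private:
    std::vector<std::function<void()>> pending_;
};

struct DemoResult {
    std::int64_t sum_before_wait;  // what the host sees if it reads too early
    std::int64_t sum_after_wait;   // what the host sees after synchronization
};

DemoResult run_demo() {
    constexpr std::size_t kSamples = 256;

    // Data that the (mock) graph has produced: 0, 1, ..., 255.
    std::vector<std::int32_t> produced(kSamples);
    std::iota(produced.begin(), produced.end(), 0);

    // Host-side output buffer, initially all zeros.
    std::vector<std::int32_t> dout(kSamples, 0);

    MockGmioOut gmioOut;
    gmioOut.transfer_nb(dout.data(), produced.data(), kSamples * sizeof(std::int32_t));

    // Other host work would go here. Reading dout at this point is done
    // only to make the stale state visible in the mock.
    const std::int64_t before = std::accumulate(dout.begin(), dout.end(), std::int64_t{0});

    gmioOut.wait();  // synchronization point: dout is valid from here on

    const std::int64_t after = std::accumulate(dout.begin(), dout.end(), std::int64_t{0});
    return {before, after};
}
```

In this mock, run_demo() reports a sum of 0 before wait() and 32640 (the sum of 0 through 255) after it. On real hardware the buffer contents before GMIO::wait returns should be treated as undefined, so the host should not read them.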

To make GMIO work in the hardware flow, the following code needs to be added to the main function before graph execution and GMIO data transfer:

#if !defined(__AIESIM__) && !defined(__X86SIM__)
	#include "adf/adf_api/XRTConfig.h"
	#include "experimental/xrt_kernel.h"
	// Create XRT device handle for ADF API
	char* xclbinFilename = argv[1];
	auto dhdl = xrtDeviceOpen(0);//device index=0
	xrtDeviceLoadXclbinFile(dhdl,xclbinFilename);
	xuid_t uuid;
	xrtDeviceGetXclbinUUID(dhdl, uuid);
	adf::registerXRT(dhdl, uuid);
#endif

The macro __AIESIM__ is automatically defined by the tool when running the AI Engine simulator, and the macro __X86SIM__ is automatically defined by the tool when running the x86 simulator. Because of the guard on __AIESIM__ and __X86SIM__ shown in the previous code, this part of the code is excluded from both simulators and is compiled only for the hardware and hardware emulation flows, which need it to work correctly.
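To see which branch a given build selects, the guard can be isolated in a small helper. This is a sketch under assumptions: selected_flow and needs_xrt_registration are hypothetical helper names, and only the macro names __AIESIM__ and __X86SIM__ come from the tutorial.

```cpp
#include <cassert>
#include <string>

// Reports which code path the preprocessor selected for this build.
// __AIESIM__ and __X86SIM__ are defined by the tools for the two simulators;
// a build that defines neither takes the XRT path.
inline std::string selected_flow() {
#if defined(__AIESIM__)
    return "aiesimulator";
#elif defined(__X86SIM__)
    return "x86simulator";
#else
    return "xrt";  // hardware and hardware emulation
#endif
}

// Uses the same guard as the tutorial code: true only when neither
// simulator macro is defined.
inline bool needs_xrt_registration() {
#if !defined(__AIESIM__) && !defined(__X86SIM__)
    const bool needed = true;
#else
    const bool needed = false;
#endif
    // Both views of the same macros must agree.
    assert(needed == (selected_flow() == "xrt"));
    return needed;
}
```

Compiling this with a plain host compiler selects the "xrt" branch. Passing -D__X86SIM__ or -D__AIESIM__ on the command line imitates what the simulator builds do and selects the corresponding branch instead.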

At the end of the program, close the device using the XRT API xrtDeviceClose().

#if !defined(__AIESIM__) && !defined(__X86SIM__)
	xrtDeviceClose(dhdl);
#endif
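If main has several exit paths, one way to make sure the device is always closed is to tie the close call to scope exit. The sketch below is host-only and does not use XRT: mock_device_open and mock_device_close are hypothetical stand-ins for xrtDeviceOpen() and xrtDeviceClose(), and it assumes the device handle is pointer-like.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for xrtDeviceOpen()/xrtDeviceClose(); NOT XRT itself.
using MockDeviceHandle = void*;

static int g_open_devices = 0;  // number of handles currently open
static int g_device_slot = 0;   // dummy object for the handle to point at

MockDeviceHandle mock_device_open(unsigned /*index*/) {
    ++g_open_devices;
    return &g_device_slot;
}

int mock_device_close(MockDeviceHandle /*handle*/) {
    --g_open_devices;
    return 0;
}

// Deleter that closes the device when the owning unique_ptr goes out of scope.
struct DeviceCloser {
    void operator()(void* handle) const {
        if (handle != nullptr) mock_device_close(handle);
    }
};
using ScopedDevice = std::unique_ptr<void, DeviceCloser>;

// Opens the device, optionally returns early, and relies on ScopedDevice
// to close the handle on every path.
int run_with_device(bool fail_early) {
    ScopedDevice device(mock_device_open(0));
    assert(g_open_devices == 1);

    if (fail_early) {
        return 1;  // device is still closed by the ScopedDevice destructor
    }

    // Graph execution and GMIO transfers would go here.
    return 0;
}

int open_devices() { return g_open_devices; }
```

After either call, run_with_device(true) or run_with_device(false), open_devices() is back to 0, because the deleter runs on both the early return and the normal return.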