In this step, we will see how to transfer output data asynchronously with the non-blocking GMIO API, and how to use GMIO::wait
to perform data synchronization. In addition, we will see how to run the AI Engine program with GMIO in hardware.
Change the working directory to single_aie_gmio/step3. Examine aie/graph.cpp. The main difference in the code is as follows:
gr.gmioIn.gm2aie_nb(dinArray,BLOCK_SIZE_in_Bytes);// Non-blocking: start the input transfer for all blocks at once
gr.run(ITERATION);
gr.gmioOut.aie2gm_nb(doutArray,BLOCK_SIZE_in_Bytes);// Non-blocking: start the output transfer for all blocks at once
// The PS can do other tasks here while the data is being transferred
gr.gmioOut.wait();// Block until the output transfer has completed
Note: gr.gmioOut.aie2gm_nb() returns immediately after it is called, without waiting for the data transfer to complete. The PS can do other tasks after the non-blocking API call while the data is being transferred. It then needs to call gr.gmioOut.wait(); to synchronize the data. After GMIO::wait returns, the output data is in memory and can be processed by the host application.
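The ordering rule in the note above (start the transfer, do other work, then wait before reading) can be illustrated with a small host-only mock. This is a sketch under stated assumptions, not the ADF API: MockOutputPort, transfer_nb, outstanding, and output_valid_only_after_wait are hypothetical names, and the mock is single-threaded. It deliberately completes the copy only inside wait(), which models the conservative case where no output has arrived before synchronization.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, single-threaded stand-in for an output GMIO port.
// It is NOT the ADF API; it only models "request now, complete by wait()".
class MockOutputPort {
public:
    explicit MockOutputPort(std::vector<int32_t> produced)
        : produced_(std::move(produced)) {}

    // Analogue of aie2gm_nb(): records the request and returns immediately.
    void transfer_nb(int32_t* dst, std::size_t size_in_bytes) {
        pending_.push_back([this, dst, size_in_bytes] {
            // Never copy more than the mock actually produced.
            const std::size_t n =
                std::min(size_in_bytes, produced_.size() * sizeof(int32_t));
            std::memcpy(dst, produced_.data(), n);
        });
    }

    // Analogue of wait(): returns only after every recorded transfer is done.
    void wait() {
        for (auto& job : pending_) job();
        pending_.clear();
    }

    // Number of transfers that were requested but not yet synchronized.
    std::size_t outstanding() const { return pending_.size(); }

private:
    std::vector<int32_t> produced_;
    std::vector<std::function<void()>> pending_;
};

// Returns true if dst holds the produced data only after wait() has returned.
bool output_valid_only_after_wait() {
    const std::vector<int32_t> produced{1, 2, 3, 4};
    std::vector<int32_t> dst(produced.size(), 0);

    MockOutputPort port(produced);
    port.transfer_nb(dst.data(), dst.size() * sizeof(int32_t));

    // In this mock, nothing has been written to dst before wait().
    const bool untouched_before = (dst == std::vector<int32_t>(produced.size(), 0));

    port.wait();  // synchronization point

    return untouched_before && dst == produced && port.outstanding() == 0;
}
```

On real hardware some or all of the data may already have arrived before the wait call; the point of the mock is only that the host should not rely on the buffer contents until the wait has returned.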
To make GMIO work in the hardware flow, the following code needs to be added to the main function before graph execution and GMIO data transfer:
#if !defined(__AIESIM__) && !defined(__X86SIM__)
#include "adf/adf_api/XRTConfig.h"
#include "experimental/xrt_kernel.h"
// Create XRT device handle for ADF API
char* xclbinFilename = argv[1];
auto dhdl = xrtDeviceOpen(0);//device index=0
xrtDeviceLoadXclbinFile(dhdl,xclbinFilename);
xuid_t uuid;
xrtDeviceGetXclbinUUID(dhdl, uuid);
adf::registerXRT(dhdl, uuid);
#endif
The macro __AIESIM__ is automatically defined by the tool when running the AI Engine simulator. The macro __X86SIM__ is automatically defined by the tool when running the x86 simulator. With the guard macros __AIESIM__ and __X86SIM__ used as shown in the previous code, this part of the code is excluded when running the AI Engine simulator or the x86 simulator, but is included in the hardware flow and the hardware emulation flow, where it is needed for those flows to work correctly.
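To see which branch such a guard selects for a given build, the same preprocessor condition can be wrapped in small helper functions. This is a minimal sketch: xrt_setup_compiled_in and target_name are hypothetical helpers introduced for illustration, and only the two macro names come from the text above.

```cpp
#include <cassert>
#include <string>

// True only when neither simulator macro is defined, that is, in the builds
// where the XRT setup code guarded by the same condition is compiled in.
constexpr bool xrt_setup_compiled_in() {
#if !defined(__AIESIM__) && !defined(__X86SIM__)
    return true;
#else
    return false;
#endif
}

// Human-readable name of the target implied by the guard macros.
std::string target_name() {
#if defined(__AIESIM__)
    return "AI Engine simulator";
#elif defined(__X86SIM__)
    return "x86 simulator";
#else
    return "hardware or hardware emulation";
#endif
}
```

Compiled with an ordinary host compiler and neither macro defined, both helpers report the hardware branch; defining either macro on the command line (for example with -D__X86SIM__) switches them to the corresponding simulator branch.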
At the end of the program, close the device using the XRT API xrtDeviceClose().
#if !defined(__AIESIM__) && !defined(__X86SIM__)
xrtDeviceClose(dhdl);
#endif
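If main has several exit paths, one optional way to make sure the close call always runs is to tie it to scope exit. The sketch below shows that pattern with stand-ins so that it builds without XRT installed: fakeDeviceOpen, fakeDeviceClose, DeviceCloser, DeviceGuard, and open_devices_after_scope are all hypothetical names, and the fake functions only count open handles. It assumes the device handle is an opaque pointer; in a real program the deleter would call xrtDeviceClose instead.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for the device open/close calls (NOT the XRT API).
using FakeDeviceHandle = void*;
static int g_open_devices = 0;  // number of handles currently open
static int g_slot = 0;          // its address serves as a dummy non-null handle

FakeDeviceHandle fakeDeviceOpen(unsigned /*index*/) {
    ++g_open_devices;
    return &g_slot;
}

int fakeDeviceClose(FakeDeviceHandle /*handle*/) {
    --g_open_devices;
    return 0;
}

// Deleter that closes the device when the guard goes out of scope.
struct DeviceCloser {
    void operator()(void* handle) const {
        if (handle) fakeDeviceClose(handle);
    }
};
using DeviceGuard = std::unique_ptr<void, DeviceCloser>;

// Opens a device inside a scope and reports how many remain open afterwards.
int open_devices_after_scope() {
    {
        DeviceGuard dev(fakeDeviceOpen(0));
        assert(g_open_devices == 1);  // device is open inside the scope
        // ... graph execution and GMIO transfers would go here ...
    }  // DeviceCloser runs here, on every exit path from the scope
    return g_open_devices;
}
```

The explicit close call shown in the tutorial is sufficient for a program with a single exit path; the guard only becomes useful once early returns or exceptions are involved.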