It can be defined as the average number of bytes produced (or consumed) per second:
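As a sketch of that definition in symbols (the notation is introduced here for illustration and is not from the original text): $N_{\text{bytes}}$ is the number of bytes transferred through the port, $N_{\text{cycles}}$ is the cycle count read back from the profiling counter, and $T_{\text{clk}}$ is the clock period. The host code in this section multiplies the cycle count by 0.8, which is assumed here to be the clock period in nanoseconds.

```latex
\text{throughput}
  = \frac{N_{\text{bytes}}}{t_{\text{elapsed}}}
  = \frac{N_{\text{bytes}}}{N_{\text{cycles}} \cdot T_{\text{clk}}}
```

With $T_{\text{clk}}$ expressed in microseconds, the result is in bytes per microsecond, which is numerically the same as MB/s.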
To profile the design and calculate the port throughput, add the profiling APIs to the host code. The code changes needed to profile the design for port throughput calculation are available in Hardware/src/host_PortTP.cpp. You can either make the changes in sw/host.cpp manually by referring to Hardware/src/host_PortTP.cpp, or replace sw/host.cpp with it. Make sure to back up the file before replacing it.

In the
Hardware/src/host_PortTP.cpp, the changes to profile the design are summarized as follows:

a. Notice that host.cpp contains only native XRT APIs; no ADF APIs are used. For example, a graph handle is created from the uuid of the xclbin, and the graph details are extracted using the xrt::graph API.

    auto cghdl = xrt::graph(device,xclbin_uuid,"mygraph");
b. Also note that the graph run and end commands use the graph handle.

    cghdl.run(NIterations);
    cghdl.end();
c. The Hardware/src/host_PortTP.cpp file contains the ADF APIs. This change from native XRT APIs to ADF APIs is required to profile the AI Engine design.

    adf::registerXRT(dhdl, xclbin_uuid.get());
    std::cout << "Register XRT" << std::endl;
    const int buffer_sizeIn_bytes = 512;
    event::handle handle = event::start_profiling(mygraph.out0,
                                                  event::io_stream_start_to_bytes_transferred_cycles,
                                                  buffer_sizeIn_bytes * NIterations);
    if (handle == event::invalid_handle) {
        printf("ERROR:Invalid handle. Only two performance counter in a AIE-PL interface tile\n");
        return 1;
    }
    mygraph.run(NIterations);
    mygraph.end();
    ...
    ...
    s2mm_1_rhdl.wait();
    long long cycle_count = event::read_profiling(handle);
    std::cout << "cycle count:" << cycle_count << std::endl;
    event::stop_profiling(handle); // Performance counter is released and cleared
    // cycle_count * 0.8 * 1e-3 is the elapsed time in microseconds,
    // so the result is in bytes per microsecond, that is, MB/s.
    double throughput = (double)buffer_sizeIn_bytes * NIterations / (cycle_count * 0.8 * 1e-3);
    std::cout << "Throughput of the graph: " << throughput << " MB/s" << std::endl;
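The throughput arithmetic in the last two lines of that listing can be isolated into a small helper, sketched below. The function name `throughput_mb_per_s` and its parameters are hypothetical and do not appear in host_PortTP.cpp; the default of 0.8 ns per cycle is an assumption taken from the constant used in the host code.

```cpp
#include <cstdint>

// Hypothetical helper (not part of host_PortTP.cpp): converts a byte count
// and a profiling cycle count into a throughput in MB/s.
// Assumption: clock_period_ns defaults to 0.8, the constant used in the host code.
double throughput_mb_per_s(std::uint64_t bytes, long long cycles,
                           double clock_period_ns = 0.8) {
    if (cycles <= 0) {
        return 0.0;  // no valid measurement; avoid dividing by zero
    }
    // cycles * ns-per-cycle gives nanoseconds; * 1e-3 converts to microseconds.
    const double elapsed_us = static_cast<double>(cycles) * clock_period_ns * 1e-3;
    // Bytes per microsecond is numerically equal to MB/s (1 MB = 1e6 bytes).
    return static_cast<double>(bytes) / elapsed_us;
}
```

For example, `throughput_mb_per_s(512 * 7, 2965)` evaluates to roughly 1510.96. The iteration count of 7 is inferred from the console output shown in this section (3584 bytes over 2965 cycles); it is not stated in the text.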
Also, make the necessary changes to the Makefile so that the host code compiles and links successfully now that the ADF APIs are included. It is recommended to replace the Makefile with Makefile.host_profile. Make sure to back up the original file before replacing it.

Run
make host and make package TARGET=hw to generate the modified hardware image, sd_card.img.

Program the device with the new hardware image, and observe the following messages in the Linux console, which print the throughput of the port out0:

    run mm2s
    run s2mm
    Register XRT
    graph run
    graph end
    After MM2S wait
    After S2MM_1 wait
    cycle count:2965
    Throughput of the graph: 1510.96 MB/s
Note that this throughput value matches the values obtained during AIE simulation and hardware emulation.