To estimate design performance during AI Engine simulation, you need to analyze the profile results carefully. This section walks you through the profile reports most commonly used to assess how your kernel is performing overall.
Refer to Section 4, Enabling the Profile and Trace Options, to understand how to enable profiling in the Vitis IDE.
After running the AI Engine simulation, open the profile analysis view -> aie_component -> AIE SIMULATOR/HARDWARE -> Run-aie_component -> Profile.
On the landing page, click the Summary corresponding to each tile to see its cycle count, instruction count, and program memory.
Now, under Function Reports, click Total Function Time to observe the following table at the bottom for the data_shuffle kernel function.
The data_shuffle kernel function took 2,303 cycles for seven iterations, i.e., ~329 cycles for one iteration, which is the Avg Function Time.
The main function is added by the compiler and is different from the main() function in the graph.cpp file. This function took 99,749 cycles in total, which includes the time to transfer control back and forth between each graph iteration, lock stalls, etc.
The _main_init function runs once for all graph iterations, and it took 26 cycles.
The _cxa_finalize function took 43 cycles to call the destructors of the global C++ objects.
The _fini function executes the program-terminating instructions, and it took 24 cycles.
If you click the AI Engine Simulation Summary, you can see that the AI Engine Frequency is 1250 MHz, i.e., 1 cycle = 0.8 ns.
The data_shuffle function took 329 cycles for 1 iteration, i.e., 329 * 0.8 ~= 264 ns.
Try to match these values with the trace information. Click Trace, zoom in to the period of one iteration (between two main() function calls as follows), add a marker, and drag it to the end of the kernel function.
The difference between the start time and the end time of the kernel function for one iteration matches the 264 ns seen in the profiling results.
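The cycle-to-time arithmetic above can be sketched as a few small C++ helpers. This is only an illustration under stated assumptions: the function names (avg_cycles_per_iteration, cycle_period_ns, cycles_to_ns) are hypothetical and are not part of the Vitis or AI Engine APIs. The inputs are the values read from the profile report (2,303 cycles, seven iterations) and from the simulation summary (1250 MHz).

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of any Vitis/AI Engine API.

// Average cycles per graph iteration, e.g. 2303 total cycles / 7 iterations.
constexpr double avg_cycles_per_iteration(std::uint64_t total_cycles,
                                          std::uint64_t iterations) {
    return static_cast<double>(total_cycles) / static_cast<double>(iterations);
}

// Clock period in nanoseconds for a frequency given in MHz.
// 1 MHz corresponds to a 1000 ns period, so period_ns = 1000 / f_MHz.
constexpr double cycle_period_ns(double frequency_mhz) {
    return 1000.0 / frequency_mhz;
}

// Convert a cycle count to nanoseconds at the given clock frequency.
constexpr double cycles_to_ns(double cycles, double frequency_mhz) {
    return cycles * cycle_period_ns(frequency_mhz);
}

// Compile-time check against the profile numbers quoted in this section:
// 2303 cycles over 7 iterations is exactly 329 cycles per iteration.
static_assert(avg_cycles_per_iteration(2303, 7) == 329.0,
              "expected 329 cycles per iteration");
```

With the values from this section, cycles_to_ns(avg_cycles_per_iteration(2303, 7), 1250.0) evaluates to about 263.2 ns, which is within 1 ns of the ~264 ns figure read from the trace view.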