To estimate design performance from the AI Engine simulation, you need to analyze the profile results carefully. This section walks you through the profile views most commonly used to assess how your kernel is performing overall.
Refer to Section 4, Enabling the Profile and Trace Options, to understand how to enable profiling in the Vitis IDE.
After running the AI Engine simulation, open the profile analysis view: aie_component -> AIE SIMULATOR/HARDWARE -> Run-aie_component -> Profile.
On the landing page, click the Summary for each tile to observe the cycle count, instruction count, and program memory.
Next, under Function Reports, click Total Function Time to display the function table at the bottom of the view. The table shows the following:
The data_shuffle kernel function took 2,303 cycles for seven iterations, that is, ~329 cycles per iteration, which is the Avg Function Time.
The main function is added by the compiler and is different from the main() function in the graph.cpp file. This function took 99,749 cycles in total, which includes the time to transfer control back and forth between graph iterations, lock stalls, and so on.
_main_init runs once for all graph iterations and took 26 cycles.
The _cxa_finalize function took 43 cycles to call the destructors of the global C++ objects.
The _fini function executes the program-terminating instructions and took 24 cycles.
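The per-iteration average quoted above can be reproduced by hand. The following is a minimal sketch, not part of the tool: it assumes the Avg Function Time is the total cycle count divided by the number of calls, and that _main_init, _cxa_finalize, and _fini each run once per program. The dictionary and helper names are illustrative only.

```python
# Cycle counts copied from the Total Function Time table described above.
reported_cycles = {
    "data_shuffle": 2303,   # total over all graph iterations
    "main": 99749,
    "_main_init": 26,
    "_cxa_finalize": 43,
    "_fini": 24,
}
GRAPH_ITERATIONS = 7  # the kernel ran seven times in this simulation


def avg_function_time(total_cycles, calls):
    """Average cycles per call (assumed definition of Avg Function Time)."""
    return total_cycles / calls


kernel_avg = avg_function_time(reported_cycles["data_shuffle"], GRAPH_ITERATIONS)

# Functions assumed to run once per program rather than once per iteration.
one_time_cycles = sum(
    reported_cycles[name] for name in ("_main_init", "_cxa_finalize", "_fini")
)

print(f"data_shuffle: {kernel_avg:.0f} cycles per iteration")
print(f"one-time startup/teardown: {one_time_cycles} cycles")
```

Keeping the once-per-program functions separate from the kernel makes it easier to see that they contribute only a small, fixed number of cycles regardless of how many iterations the graph runs.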
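If you click the AI Engine Simulation Summary, you can see that the AI Engine Frequency corresponds to a clock period of 0.8 ns per cycle (equivalently, a 1.25 GHz clock). The data_shuffle function therefore took 329 * 0.8 = 263.2 ns per iteration, or roughly 264 ns.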
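The cycle-to-time conversion can be sketched as follows. This is an illustrative calculation only: it assumes the 0.8 ns figure is the clock period (time per cycle), and the function names are hypothetical helpers, not part of any Vitis API.

```python
CLOCK_PERIOD_NS = 0.8          # time per cycle, from the simulation summary
AVG_KERNEL_CYCLES = 2303 / 7   # 329 cycles per iteration, from the profile


def cycles_to_ns(cycles, period_ns=CLOCK_PERIOD_NS):
    """Convert a cycle count to nanoseconds for a given clock period."""
    return cycles * period_ns


def period_to_mhz(period_ns):
    """Clock frequency in MHz for a period given in nanoseconds."""
    return 1e3 / period_ns


kernel_ns = cycles_to_ns(AVG_KERNEL_CYCLES)
print(f"clock: {period_to_mhz(CLOCK_PERIOD_NS):.0f} MHz")
print(f"data_shuffle: {kernel_ns:.1f} ns per iteration")
```

The same helper can be applied to any other row of the function table to express its cycle count in nanoseconds.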
Try to match these values with the trace information. Click Trace, zoom in to the period of one iteration (between two main() function calls), then add a marker and drag it to the end of the kernel function.
The difference between the start time and end time of the kernel function for one iteration matches the 264 ns seen in the profiling results.
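If you want to make this cross-check repeatable, a small sketch like the one below compares a trace measurement against the profile estimate. The marker timestamps are hypothetical placeholders (read your own values from the Trace view), and the two-cycle tolerance is an arbitrary assumption to absorb rounding and marker placement.

```python
def matches_profile(trace_start_ns, trace_end_ns, profile_cycles,
                    period_ns=0.8, tolerance_cycles=2):
    """Return True if the trace duration agrees with the profile estimate.

    The tolerance is expressed in clock cycles and converted to nanoseconds.
    """
    measured_ns = trace_end_ns - trace_start_ns
    expected_ns = profile_cycles * period_ns
    return abs(measured_ns - expected_ns) <= tolerance_cycles * period_ns


# Hypothetical marker readings for one kernel iteration, in nanoseconds.
start_ns, end_ns = 1000.0, 1264.0
print(matches_profile(start_ns, end_ns, profile_cycles=329))
```

A mismatch larger than a few cycles would suggest that the markers do not bracket exactly one kernel invocation, or that stalls are being counted in one view and not the other.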