Profiling Using Performance Counters

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

Document ID: UG1076
Release Date: 2020-11-24
Version: 2020.2 English

You can compile your AI Engine design with performance counters that can be read and collected at run time while the design is executing in hardware. These counters are programmed in the hardware to gather the following statistics for each active AI Engine in your design:

  1. Active Cycles – the total number of clock cycles during which a tile has been active
  2. Stall Cycles – the total number of clock cycles during which a tile has been stalled for one of four reasons: memory, stream, cascade, or lock

To use this feature, you must enable it at compile time and also turn it on at run time. To compile the performance counters into your design, use the following option.
aiecompiler --aie-heat-map

Once the counters are compiled into your design, you can turn on their capture at run time by adding the following settings to an xrt.ini file.

[Debug]
aie_profile = true
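
If the xrt.ini file is produced by a script rather than edited by hand, the same two lines can be generated programmatically. The following is a minimal sketch using Python's standard configparser module. The function name build_xrt_ini is hypothetical, and where the file must be placed (commonly alongside the host executable) is an assumption to verify for your setup.

```python
import configparser
import io


def build_xrt_ini(enable_profile: bool = True) -> str:
    """Return xrt.ini text with the aie_profile setting from this section.

    Hypothetical helper: only the [Debug] section and the aie_profile key
    come from the guide; everything else is illustrative.
    """
    config = configparser.ConfigParser()
    # Keep key names exactly as written instead of lowercasing them.
    config.optionxform = str
    config["Debug"] = {"aie_profile": "true" if enable_profile else "false"}

    buffer = io.StringIO()
    config.write(buffer)  # emits "[Debug]" followed by "aie_profile = true"
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the generated text; redirect it to xrt.ini in the run directory
    # (assumed to be the directory of the host executable).
    print(build_xrt_ini(), end="")
```

Passing enable_profile=False writes aie_profile = false, which keeps the compiled-in counters in place but leaves run-time capture off.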

You can then view and analyze the data in the Vitis analyzer in several ways, including as a heat map, a histogram, and a profile summary. Analyzing this profile helps you determine the active and stall times associated with each AI Engine, and pinpoint any AI Engine whose performance might not be optimal as the design runs on hardware. The following sections describe in more detail the two profile views supported in the Vitis analyzer.
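
To illustrate how the two counters can be combined to pinpoint a poorly performing AI Engine, the sketch below ranks tiles by the ratio of stall cycles to active cycles. It makes several assumptions: the TileCounters record, the tile names, and the counter values are all hypothetical (they are not measured data or the Vitis analyzer's export format), and the ratio is one possible indicator rather than a metric defined by the tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TileCounters:
    """Counter totals for one AI Engine tile (hypothetical record layout)."""
    tile: str
    active_cycles: int
    stall_cycles: int

    @property
    def stall_ratio(self) -> float:
        # Stall cycles relative to active cycles; 0.0 if the tile never ran.
        if self.active_cycles == 0:
            return 0.0
        return self.stall_cycles / self.active_cycles


def rank_by_stall(tiles: List[TileCounters]) -> List[TileCounters]:
    """Return tiles ordered from highest to lowest stall ratio."""
    return sorted(tiles, key=lambda t: t.stall_ratio, reverse=True)


# Made-up values for illustration only.
samples = [
    TileCounters("tile_a", active_cycles=1000, stall_cycles=100),
    TileCounters("tile_b", active_cycles=1000, stall_cycles=400),
    TileCounters("tile_c", active_cycles=0, stall_cycles=0),
]

if __name__ == "__main__":
    for t in rank_by_stall(samples):
        print(f"{t.tile}: stall ratio {t.stall_ratio:.2f}")
```

With these made-up values, tile_b comes first in the ranking, which makes it the first candidate to inspect in the heat map or profile summary.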