Profiling the AI Engine, Memory Modules and Interface Tiles - 2022.1 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

There are three types of performance counters: run-time event performance counters for the AI Engine modules, run-time memory counters for the memory modules and run-time interface counters for the AI Engine-PL interface tiles. These performance counters can be configured to track a variety of events in the AI Engine, the memory module and the interface tile. Features such as error-correction code (ECC) scrubbing, event trace and profiling can all use these performance counters. In a profile configuration, each performance counter counts occurrences of a given event. The profile feature offers several configurations of these counters that can be applied dynamically at run-time to collect various profiling statistics.

No changes are required in PS host code when using performance counters. These counters can be configured, read and collected at run-time while the design is executing in hardware. The following table lists the number of performance counters that remain available for profiling, depending on whether event trace and ECC scrubbing are also in use.

Table 1. Available Performance Counters

Event Trace   ECC Scrubbing   Counters Available for Profiling
Used?         Used?           Core Module   Memory Module   PL Interface
No            No              4             2               2
No            Yes             3             2               2
Yes           No              3             1               2
Yes           Yes             2             1               2
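
The table can also be read as a simple lookup keyed on the two features. The following Python sketch is only an illustration of Table 1; the names `AVAILABLE_COUNTERS` and `counters_for_profiling` are hypothetical and are not part of any Xilinx tool or API.

```python
# Table 1 as a lookup (illustrative only, not a Xilinx API).
# Key: (event_trace_used, ecc_scrubbing_used)
# Value: counters left for profiling in each module type.
AVAILABLE_COUNTERS = {
    (False, False): {"core": 4, "memory": 2, "interface": 2},
    (False, True):  {"core": 3, "memory": 2, "interface": 2},
    (True,  False): {"core": 3, "memory": 1, "interface": 2},
    (True,  True):  {"core": 2, "memory": 1, "interface": 2},
}


def counters_for_profiling(event_trace_used, ecc_scrubbing_used):
    """Return a copy of the Table 1 row for the given feature combination."""
    key = (bool(event_trace_used), bool(ecc_scrubbing_used))
    return dict(AVAILABLE_COUNTERS[key])


if __name__ == "__main__":
    # Default build (ECC scrubbing on) with event trace also enabled.
    print(counters_for_profiling(event_trace_used=True, ecc_scrubbing_used=True))
```

A metric set that needs more counters than the returned row provides for a module type would not fit in that configuration.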

ECC scrubbing is ON by default and can be turned on or off using an AI Engine compiler option. For more information, see AI Engine Compiler Options. When ECC scrubbing is enabled, three core module counters are available for profiling if event trace is not used, and two if it is (see Table 1).

When performance counters are used for ECC scrubbing, event trace and profiling in the same execution, the available performance counters might not meet the requirements of all the requested features at the same time. The following warning messages indicate this situation.

Figure 1. Warning Message

All the chosen metric sets should be grouped in the xrt.ini file under the [Debug] keyword. An example of the xrt.ini file to profile the AI Engine module, the memory module and the PL interface tile is as follows:
[Debug]
aie_profile = true
aie_profile_interval_us = 1000
aie_profile_core_metrics = heat_map
aie_profile_memory_metrics = write_bandwidths
aie_profile_interface_metrics = input_stalls_idle:2
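
When several profiling runs use different metric sets, the file can be generated rather than edited by hand. The following is a minimal sketch using Python's standard `configparser` module. The helper name `build_xrt_ini` is hypothetical; the keys and metric set names are the ones shown in the example above, and the sketch does not check that a metric set name is valid.

```python
import configparser
import io


def build_xrt_ini(core_metrics, memory_metrics, interface_metrics, interval_us=1000):
    """Return xrt.ini text with the AI Engine profile settings under [Debug]."""
    # Use only "=" as the delimiter so values such as "input_stalls_idle:2"
    # are never split on the colon.
    config = configparser.ConfigParser(delimiters=("=",))
    config.optionxform = str  # keep key names exactly as written
    config["Debug"] = {
        "aie_profile": "true",
        "aie_profile_interval_us": str(interval_us),
        "aie_profile_core_metrics": core_metrics,
        "aie_profile_memory_metrics": memory_metrics,
        "aie_profile_interface_metrics": interface_metrics,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Reproduces the settings from the example above.
    print(build_xrt_ini("heat_map", "write_bandwidths", "input_stalls_idle:2"))
```

The returned text can then be saved as xrt.ini for the run.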