Profiling for Memory Modules - 2022.1 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

Document ID: UG1076
Locale: English (United States)
Release Date: 2022-05-25
Version: 2022.1 English

The following tables list the predefined metric sets available for memory modules. Each table title (for example, conflicts) is the name of a metric set. In the xrt.ini file, write the metric set name in lowercase and assign it to the metric selector aie_profile_memory_metrics.
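As a minimal sketch of the rule above, the following Python helper checks a name against the six metric sets listed on this page and produces the assignment line for xrt.ini. The helper names (MEMORY_METRIC_SETS, memory_metrics_line) are hypothetical and not part of any Xilinx tool. The spacing around "=" is an assumption, as is the idea that the selector takes a single set name. Check the xrt.ini section of this guide for where the line belongs and for any additional value syntax.

```python
# Metric set names taken from the table titles on this page.
MEMORY_METRIC_SETS = (
    "conflicts",
    "dma_locks",
    "dma_stalls_s2mm",
    "dma_stalls_mm2s",
    "write_bandwidths",
    "read_bandwidths",
)


def memory_metrics_line(metric_set):
    """Return the xrt.ini assignment for a memory module metric set.

    The name is lowercased because the guide requires lowercase names.
    Assumption: the selector takes exactly one metric set name.
    """
    name = metric_set.strip().lower()
    if name not in MEMORY_METRIC_SETS:
        raise ValueError(f"unknown memory module metric set: {metric_set!r}")
    return f"aie_profile_memory_metrics = {name}"


if __name__ == "__main__":
    print(memory_metrics_line("Conflicts"))
```

Validating the name before writing the file catches typos early, instead of leaving the profiling run to be the first place an unrecognized name surfaces.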

Table 1. conflicts
Metric Name | Event ID | Description
Memory Conflict | 76 | Time lost to data memory conflicts on any of the 8 banks of the memory module.
Cumulative Memory Errors | 86 | Time lost to ECC errors in any of the data memory banks, as well as in the 2x MM2S and 2x S2MM DMAs.

Note: The hardware view is 8 banks, each 128 bits wide. The software view is 4 banks, each 256 bits wide.

Memory conflicts occur when two memory chunks reside in the same memory bank and are accessed either by the same AI Engine (using its two read ports) or by two different AI Engines. A potential solution is to constrain these memories to different banks. To find out which bank is causing the conflicts, analyze the events from an emulation-AI Engine simulation.

Table 2. dma_locks
Metric Name | Event ID | Description
Cumulative DMA Activity | 20 | Time lost to stalled lock acquires on both the MM2S and S2MM channels of the DMA.
Cumulative DMA Lock Count | 43 | Number of lock stalls on the DMA channels.

The four DMA channels (2x S2MM and 2x MM2S) are driven by buffer descriptors (BDs). Cumulative DMA Activity counts the time spent in stalled lock acquires across all channels. Together, these DMA events help explain why some connections through the device are slower than expected.
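One way to use these counters is to compare stall time per channel and see which channel dominates. The sketch below is illustrative only. It assumes that each stall counter and the length of the profiled interval are expressed in the same unit (for example, clock cycles), which this page does not state. The channel labels and the counter readings are made up.

```python
def stall_shares(stall_counts, total_count):
    """Return each channel's stall time as a fraction of the profiled interval.

    Assumption: stall_counts values and total_count share the same unit.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return {channel: count / total_count for channel, count in stall_counts.items()}


def most_stalled_channel(stall_counts):
    """Return the label of the channel with the largest stall counter."""
    return max(stall_counts, key=stall_counts.get)


# Hypothetical readings, one per DMA channel of a single memory module.
example = {"s2mm0": 250, "s2mm1": 0, "mm2s0": 500, "mm2s1": 100}
shares = stall_shares(example, total_count=1000)

if __name__ == "__main__":
    for channel, share in shares.items():
        print(f"{channel}: {share:.1%} of the interval stalled on lock acquire")
    print("most stalled:", most_stalled_channel(example))
```

A channel with a large share is a candidate for a closer look at the buffers and locks on that connection.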

Table 3. dma_stalls_s2mm
Metric Name | Event ID | Description
S2MM Channel 0 Stalls | 33 | Time S2MM channel 0 was stalled on lock acquire.
S2MM Channel 1 Stalls | 34 | Time S2MM channel 1 was stalled on lock acquire.

Table 4. dma_stalls_mm2s
Metric Name | Event ID | Description
MM2S Channel 0 Stalls | 35 | Time MM2S channel 0 was stalled on lock acquire.
MM2S Channel 1 Stalls | 36 | Time MM2S channel 1 was stalled on lock acquire.

Table 5. write_bandwidths
Metric Name | Event ID | Description
DMA S2MM Channel 0 Packet Count | 25 | Number of packets written over DMA S2MM channel 0.
DMA S2MM Channel 1 Packet Count | 26 | Number of packets written over DMA S2MM channel 1.
Bandwidth of DMA S2MM Channel 0 | Derived | Write bandwidth over DMA S2MM channel 0, computed with respect to the active time.
Bandwidth of DMA S2MM Channel 1 | Derived | Write bandwidth over DMA S2MM channel 1, computed with respect to the active time.

These metrics show how efficiently the DMA S2MM channels are used.

Note: This metric set does not give usable results if the DMA S2MM is used in DMA FIFO mode.

Table 6. read_bandwidths
Metric Name | Event ID | Description
DMA MM2S Channel 0 Packet Count | 27 | Number of packets read from DMA MM2S channel 0.
DMA MM2S Channel 1 Packet Count | 28 | Number of packets read from DMA MM2S channel 1.
Bandwidth of DMA MM2S Channel 0 | Derived | Read bandwidth over DMA MM2S channel 0, computed with respect to the active time.
Bandwidth of DMA MM2S Channel 1 | Derived | Read bandwidth over DMA MM2S channel 1, computed with respect to the active time.

These metrics show how efficiently the DMA MM2S channels are used.

Note: This metric set does not give usable results if the DMA MM2S is used in DMA FIFO mode.
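The write and read bandwidth entries are described as derived values computed with respect to the active time, but this page does not give the formula. The sketch below shows one plausible reading: bytes moved divided by the time the channel was active. It is not the tool's actual computation. The inputs (bytes_transferred, active_cycles, clock_hz) are hypothetical quantities that you would have to obtain yourself, for example from the packet count and a known packet size.

```python
def dma_bandwidth_mb_per_s(bytes_transferred, active_cycles, clock_hz):
    """Estimate channel bandwidth in MB/s relative to active time only.

    Assumption: active time equals active_cycles / clock_hz. Idle time is
    deliberately excluded, matching the "with respect to the active time"
    wording in the tables.
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    if active_cycles <= 0:
        return 0.0  # channel never active, so no meaningful bandwidth
    active_seconds = active_cycles / clock_hz
    return bytes_transferred / active_seconds / 1e6


if __name__ == "__main__":
    # Hypothetical numbers: 4096 bytes moved during 1024 active cycles at 1 GHz.
    print(dma_bandwidth_mb_per_s(4096, 1024, 1e9))
```

Because only active time is in the denominator, a channel that is mostly idle can still report a high bandwidth. Read the value together with the stall metrics rather than on its own.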