Memory Stalls - 2023.2 English

AI Engine Tools and Flows User Guide (UG1076)

Document ID: UG1076
Release Date: 2023-12-04
Version: 2023.2 English

The objective of the mapper is to prevent buffer conflicts where possible. It provides buffer optimization levels, ranging from 0 (default) to 9, that increase the size of buffers to avoid conflicts. Select a level with the Xmapper option --Xmapper=BufferOptLevel<level num>. At the highest buffer optimization level (9), no two buffers can be placed in the same bank. However, at the higher buffer optimization levels the mapper might be unable to find a solution, in which case it errors out. If you see a large number of memory stalls, the first step is therefore to cycle through the BufferOptLevel values and check whether higher levels produce fewer memory stalls.
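The sweep over buffer optimization levels can be scripted. The following is a minimal sketch that only builds the option strings to pass to the compiler. The helper names buffer_opt_flag and buffer_opt_sweep are hypothetical and are not part of the AI Engine tools. Only the option form and the 0 to 9 range are taken from the text above.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Build the mapper option for one buffer optimization level.
// The option form and the 0..9 range come from the surrounding text.
std::string buffer_opt_flag(int level) {
    if (level < 0 || level > 9) {
        throw std::out_of_range("BufferOptLevel must be in [0, 9]");
    }
    return "--Xmapper=BufferOptLevel" + std::to_string(level);
}

// Hypothetical helper: list the options to try, lowest level first, so a
// sweep can stop at the first level that maps and reduces stalls enough.
std::vector<std::string> buffer_opt_sweep(int first = 0, int last = 9) {
    std::vector<std::string> flags;
    for (int level = first; level <= last; ++level) {
        flags.push_back(buffer_opt_flag(level));
    }
    return flags;
}
```

Starting from the lowest level is a design choice: because the mapper might fail at higher levels, a sweep that increases the level gradually finds the least aggressive setting that resolves the stalls.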

Another option is to explicitly instruct the mapper not to place two buffers in the same bank. For example, if your simulation analysis indicates that memory stalls resulting from a bank conflict between buffers kernel_0.in[0] and kernel_1.out[0] cause a significant throughput degradation, you can add the following constraint to keep the mapper from placing these buffers in the same bank.

not_equal(location<buffer>(kernel_0.in[0]), location<buffer>(kernel_1.out[0]));
Note: This constraint places the buffers in different banks, but those banks might not be on the same AI Engine tile.
If DMA FIFOs are used in the design and are placed in the same bank as other buffers, the Xrouter option DMAFIFOsInFreeBankOnly (--Xrouter=DMAFIFOsInFreeBankOnly) forces the router to place these FIFOs in free banks, which eliminates memory conflicts with the DMA FIFOs. If an entire free bank cannot be reserved for a DMA FIFO, you can use location constraints instead, guided by your knowledge of where the other memory buffers are placed. In this case, you need to know which buffers might cause stalls when they conflict with DMA FIFOs. A constraint can look like the following.
location<fifo>(net2) = { dma_fifo(aie_tile, 15, 0, 0x3100, 32) };
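Before choosing an address for a DMA FIFO, it can help to check which known buffers would share a bank with it. The following is a minimal, standalone sketch of such a check and is not part of the ADF API. It assumes that the arguments of dma_fifo in the constraint above are the tile type, column, row, start address, and size in bytes, and it assumes a bank size of 0x2000 bytes. Both assumptions should be verified against the memory map of the target device. The Region type and the function names are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// ASSUMPTION: bank size in bytes. 0x2000 (8 KB) is illustrative only;
// check the data memory layout of the target device.
constexpr std::uint32_t kAssumedBankBytes = 0x2000;

// A region of tile data memory: a buffer or a DMA FIFO.
struct Region {
    std::string name;      // label used only for reporting
    int col;               // column of the tile that holds the memory
    int row;               // row of the tile that holds the memory
    std::uint32_t offset;  // start address within the tile data memory
    std::uint32_t size;    // length in bytes
};

// Index of the bank that contains the first byte of the region.
std::uint32_t first_bank(const Region& r, std::uint32_t bank_bytes) {
    return r.offset / bank_bytes;
}

// Index of the bank that contains the last byte of the region.
std::uint32_t last_bank(const Region& r, std::uint32_t bank_bytes) {
    const std::uint32_t last_byte =
        r.size == 0 ? r.offset : r.offset + r.size - 1;
    return last_byte / bank_bytes;
}

// True if both regions are on the same tile and touch a common bank.
bool shares_bank(const Region& a, const Region& b,
                 std::uint32_t bank_bytes = kAssumedBankBytes) {
    if (a.col != b.col || a.row != b.row) {
        return false;
    }
    return first_bank(a, bank_bytes) <= last_bank(b, bank_bytes) &&
           first_bank(b, bank_bytes) <= last_bank(a, bank_bytes);
}

// Names of the buffers that would share a bank with the candidate FIFO.
std::vector<std::string> buffers_sharing_bank(
    const Region& fifo, const std::vector<Region>& buffers,
    std::uint32_t bank_bytes = kAssumedBankBytes) {
    std::vector<std::string> names;
    for (const Region& b : buffers) {
        if (shares_bank(fifo, b, bank_bytes)) {
            names.push_back(b.name);
        }
    }
    return names;
}
```

Under these assumptions, a FIFO described as Region{"net2_fifo", 15, 0, 0x3100, 32} lies entirely in bank index 1 of tile (15, 0), so any buffer on that tile that also touches bank index 1 is a candidate for conflicts.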