Use the following command to run the AI Engine simulator:
make aiesim
Open the simulation result in Vitis Analyzer with the following command:
vitis_analyzer aiesimulator_output/default.aierun_summary
Note: The --dump-vcd option is used by aiesimulator to dump the VCD file for trace.

Click Trace view in Vitis Analyzer. Zoom in to view the first few runs of the design:
This figure lists the following events and their dependencies:
1: Tile 24_0 DMA s2mm channel 0 (s2mm.Ch0.BD0.lock0) starts. It acquires the lock of the ping input buffer (buf0) of aie_dest1 and transfers data from the PL to buf0. Refer to Graph View for the position of the buffer in the graph.
2: After DMA s2mm channel 0 completes, DMA s2mm channel 1 (s2mm.CH1.BD1.lock1) starts. It acquires the lock of the pong input buffer (buf0d) and transfers data from the PL to buf0d.
3a: The aie_dest1 kernel (in tile 25_0) acquires the lock of buf0 (shown as read lock allocated).
3b: aie_dest1 also acquires the lock of the ping output buffer (buf1).
4a: After aie_dest1 acquires the locks of its input buffer (buf0) and output buffer (buf1), it starts. If either lock cannot be acquired, the kernel runs into a lock stall.
4b: After tile 24_0 DMA s2mm channel 1 (s2mm.CH1.BD1.lock1) completes, the DMA switches back to s2mm channel 0 (s2mm.CH0.BD1.lock0). At first, buf0 is still being read by aie_dest1 (read lock allocated), so the DMA waits at DMA lock req, shown in red. After the read lock of the buffer is released, the DMA acquires the lock and starts the data transfer.
5: After aie_dest1 completes, it releases the output buffer (buf1). The kernel aie_dest2 acquires the lock of buf1 (read lock allocated).
6a: After the lock of buf1 is acquired, aie_dest2 starts.
6b: aie_dest1 acquires the lock of buf0d (read lock allocated).
6c: aie_dest1 acquires the lock of buf1d (write lock allocated).
7: After aie_dest1 acquires the locks of its input buffer (buf0d) and output buffer (buf1d), it starts.
8: After aie_dest1 completes, it releases the output buffer (buf1d). The kernel aie_dest2 acquires the lock of buf1d (read lock allocated).
9: After the lock of buf1d is acquired, aie_dest2 starts.

Note: The stream interface does not need to acquire a lock; it has inherent backward and forward pressure for data synchronization. Every lock acquire and release event has some cycles of overhead.
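
The ping-pong handshake in the event list above can be illustrated with a small cycle-stepped model. This is only a sketch in Python, not AI Engine code or any tool API: the Actor class, the simulate function, the buffer states (empty, locked, full), and the cycle counts are all hypothetical. It models only the input side (buf0 and buf0d) and ignores the output buffers and the lock acquire/release overhead.

```python
# Toy model of ping-pong buffer locking between a DMA (writer) and a kernel (reader).
# All names and cycle counts are illustrative assumptions, not AI Engine APIs.

BUFFERS = ("buf0", "buf0d")  # ping and pong input buffers


class Actor:
    def __init__(self, name, wants, leaves, duration, total):
        self.name = name          # "dma" or "kernel"
        self.wants = wants        # buffer state required to acquire the lock
        self.leaves = leaves      # buffer state set when the lock is released
        self.duration = duration  # cycles spent holding one buffer
        self.total = total        # number of blocks to process
        self.done = 0
        self.remaining = 0
        self.held = None
        self.stalled = False

    def step(self, cycle, state, log):
        # Release the held buffer once the work on it has finished.
        if self.held is not None and self.remaining == 0:
            state[self.held] = self.leaves
            log.append((cycle, self.name, "release", self.held))
            self.held = None
            self.done += 1
        if self.held is None:
            if self.done == self.total:
                return
            buf = BUFFERS[self.done % 2]  # alternate ping, pong, ping, ...
            if state[buf] != self.wants:
                # Lock not available: record the stall once and retry next cycle.
                if not self.stalled:
                    log.append((cycle, self.name, "stall", buf))
                    self.stalled = True
                return
            state[buf] = "locked"
            self.held, self.remaining, self.stalled = buf, self.duration, False
            log.append((cycle, self.name, "acquire", buf))
        self.remaining -= 1


def simulate(blocks, dma_cycles, kernel_cycles):
    """Return a list of (cycle, actor, event, buffer) tuples."""
    state = {b: "empty" for b in BUFFERS}
    log = []
    dma = Actor("dma", "empty", "full", dma_cycles, blocks)
    kernel = Actor("kernel", "full", "empty", kernel_cycles, blocks)
    cycle = 0
    while kernel.done < blocks:
        dma.step(cycle, state, log)
        kernel.step(cycle, state, log)
        cycle += 1
    return log


# Usage: a fast DMA (2 cycles per block) feeding a slower kernel (5 cycles per block).
for event in simulate(blocks=3, dma_cycles=2, kernel_cycles=5):
    print(event)
```

With a DMA that is faster than the kernel, the model shows the same pattern as event 4b: after filling the pong buffer, the DMA requests buf0 again, stalls while the kernel still holds it, and proceeds only after the kernel releases it.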