Use the following command to run the AI Engine simulator:
make aiesim
Open the simulation result in Vitis Analyzer with the following command:
vitis_analyzer aiesimulator_output/default.aierun_summary
Note: The --dump-vcd option is used by aiesimulator to dump the VCD file for trace.
Click Trace view in Vitis Analyzer. Zoom in to view the first few runs of the design:
This figure lists the following events and their dependencies:
1: Tile 24_0 DMA s2mm channel 0 (s2mm.Ch0.BD0.lock0) starts. It acquires the lock of the ping input buffer (buf0) of aie_dest1 and transfers data from the PL to buf0. Refer to Graph View for the position of the buffer in the graph.
2: After DMA s2mm channel 0 BD 0 completes, DMA s2mm channel 0 BD 1 (s2mm.CH0.BD1.lock1) starts. It acquires the lock of the pong input buffer (buf0d) and transfers data from the PL to buf0d.
3a: The aie_dest1 kernel (in tile 25_0) acquires the lock of buf0 (shown as read lock allocated).
3b: aie_dest1 acquires the lock of the ping output buffer (buf1) as well.
4a: After aie_dest1 acquires the locks of its input buffer (buf0) and output buffer (buf1), it starts. If any lock cannot be acquired, the kernel runs into a lock stall.
4b: After tile 24_0 DMA s2mm channel 0 BD 1 (s2mm.CH0.BD1.lock1) completes, the channel switches back to DMA s2mm channel 0 BD 0 (s2mm.CH0.BD0.lock0). At first, buf0 is still being read by aie_dest1 (read lock allocated), so the DMA waits at DMA lock req, shown in red. After the read lock of the buffer is released, the DMA acquires the lock and starts the data transfer from the PL.
5: After aie_dest1 completes, it releases the output buffer (buf1). The kernel aie_dest2 acquires the lock of buf1 (read lock allocated).
6a: After the lock of buf1 is acquired, aie_dest2 starts.
6b: aie_dest1 acquires the lock of buf0d (read lock allocated).
6c: aie_dest1 acquires the lock of buf1d (write lock allocated).
7: After aie_dest1 acquires the locks of its input buffer (buf0d) and output buffer (buf1d), it starts.
8: After aie_dest1 completes, it releases the output buffer (buf1d). The kernel aie_dest2 acquires the lock of buf1d (read lock allocated).
9: After the lock of buf1d is acquired, aie_dest2 starts.

Note: The stream interface does not need to acquire locks; it has inherent backward and forward pressure for data synchronization. Every lock acquire and release event has some cycles of overhead.
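
The dependencies in the events above can be sketched as a small timing model. This is only a conceptual illustration in Python, not AI Engine code and not how aiesimulator works internally. The function name simulate and the cycle counts t_dma, t_k1, and t_k2 are hypothetical values chosen for the example, and the lock acquire/release overhead is ignored. It shows why the DMA stalls at step 4b: a ping or pong buffer can be refilled only after the kernel that read it two iterations earlier has released its lock.

```python
def simulate(n_iters, t_dma=4, t_k1=10, t_k2=3):
    """Toy schedule for ping-pong buffers guarded by locks.

    All durations are hypothetical cycle counts, not measured values.
    Iteration i uses the ping buffers when i is even and the pong
    buffers when i is odd.
    """
    in_bufs = ("buf0", "buf0d")    # input ping/pong of aie_dest1
    out_bufs = ("buf1", "buf1d")   # output ping/pong of aie_dest1
    dma_end, k1_end, k2_end = [], [], []
    rows = []
    for i in range(n_iters):
        # The DMA channel is free once its previous buffer descriptor is done.
        dma_free = dma_end[i - 1] if i >= 1 else 0
        # The same input buffer was last read by aie_dest1 two iterations ago.
        in_released = k1_end[i - 2] if i >= 2 else 0
        dma_start = max(dma_free, in_released)
        dma_stall = dma_start - dma_free   # cycles spent waiting for the lock
        dma_end.append(dma_start + t_dma)

        # aie_dest1 needs a filled input buffer, a free output buffer,
        # and must have finished its own previous iteration.
        k1_free = k1_end[i - 1] if i >= 1 else 0
        out_released = k2_end[i - 2] if i >= 2 else 0
        k1_start = max(dma_end[i], k1_free, out_released)
        k1_end.append(k1_start + t_k1)

        # aie_dest2 starts once aie_dest1 releases the output buffer.
        k2_free = k2_end[i - 1] if i >= 1 else 0
        k2_start = max(k1_end[i], k2_free)
        k2_end.append(k2_start + t_k2)

        rows.append({
            "iteration": i,
            "in_buf": in_bufs[i % 2],
            "out_buf": out_bufs[i % 2],
            "dma_start": dma_start,
            "dma_stall": dma_stall,
            "k1_start": k1_start,
            "k2_start": k2_start,
        })
    return rows


if __name__ == "__main__":
    for row in simulate(4):
        print(row)
```

With these example durations, the third transfer (back into buf0) waits 6 cycles for aie_dest1 to release the buffer, which corresponds to the red DMA lock req period described in step 4b. Shortening the hypothetical kernel time t_k1 below t_dma removes that stall in the model.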