Stream Stall Analysis - 2022.2 English

AI Engine Tools and Flows User Guide (UG1076)

Document ID: UG1076
Release Date: 2022-10-19
Version: 2022.2 English

From the Performance Metrics view, you can identify whether a stream stall needs to be analyzed and which tile or tiles are causing the stall.

The following steps illustrate how a stream stall can be analyzed starting in the Performance Metrics tab in Vitis Analyzer.

  1. In the Performance Metrics view, select Stream Stall Time (%) to view stream stalls across all tiles, and identify the tile or tiles to be analyzed. Objects in the Performance Metrics view can be cross-probed with the Trace view, Graph view, and Array view. For example, selecting a tile in the Performance Metrics view highlights it in the Trace view, which helps locate the tile quickly.
    Figure 1. Stream Stall in Performance Metrics View

  2. Select the Trace view.
    Figure 2. Stream Stall in Trace View

  3. From the Stalls view, select Stream Stalls from the drop-down list. The Stalls view shows the following information for each stream stall. Objects in blue can be cross-probed with other views by clicking them.
    Stall ID
    The stream stall is named SS_<NUM>. The earlier the stall happens, the smaller the number. The number is unique across all types of stalls.
    Stalled Tile
    The AI Engine tile where the stalled kernel is located.
    Stalled Kernel
    The kernel that is stalled. It is named <Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>. Sometimes it is shown as _main, in which case cross-probing is required to find the real kernel function.
    Start (ns)
    The start time of the stall.
    Duration (ns)
    The duration of the stall.
    PC
    Program counter when the stall happens.
    Stalled Port
    The port of the stalled kernel.
    Related Stalls
    Other stalls that can cause this stall.
    Full Destination Port
    The port that the stalled kernel cannot write into because it is full.
    Empty Source Port
    The port that the stalled kernel cannot read from because it is empty.
  4. Clicking a stream stall in the Stalls view jumps to the start of the stall in the Trace view. Right-click the stall and select Filter Trace as needed. After filtering, only the signals related to the stall are shown in the Trace view; unrelated signals are hidden. Filtering makes the trace easier to explore when the design is large.
  5. Objects in blue in the Stalls view can be clicked and cross-probed. For example, clicking the kernel in Stalls view will highlight the kernel in Trace view.
  6. Zoom in and out of the Trace view to explore the stalls. The position of the stall, how frequently similar stalls occur, the events before the stall, and any related stalls can hint at why the stall occurs.
  7. To clear the previously filtered trace, right-click and select Clear All Filters.
  8. It is helpful to have an overview of the stall path in Graph view. Select Graph view and then select Tile View from the drop-down list.
    Figure 3. Stall Path in Graph View

  9. Select Stalls view and then select Stream Stalls from the drop-down list.
  10. Explore the stream stalls in the Stalls view. Click on a stream stall in the Stalls view to have an overview of the stall in the graph. The red path shows where the stall occurs. It can be from a stalled kernel to full destination port, or from an empty source port to the stalled kernel.
    Tip: If a stream multicasts to multiple destinations and a stream stall occurs because the stream does not have enough FIFO for all destinations, the highlighted stalled kernel and stalled net might not be connected (each is shown separately in red). In that case, all destinations of the multicast stream must be analyzed as a whole for the stream stall. An example of a multicast stream stall in the Graph view is shown in the following figure.
    Figure 4. Multicast Stream Stall Path

  11. You can open the graph source code or kernel source code from the Graph view or Array view. Select the kernel instance by clicking the kernel object in the Stalls view or clicking the kernel in the Graph view.
    Figure 5. Viewing Graph Code and Kernel Code

  12. Right-click the kernel instance in the Graph view, and select either Goto Graph Source or Goto Kernel Source. This opens the graph source code or the kernel source code.
  13. Correlate the graph source code and kernel source code with the stalls analyzed, editing the source code as needed.
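
When several stalls are compared side by side, the fields described in step 3 can be captured in a small record. The following Python sketch is illustrative only. It assumes the rows are transcribed by hand from the Stalls view (this guide does not describe an export format), and the StreamStall class, its field names, the port names, and the sample values are all hypothetical. It also assumes that the kernel function name and the schedule ID contain no dots.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamStall:
    """One row of the Stalls view, transcribed by hand (hypothetical container)."""
    stall_id: str          # "SS_<NUM>"
    stalled_kernel: str    # "<Kernel_function_name>.<Schedule_ID>.<Graph_instance_name>"
    start_ns: float
    duration_ns: float
    full_destination_port: Optional[str] = None
    empty_source_port: Optional[str] = None

    @property
    def order(self) -> int:
        # The earlier the stall happens, the smaller the number after "SS_".
        return int(self.stall_id.split("_", 1)[1])

    def kernel_parts(self):
        # Split from the left so a dotted graph instance name stays intact.
        parts = self.stalled_kernel.split(".", 2)
        if len(parts) != 3:
            # For example "_main": cross-probe in the tool to find the real kernel.
            return None
        return tuple(parts)

    def direction(self) -> str:
        if self.full_destination_port:
            return "write blocked: destination full"
        if self.empty_source_port:
            return "read blocked: source empty"
        return "unknown"


def total_stall_by_kernel(stalls):
    """Sum stall duration per stalled kernel to see which kernel stalls the most."""
    totals = defaultdict(float)
    for stall in stalls:
        totals[stall.stalled_kernel] += stall.duration_ns
    return dict(totals)


# Made-up sample rows, not taken from a real design.
stalls = [
    StreamStall("SS_7", "producer.1.mygraph.k1", 1200.0, 80.0,
                full_destination_port="k2_in"),
    StreamStall("SS_2", "consumer.2.mygraph.k2", 300.0, 40.0,
                empty_source_port="k1_out"),
    StreamStall("SS_9", "producer.1.mygraph.k1", 1500.0, 120.0,
                full_destination_port="k2_in"),
]

ordered = sorted(stalls, key=lambda s: s.order)   # earliest stall first
totals = total_stall_by_kernel(stalls)
for stall in ordered:
    print(stall.stall_id, stall.kernel_parts(), stall.direction())
```

Sorting by the stall number gives the order in which the stalls happened, and the per-kernel totals give a rough view of how often similar stalls recur, which is the kind of pattern step 6 suggests looking for in the Trace view.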

The following table lists some possible scenarios that can cause a stream stall and possible solutions.

Table 1. Stream Stall Scenarios and Solutions
| Source | Destination | Stall Type | Possible Solution | Notes |
| --- | --- | --- | --- | --- |
| Stream | Stream | Stream stall | • Increase FIFO depth. See FIFO Depth Constraints in AI Engine Kernel and Graph Programming Guide (UG1079). • Adjust stream read and write instructions in source or destination kernels. | |
| Stream | Multiple streams | Stream stall | | Multicast |
| Stream | Multiple streams of multiple kernels in same AI Engine | Stream stall | • Put multiple kernels into different AI Engines. • Add enough FIFO to streams to the kernels. | Multicast |
| Multiple streams | Multiple streams | Stream stall | • Adjust instructions to match between different streams. • Increase FIFO depth (ssFIFO or DMA FIFO). | |
| PLIO | Stream | Stream stall | • Maximize AI Engine-PL interface bandwidth. For example, 64-bit interface, highest frequency (1/2 AI Engine frequency) for PL. Or 128-bit interface (note this uses two 64-bit channels for a 128-bit interface). See AI Engine-PL Interface Performance in AI Engine Kernel and Graph Programming Guide (UG1079). | |
| Stream | PLIO | Stream stall | Same as above. | |
| Stream (32 bits per iteration) | PLIO | Stream stall | Send TLAST for each 32 bits. See AI Engine-PL Interface Performance in AI Engine Kernel and Graph Programming Guide (UG1079). | |
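
The two solutions listed for the stream-to-stream scenario (increasing FIFO depth, and adjusting the stream read and write instructions) can be illustrated with a toy model. The following Python sketch is not a cycle-accurate model of the AI Engine stream interface and uses no tool API. It assumes one word per cycle, a single FIFO between one producer and one consumer, and that a stalled kernel waits on its stream instruction until it succeeds. The simulate function and the access patterns are hypothetical.

```python
def simulate(write_pattern, read_pattern, fifo_depth, cycles):
    """Toy model of a producer and a consumer connected by one stream FIFO.

    Each pattern is a repeating list of booleans: True means the kernel issues
    a stream access in that step, False means it does other work.
    Returns (write_stall_cycles, read_stall_cycles).
    """
    occupancy = 0
    w_pos = r_pos = 0
    write_stalls = read_stalls = 0
    for _ in range(cycles):
        wants_write = write_pattern[w_pos % len(write_pattern)]
        wants_read = read_pattern[r_pos % len(read_pattern)]
        # Both decisions use the occupancy at the start of the cycle.
        can_write = occupancy < fifo_depth
        can_read = occupancy > 0

        if wants_write and not can_write:
            write_stalls += 1          # destination full: producer waits
        else:
            if wants_write:
                occupancy += 1
            w_pos += 1

        if wants_read and not can_read:
            read_stalls += 1           # source empty: consumer waits
        else:
            if wants_read:
                occupancy -= 1
            r_pos += 1
    return write_stalls, read_stalls


# Producer writes 8 words back to back, then computes for 8 cycles.
bursty_writer = [True] * 8 + [False] * 8
# Consumer reads one word every other cycle (same average rate).
steady_reader = [True, False]

shallow = simulate(bursty_writer, steady_reader, fifo_depth=2, cycles=160)
deep = simulate(bursty_writer, steady_reader, fifo_depth=8, cycles=160)

# Alternative fix: interleave the writes so they match the reader's pattern.
interleaved_writer = [True, False]
matched = simulate(interleaved_writer, steady_reader, fifo_depth=2, cycles=160)

print("depth 2, bursty writes     :", shallow)
print("depth 8, bursty writes     :", deep)
print("depth 2, interleaved writes:", matched)
```

In this model the bursty producer stalls on a full FIFO when the depth is 2. Either a deeper FIFO absorbs the burst or a write pattern that matches the consumer removes the write stalls without extra FIFO. Which option suits a real design depends on the kernel code and available FIFO resources.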