Running the Design in Hardware and Capturing Trace Data at Run Time - 2020.2 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

Document ID: UG1076
Release Date: 2020-11-24
Version: 2020.2 English

There are two ways to run the design on the Arm processor in hardware and capture trace data at run time: the Xilinx Runtime (XRT) flow and the Xilinx Software Debugger (XSDB) flow. XRT is supported on Linux platforms, whereas XSDB is supported on both bare-metal and Linux platforms. The steps in each flow are described below.

The Xilinx Software Debugger (xsdb) flow is as follows:

  1. Set up xsdb as described below to connect to the device hardware.

    When the application runs, the debugging and profiling IP stores the trace data in DDR memory. To capture and evaluate this data, you must connect to the hardware device using xsdb. The xsdb command is typically used to program the device and debug bare-metal applications. Connect your system to the hardware platform or device over JTAG, launch the xsdb command in a command shell, and run the following sequence of commands:

    xsdb% connect
    
    xsdb% source $::env(XILINX_VITIS)/scripts/vitis/util/aie_trace_profile.tcl
    xsdb% aietrace::initialize $PROJECT/xclbin.link_summary 0x800000000 0x80000
    
    # Execute the PS host application (.elf) on Linux
    ## After the application completes processing.
    xsdb% aietrace::offload

    where:

    • connect: Launches the hw_server and connects xsdb to the device.
    • source $::env(XILINX_VITIS)/scripts/vitis/util/aie_trace_profile.tcl: Sources the Tcl trace command to set up the xsdb environment.
    • aietrace::initialize $PROJECT/xclbin.link_summary 0x800000000 0x80000: Initializes the DPA IP to begin capturing trace data. The values 0x800000000 and 0x80000 specify the starting DDR memory address at which trace data is written and the amount of data to store, respectively.
      Important: The DDR memory address used in aietrace::initialize should be a high address to limit any chance of running into memory conflicts with the OS on the xilinx_vck190_base_202020_1 platform or the application. For a custom platform, make sure you know how much DDR memory is being used and plan accordingly.
    • aietrace::offload: Instructs the DPA IP to offload the trace event data from the DDR memory. Run this command only after the application completes. The data is written to the event_trace<N>.txt file in the current working directory from which xsdb was launched. An aie_trace_profile.run_summary file is also created. It can be opened in the Vitis analyzer as explained in Viewing the Run Summary in the Vitis Analyzer.
      Tip: If you do not remove the event_trace<N>.txt files before running the graph again, the new run results overwrite the old files.
  2. Run the design on hardware to trace hardware events.
  3. Offload the captured trace data.
  4. Use the Vitis analyzer to import and analyze data.
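
The address and size passed to aietrace::initialize in step 1 can be sanity-checked before a run. The following Python sketch is not part of the Xilinx tools. It assumes the second value (0x80000) is a size in bytes, and the reserved region it compares against is a hypothetical placeholder that you would replace with the memory map of your own platform.

```python
from typing import Tuple

Range = Tuple[int, int]  # half-open address range [start, end)


def trace_window(start: int, size: int) -> Range:
    """Return the address range covered by the trace buffer.

    Assumption: size is expressed in bytes.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return start, start + size


def overlaps(a: Range, b: Range) -> bool:
    """Return True if two half-open ranges share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    # Values taken from the aietrace::initialize example in step 1.
    window = trace_window(0x800000000, 0x80000)

    # Hypothetical region used by the OS or the application.
    reserved = (0x0, 0x80000000)

    print(f"trace window: {window[0]:#x} to {window[1]:#x} (exclusive)")
    print("conflict" if overlaps(window, reserved) else "no conflict")
```

A check like this is only as good as the memory map you supply, so for a custom platform confirm the DDR usage of the OS and the application first.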

The Xilinx Runtime (XRT) flow is as follows:

  1. Burn the generated sd_card.img to the physical SD card.
  2. Create the xrt.ini file in the sd_card folder as described in this section to enable the XRT flow.

    The following is an example xrt.ini file.

    [Debug]
    aie_trace=true
    aie_trace_buffer_size=10M
  3. Run the design on hardware to trace hardware events.
  4. Copy the captured trace data from the sd_card folder to your design, at the same level as the design Work directory. The trace data is generated in the same location as the host application on the SD card.
  5. Use the Vitis analyzer to import and analyze data.
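
If the xrt.ini file from step 2 is produced by a build script rather than written by hand, it can be rendered programmatically. The following Python sketch uses only the standard configparser module and reproduces the example shown in step 2. The function name render_xrt_ini and its buffer_size parameter are illustrative assumptions, not part of XRT.

```python
import configparser
import io


def render_xrt_ini(buffer_size: str = "10M") -> str:
    """Return xrt.ini text that enables AI Engine trace (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser["Debug"] = {
        "aie_trace": "true",
        "aie_trace_buffer_size": buffer_size,
    }
    out = io.StringIO()
    # Write "key=value" without spaces, matching the example in step 2.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(render_xrt_ini(), end="")
```

Write the returned text to a file named xrt.ini in the sd_card folder so that it sits next to the host application.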

When the application runs, the streaming interface between the AI Engine and the System DPA IP (highlighted in orange in the previous image) can become overloaded with event trace data captured from the application. In this case, you might need to increase the number of streaming channels available to capture data by using the --num-trace-streams option of the AI Engine compiler.

Note: Implementing the DPA IP in hardware consumes device resources, which can reduce the resources available for your PL kernels and other elements of your design. Refer to Using Multiple Event Trace Streams for more information.