xrt.ini File - 2022.1 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)

Document ID: UG1393
Release Date: 2022-05-25
Version: 2022.1 English

The Xilinx runtime (XRT) library uses various control parameters to configure debugging, profiling, and message logging during host application and kernel execution. These control parameters are specified in a runtime initialization file, xrt.ini, and are used to configure features of XRT at start-up.

If you are a command line user, the xrt.ini file needs to be created manually and saved to the same directory as the host executable.

The runtime library checks if xrt.ini exists in the same directory as the host executable and automatically reads the file to configure the runtime. You can also specify the location of an xrt.ini file at runtime by setting the XRT_INI_PATH environment variable to point to the file, for example:

export XRT_INI_PATH=/path/to/xrt.ini
Tip: The Vitis IDE creates an xrt.ini file automatically based on your run configuration and saves it with the host executable.

Runtime Initialization File Format

The xrt.ini file is a simple text file with groups of keys and their values. Any line beginning with a semicolon (;) or a hash (#) is a comment. The group names, keys, and key values are all case sensitive.

The following is an example xrt.ini file that enables the timeline trace feature, and directs the runtime log messages to the Console view.

#Start of Debug group 
[Debug] 
native_xrt_trace = true
device_trace = fine

#Start of Runtime group 
[Runtime] 
runtime_log = console

There are three groups of initialization keys:

  • Runtime
  • Debug
  • Emulation

The following tables list all supported keys for each group, the supported values for each key, and a short description of the purpose of the key.

Runtime Group

The Runtime group of switches lets you configure elements of the runtime operation as described below.

Key Valid Values Description
api_checks [true|false] Enables or disables OpenCL API checks.
  • true: Enable. This is the default value.
  • false: Disable.
cpu_affinity {N,N,...} Pins all runtime threads to specified CPUs. Example:
cpu_affinity = {4,5,6}
exclusive_cu_context [true|false] Lets the host application direct OpenCL to acquire exclusive CU access, so that low-level AXI read/write (xclRegRead and xclRegWrite) can be used for regular kernels.
runtime_log [null | console | syslog | <filename>] Specifies where the runtime logs are printed.
  • null: Do not print any logs. This is the default value.
  • console: Print logs to stdout.
  • syslog: Print logs to Linux syslog.
  • <filename>: Print logs to the specified file. For example, runtime_log=my_run.log.
verbosity [0 | 1 | 2 | 3] Verbosity of the log messages. The default value is 0.
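
As an illustration of the Runtime keys listed above, the following is a minimal sketch of an xrt.ini fragment that sends runtime logs to a file and raises the verbosity. The file name my_run.log is taken from the table's own example. The choice of verbosity level 2 is arbitrary, because the table does not describe what each level prints.

```ini
; Sketch only: Runtime group settings for file-based logging.
[Runtime]
; Write runtime logs to a file instead of the default (null).
runtime_log = my_run.log
; Any of 0, 1, 2, or 3; the default is 0.
verbosity = 2
; Keep OpenCL API checks on (this is already the default).
api_checks = true
```

Because group names, keys, and values are case sensitive, the group must be written as [Runtime] and the values exactly as listed in the table.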

Debug Group

The Debug group of switches defines key options for profiling the application at runtime and for tracing data transfers and execution. These switches apply to both AI Engine and PL kernels in the Vitis acceleration flow. They let you control the frequency of data capture, the events to capture, and the amount of memory to reserve or use for recording trace and profile data.

Table 1. AI Engine Profile and Trace Options
Key Valid Values Description
aie_profile [true|false] Enables the runtime configuration and polling of AI Engine hardware performance counters. Available on VCK190 hardware runs only.
  • true: Enable.
  • false: Disable. This is the default value.
aie_profile_core_metrics [heat_map|stalls|execution|floating_point|write_bandwidths|read_bandwidths|aie_trace] Controls the configuration of the statistics read from the AI Engine core performance counters.
  • heat_map: Profile active/stall cycles and vector instruction usage.
  • stalls: Profile the different types of stalls (memory, stream, lock, and cascade).
  • execution: Profile the AI Engine instructions.
  • floating_point: Profile floating point exceptions.
  • write_bandwidths: Profile the write bandwidths of streams and cascades.
  • read_bandwidths: Profile the read bandwidths of streams and cascades.
  • aie_trace: Profile the amount and stalls of event trace from core and memory modules.
Note: This switch only has an effect if aie_profile = true.
aie_profile_interface_metrics [bandwidths|stalls_idle] Controls the configuration of the statistics read from the AI Engine interface tile performance counters.
  • bandwidths: Profile the bandwidths of PLIO masters/slaves.
  • stalls_idle: Profile the stalls and idle cycles of PLIO masters/slaves.
aie_profile_interval_us <int> Controls the interval of reading the AI Engine counter values in microseconds (µs). The default interval is 1000 µs.
Note: This switch only has an effect if aie_profile = true.
aie_profile_memory_metrics [conflicts|dma_locks|dma_stalls_s2mm|dma_stalls_mm2s|write_bandwidths|read_bandwidths] Controls the configuration of the statistics read from the AI Engine memory performance counters.
  • conflicts: Profile the DMA memory conflicts.
  • dma_locks: Profile DMA locks and stalls on lock acquire.
  • dma_stalls_s2mm: Profile stalls on DMA S2MM channels.
  • dma_stalls_mm2s: Profile stalls on DMA MM2S channels.
  • write_bandwidths: Profile bandwidths of DMA S2MM channels.
  • read_bandwidths: Profile bandwidths of DMA MM2S channels.
Note: This switch only has an effect if aie_profile = true.
aie_status [true|false] Enables the polling of AI Engine status information. Available on VCK190 hardware runs only.
aie_status_interval_us <int> Controls the interval, in microseconds (µs), at which AI Engine status information is captured. The default is 1000 µs.
aie_trace [true|false] Enables the runtime configuration and collection of AI Engine event trace. Available on VCK190 hardware runs only.
  • true: Enable.
  • false: Disable. This is the default value.
aie_trace_buffer_offload_interval_ms <int> Controls the interval, in milliseconds (ms), between reads of PLIO mode AI Engine trace from the device to host memory. The default is 10 ms.
aie_trace_buffer_size <string> Controls the total size of the buffers allocated for AI Engine event trace. This size is partitioned evenly among the trace streams coming out of the AI Engine. The default is 8M.
Note: This switch only has an effect if aie_trace = true.
aie_trace_file_dump_interval_s <int> Controls the interval, in seconds (s), between writes (appends) of raw AI Engine trace data to the output files. The default is 5 s.
aie_trace_metrics [functions|functions_partial_stalls|function_all_stalls] Controls the configuration of the AI Engine registers to generate a specified level of event trace.
Note: This switch only has an effect if aie_trace = true.
aie_trace_periodic_offload [true|false] Enables continuous offload of PLIO mode AI Engine trace. The generated AI Engine trace output files (one per stream) are appended with new trace data. The default is true.
aie_trace_start_time <string> Specifies the delay before trace starts, either in AI Engine clock cycles or as a time with units (<N>[ns|us|ms|s]). Examples: 100, 20.3ms, 16.1us. The default is 0.
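
To show how the AI Engine keys in Table 1 fit together, the following is a hedged sketch of a [Debug] fragment that enables both AI Engine profiling and event trace. The specific metric sets, the 16M buffer size, and the start delay are illustrative choices, not recommendations. Whether profiling and trace can share hardware counters in a single run is not stated in the table, so treat enabling both at once as an assumption to verify on your platform.

```ini
; Sketch only: AI Engine profile and trace settings (VCK190 hardware runs).
[Debug]
; Performance counter profiling; the metric keys below
; only take effect when aie_profile = true.
aie_profile = true
aie_profile_core_metrics = heat_map
aie_profile_memory_metrics = conflicts
aie_profile_interface_metrics = bandwidths
aie_profile_interval_us = 1000

; Event trace; the buffer size and metrics keys below
; only take effect when aie_trace = true.
aie_trace = true
aie_trace_metrics = functions
; Illustrative size, using the same format as the 8M default.
aie_trace_buffer_size = 16M
; Delay the start of trace, using one of the table's example values.
aie_trace_start_time = 20.3ms
```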
Table 2. Host and PL Kernel Options
Key Valid Values Description
app_debug [true|false] Enables the OpenCL application debug for the host code when debugging with GDB.
  • true: Enable.
  • false: Disable. This is the default value.
continuous_trace [true|false] Enables the continuous dumping of files for trace and the continuous reading of device data into the host.
  • true: Enable.
  • false: Disable. This is the default value.
Note: This switch only has an effect if device_trace is enabled.
device_counters [true|false] Enables device counter offload only, without enabling trace functionality.
device_trace [off|fine|coarse|accel] Enables the collection of data from monitors inserted on the PL to add to summary and trace.
  • accel: Traces compute unit starts/stops.
  • coarse: Lumps all reads/writes together under each execution of a compute unit.
  • fine: Tracks everything as it happens.
  • off: Turns off reading and reporting of device-level trace during runtime. This is the default value.
host_trace [true|false] Enables trace of host code based on the first protocol encountered.
Tip: If your host application uses both OpenCL and XRT native API you should manually specify both opencl_trace and native_xrt_trace to capture all events.
lop_trace [true|false] Enables generation of lower overhead OpenCL API host trace. Should not be used with other OpenCL options.
  • true: Enable.
  • false: Disable. This is the default value.
native_xrt_trace [true|false] Enables generation of the Native C/C++ API trace. This also generates the tables for "Host Data Transfer from/to Global memory" in the Profile Summary.
  • true: Enable.
  • false: Disable. This is the default value.
opencl_trace [true|false] Enables generation of OpenCL API host trace.
  • true: Enable.
  • false: Disable. This is the default value.
pl_deadlock_detection [true|false] Enables deadlock detection for PL kernels.
power_profile [true|false] Enables the polling of power data during the execution of the application.
  • true: Enable.
  • false: Disable. This is the default value.
Note: This feature is not supported on certain platforms including AWS.
power_profile_interval_ms <int> Controls the interval, in milliseconds (ms), at which the power counters are read. The default is 20 ms.
Note: This switch only has an effect if power_profile = true.
profile_api [true|false] Enables access to HAL API directly from the host application to read counters on device profiling monitors during execution.
  • true: Enable.
  • false: Disable. This is the default value.
stall_trace [off|all|dataflow|memory|pipe] Specifies the type of device-side stalls to capture and report in the timeline trace. The default is off.
  • off: Turn off stall trace information.
  • all: Record all stall trace information.
  • dataflow: Intra-kernel streams (for example, writing to full FIFO between dataflow blocks).
  • memory: External memory stalls (for example, AXI4 read from the DDR memory).
  • pipe: Inter-kernel pipe for OpenCL kernels (for example, writing to full pipe between kernels).
Note: This switch only has an effect if device_trace is enabled.
trace_buffer_offload_interval_ms <int> Controls the interval, in milliseconds (ms), at which device data is read from the device to the host. The default is 10 ms.
Note: This switch only has an effect if device_trace is enabled.
trace_buffer_size <string> If the .xclbin was created with memory offload of trace specified, as described in --profile Options, this switch determines the size of the buffer to allocate in memory to capture trace data. The default is 1M.
Note: This switch only has an effect if device_trace is enabled.
trace_file_dump_interval_s <int> Controls the interval, in seconds (s), between dumps of trace files. The default is 5 s.
Note: This switch only has an effect if device_trace is enabled.
vitis_ai_profile [true|false] Generates the profile summary and other files from the Vitis AI application layer.
  • true: Enable.
  • false: Disable. This is the default value.
xocl_debug [true|false] Generates the xocl.log file when enabled. When any trace options are also enabled, the debug log is added to the xrt.run_summary to view in Vitis Analyzer.
xrt_trace [true|false] Enables generation of low-level HW shim function trace during HW runs. This will be disabled when used with native_xrt_trace.
  • true: Enable.
  • false: Disable. This is the default value.
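
The host and PL kernel keys in Table 2 are typically combined. The following is a minimal sketch of a [Debug] fragment for an OpenCL host application that collects detailed device trace with stall information. The 2M buffer size is an illustrative value in the same format as the 1M default, and it only matters if the .xclbin was built with memory offload of trace.

```ini
; Sketch only: host and PL trace settings for an OpenCL host application.
[Debug]
; Host-side OpenCL API trace. Do not combine with lop_trace.
opencl_trace = true
; Device trace must be enabled for the keys below to take effect.
device_trace = fine
stall_trace = all
continuous_trace = true
; Illustrative size; used only with memory offload of trace.
trace_buffer_size = 2M
; Defaults written out explicitly for clarity.
trace_buffer_offload_interval_ms = 10
trace_file_dump_interval_s = 5
```

If the host application uses the XRT native API instead of OpenCL, native_xrt_trace = true would replace opencl_trace in this sketch; an application that mixes both APIs would set both keys, as the tip for host_trace notes.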

Emulation Group

The Emulation group of switches applies to the emulation environments and the Vivado simulator.

Key Valid Values Description
aliveness_message_interval <int> Specifies the interval, in seconds (s), at which aliveness messages are printed. The default is 300.
debug_mode [off|batch|gui] Specifies how the waveform is saved and displayed during emulation.
  • off: Do not launch simulator waveform GUI, and do not save wdb file. This is the default value.
  • batch: Do not launch simulator waveform GUI, but save wdb file.
  • gui: Launch simulator waveform GUI, and save wdb file.
Note: The kernel needs to be compiled with debug enabled (v++ -g) for the waveform to be saved and displayed in the simulator GUI.
kernel-dbg [true|false] Enables kernel debug functionality during software emulation as described in Command Line Debug Flow.
  • true: Enable.
  • false: Disable. This is the default value.
print_infos_in_console [true|false] Controls the printing of emulation info messages to the user's console. Emulation info messages are always logged to the emulation_debug.log file.
  • true: Print in user's console. This is the default value.
  • false: Do not print in user's console.
print_warnings_in_console [true|false] Controls the printing of emulation warning messages to the user's console. Emulation warning messages are always logged to the emulation_debug.log file.
  • true: Print in user's console. This is the default value.
  • false: Do not print in user's console.
print_errors_in_console [true|false] Controls the printing of emulation error messages to the user's console. Emulation error messages are always logged to the emulation_debug.log file.
  • true: Print in user's console. This is the default value.
  • false: Do not print in user's console.
user_pre_sim_script Path to Tcl file Specifies a Tcl script to pass to the simulator. For the first run, run the simulation in GUI mode, add the signals that you want to observe, then copy the commands from the Tcl console and save them to a Tcl script. For the next run, pass that Tcl script in batch mode.
user_post_sim_script Path to Tcl file Specifies a Tcl script containing post-simulation operations. All the commands in the Tcl script are executed after the simulation completes.
xtlm_aximm_log [true|false] Enables XTLM AXI4 Memory Map transaction logging at runtime. All the transactions are recorded in the xsc_report.log file.
xtlm_axis_log [true|false] Enables XTLM AXI4-Stream transaction logging at runtime. All the transactions are recorded in the xsc_report.log file.
timeout_scale [na|ms|sec|min] Provides timeout support for the clPollStream API in emulation by scaling the timeout specified in the clPollStream call. The timeout in the code is specified in ms, which might not work for emulation. Use timeout_scale to map ms to another scale if needed for emulation.
Important: Timeout is not enabled in emulation by default. Use this option to enable clPollStream timeout.
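
As an illustration of the Emulation keys, the following is a minimal sketch of an [Emulation] fragment that saves the waveform database without opening the simulator GUI and reduces console output. The Tcl script path is a hypothetical placeholder, and the 60 second aliveness interval is an arbitrary choice.

```ini
; Sketch only: Emulation group settings for a batch-mode run.
[Emulation]
; Save the wdb file without launching the waveform GUI.
; Requires the kernel to be compiled with debug enabled (v++ -g).
debug_mode = batch
; Hypothetical path to a Tcl script saved from an earlier GUI run.
user_pre_sim_script = /path/to/add_signals.tcl
; Info messages still go to emulation_debug.log.
print_infos_in_console = false
; Print aliveness messages every 60 seconds instead of every 300.
aliveness_message_interval = 60
```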