As discussed in Enabling Profiling in Your Application, there are a number of
--profile options that let you enable profiling of the application and kernel
events during runtime execution. These options enable capturing transaction
details for data traffic between the kernel and host, kernel stalls, and the
execution times of kernels and compute units (CUs), as well as monitoring
activity in Versal AI Engines.
Using the --profile options with v++ also requires the addition of one of the profile or trace options in the xrt.ini file. Refer to xrt.ini File for more information.
The --profile commands can be specified in a configuration file under the
[profile] section head using the following format, for example:
[profile]
data=all:all:all           # Monitor data on all kernels and CUs
data=k1:all:all            # Monitor data on all instances of kernel k1
data=k1:cu2:port3          # Specific CU master
data=k1:cu2:port3:counters # Specific CU master (counters only, no trace)
memory=all                 # Monitor transfers for all memories
memory=<sptag>             # Monitor transfers for the specified memory
stall=all:all              # Monitor stalls for all CUs of all kernels
stall=k1:cu2               # Stalls only for cu2
exec=all:all               # Monitor execution times for all CUs
exec=k1:cu2                # Execution times only for cu2
aie=all                    # Monitor all AIE streams
aie=DataIn1                # Monitor the specific input stream in the SDF graph
aie=M02_AXIS               # Monitor specific stream interface
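The settings above are typically collected in a file and handed to the linker with the v++ --config option. The following is a minimal sketch of that workflow, not a definitive build recipe: the file name profile.cfg, the object file k1.xo, the output name design.xclbin, and the <platform> placeholder are all assumptions for illustration, so the link command is only printed rather than executed.

```shell
# Write a small [profile] section to a config file.
# The file name profile.cfg is an assumption for illustration.
cat > profile.cfg <<'EOF'
[profile]
data=k1:all:all
exec=all:all
EOF

# Link command that would consume the config file. <platform>, k1.xo and
# design.xclbin are placeholders, so the command is printed, not run.
LINK_CMD='v++ --link --platform <platform> --config profile.cfg k1.xo -o design.xclbin'
echo "$LINK_CMD"
```

Keeping the profile settings in a config file rather than on the command line makes it easier to switch monitoring on and off between builds.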
The various options of the command are described below:
The --profile.aie option enables profiling of AI Engine
streams in adaptive data flow (ADF) applications, where the argument is one of the following:
<ADF_graph_argument>: Specifies an argument name from the ADF graph application.
<pin_name>: Indicates a port on an AI Engine kernel.
<all>: Indicates monitoring all stream connections in the ADF application.
For example, to monitor the DataIn1 input stream use the following command:
v++ --link --profile.aie:DataIn1
The --profile.data option enables monitoring of data ports through monitor IP that is added into the design. This option needs to be specified during linking.
[<kernel_name>|all] defines a specific kernel to apply the command to. You can also specify the keyword all to apply the monitoring to all existing kernels, compute units, and interfaces with a single option.
When <kernel_name> has been specified, you can also define a specific CU to apply the command to, or indicate that it should be applied to all CUs for the kernel.
[<interface_name>|all] defines the specific interface on the kernel or CU to monitor for data activity, or monitors all interfaces.
[<counters|all>] is an optional argument, as it defaults to all when not specified. It allows you to restrict the information gathering to just counters for larger designs, while all will include the collection of actual trace information.
For example, to assign the data profile to all CUs and interfaces
of kernel k1 use the following command:
v++ --link --profile.data:k1:all:all
The --profile.exec option records the execution times of the kernel and provides minimum port data collection during the system run. This option needs to be specified during linking.
--profile.stall is specified. You can specify
--profile.exec for any CUs not covered by
The syntax for
For example, to profile the execution of cu2 for kernel k1 use the following
command:
v++ --link --profile.exec:k1:cu2
The --profile.stall option must be specified during both v++ compilation and linking.
This option adds stall monitoring logic to the device binary (.xclbin), which requires the addition of stall ports
on the kernel interface. To facilitate this, the
stall option must be specified during both compilation and linking.
The syntax for
For example, to monitor stalls of cu2 for kernel k1 use the following
commands:
v++ --compile -k k1 --profile.stall ...
v++ --link --profile.stall:k1:cu2 ...
The --profile.trace_memory option applies to hardware builds (-t=hw) only, and should not be used for software or hardware emulation flows.
When building the hardware target (
-t=hw), use this option to specify the type and amount of memory to
use for capturing trace data. You can specify the argument as follows:
This argument specifies the memory type to use for capturing trace
data. Use the --profile.trace_memory command to define the type of memory to
use, with the trace_buffer_size switch in the xrt.ini file to define the amount
of memory to use as described in xrt.ini File. The default memory type
used is the first memory defined in the platform, and the default buffer size is 1 MB.
If trace_memory is not specified but device_trace is enabled in the
xrt.ini File, the profile data is
captured to the default platform memory with 1 MB allocated for the trace buffer.
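As a hedged illustration of the runtime side, the sketch below writes an xrt.ini that enables device trace and enlarges the trace buffer. Only the device_trace and trace_buffer_size switch names come from the text above; the [Debug] section name and the values fine and 2M are assumptions based on common XRT usage and should be checked against xrt.ini File for your release.

```shell
# Create an xrt.ini for the host application (a sketch, not a reference file).
# Assumptions: the [Debug] section name, and the values "fine" and "2M".
cat > xrt.ini <<'EOF'
[Debug]
device_trace=fine
trace_buffer_size=2M
EOF

# Show the resulting file.
cat xrt.ini
```

With a file like this, the trace buffer would be larger than the 1 MB default, while the memory type still comes from the --profile.trace_memory setting used at link time.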
- Specified in KB. The maximum is 128K, although 64K is the maximum recommended.
- Specifies the type and number of the memory resource on the
platform. Memory resources for the target platform can be identified with the
platforminfo command. Supported memory types include HBM, DDR, PLRAM, HP, ACP, MIG, and MC_NOC. For example,
- Optionally indicates that CUs assigned to the specified
<SLR> should use the DDR or HBM resources specified in the
<MEMORY> field. Note that this syntax can only be used with DDR or HBM memory banks.
You can specify the
--profile.trace_memory command with just the memory size and unit, such
as FIFO:8k, or specify the memory bank, such as
HBM. In this case, the profile data for all CUs is captured in the specified memory.
Or you can specify the memory to use to capture profile data, and the SLR assignment for that memory. In this case, the SLR assignment indicates that any CUs assigned to the specified SLR should have profile data captured in the specified memory. This is shown in the following config file example:
[profile]
trace_memory=DDR[1]:SLR0
trace_memory=DDR[2]:SLR1
In the example above, profile data for CUs assigned to SLR0 is
captured in DDR bank 1, and profile data for CUs assigned to SLR1 is captured in DDR bank 2. CUs are
assigned to SLRs using the --connectivity.slr
command as described in Assigning Compute Units to SLRs.
When using the <SLR> syntax you must use that syntax for all
trace_memory commands in the design.
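To show how the SLR assignment and the trace memory selection fit together, here is a minimal sketch of a single config file carrying both. It rests on several assumptions: the CU instance names k1_1 and k1_2 are hypothetical and presumed to exist in the design, the file name trace_slr.cfg is arbitrary, the slr entries use the [connectivity] form described in Assigning Compute Units to SLRs, and the bracketed bank indices follow the "DDR bank 1" and "DDR bank 2" description above.

```shell
# Sketch of a config file pairing SLR placement with per-SLR trace memory.
# k1_1 and k1_2 are hypothetical CU names; bank indices are assumptions.
cat > trace_slr.cfg <<'EOF'
[connectivity]
slr=k1_1:SLR0
slr=k1_2:SLR1

[profile]
trace_memory=DDR[1]:SLR0
trace_memory=DDR[2]:SLR1
EOF

# Show the resulting file.
cat trace_slr.cfg
```

Every trace_memory line in this sketch uses the SLR form, in line with the rule that the syntax must be applied consistently across the design.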