AI Engine Compiler Options - 2021.2 English

Versal ACAP AI Engine Programming Environment User Guide (UG1076)

Document ID: UG1076
Release Date: 2021-12-17
Version: 2021.2 English
Table 1. AI Engine Options
Option Name Description
--constraints=<string> Constraints (location, bounding box, etc.) can be specified using a JSON file. This option lets you specify one or more constraint files.
--heapsize=<int> Heap size (in bytes) used by each AI Engine

The stack, heap, and sync buffer (32 bytes, including the graph run iteration number information) together can occupy up to 32768 bytes of data memory. The default heap size is 1024 bytes. Before changing the heap size, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The heap is used for allocating any remaining file-scoped data that is not explicitly connected in the user graph.

--stacksize=<int> Stack size (in bytes) used by each AI Engine

The stack, heap, and sync buffer (32 bytes) together can occupy up to 32768 bytes of data memory. The default stack size is 1024 bytes. Before changing the stack size, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The stack is used for the standard compiler calling convention, including stack-allocated local variables and register spilling.
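The size rule shared by --heapsize and --stacksize can be checked before the compiler is invoked. The following is a minimal sketch rather than part of the AI Engine tools: the function names are hypothetical, and the constants are the values stated in the descriptions above.

```python
# Illustrative helper, not part of the AI Engine tools.
# Constants come from the --heapsize and --stacksize descriptions.
DATA_MEMORY_LIMIT = 32768   # bytes available to stack + heap + sync buffer
SYNC_BUFFER_SIZE = 32       # bytes
DEFAULT_STACK_SIZE = 1024   # bytes
DEFAULT_HEAP_SIZE = 1024    # bytes


def memory_footprint(stacksize=DEFAULT_STACK_SIZE, heapsize=DEFAULT_HEAP_SIZE):
    """Return the combined size in bytes of the stack, heap, and sync buffer."""
    return stacksize + heapsize + SYNC_BUFFER_SIZE


def fits_data_memory(stacksize=DEFAULT_STACK_SIZE, heapsize=DEFAULT_HEAP_SIZE):
    """Return True if the combined size does not exceed the 32768-byte limit."""
    return memory_footprint(stacksize, heapsize) <= DATA_MEMORY_LIMIT


print(memory_footprint())               # 2080 (1024 + 1024 + 32)
print(fits_data_memory(16384, 16384))   # False (32800 bytes exceeds the limit)
```

A check of this kind is useful mainly when both sizes are raised together, because each option on its own can look safe while the sum is not.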

--pl-freq=<value> Specifies the interface frequency (in MHz) for all PLIOs. The default frequency is a quarter of the AI Engine frequency and the maximum supported frequency is half of the AI Engine frequency. The PL frequency specific to each interface is provided in the graph.
--pl-register-threshold=<value> Specifies the frequency threshold (in MHz) for registered AI Engine-PL crossings. The default threshold is one-eighth of the AI Engine frequency, which depends on the specific device speed grade.
Note: Values above a quarter of the AI Engine array frequency are ignored, and a quarter is used instead.
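The frequency rules for --pl-freq and --pl-register-threshold reduce to simple fractions of the AI Engine frequency. The sketch below works through them under stated assumptions: the function names are hypothetical, the 1000 MHz AI Engine frequency is only an example value, and raising an error for an out-of-range --pl-freq is a choice made for this sketch, not documented compiler behavior.

```python
# Illustrative arithmetic only; names are hypothetical, not a tool API.

def resolve_pl_freq(aie_mhz, pl_mhz=None):
    """Return the PLIO interface frequency in MHz.

    Default is a quarter of the AI Engine frequency; the maximum
    supported value is half of the AI Engine frequency.
    """
    if pl_mhz is None:
        return aie_mhz / 4
    if pl_mhz > aie_mhz / 2:
        # Assumption for this sketch: reject unsupported values outright.
        raise ValueError("pl-freq exceeds half of the AI Engine frequency")
    return pl_mhz


def resolve_register_threshold(aie_mhz, requested_mhz=None):
    """Return the effective registered-crossing threshold in MHz.

    Default is one-eighth of the AI Engine frequency; values above a
    quarter are ignored and a quarter is used instead.
    """
    if requested_mhz is None:
        return aie_mhz / 8
    return min(requested_mhz, aie_mhz / 4)


aie = 1000.0  # example AI Engine frequency in MHz
print(resolve_pl_freq(aie))                     # 250.0
print(resolve_register_threshold(aie))          # 125.0
print(resolve_register_threshold(aie, 400.0))   # 250.0 (clamped to a quarter)
```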
Table 2. CDO Options
Option Name Description
--enable-ecc-scrubbing Enable ECC Scrubbing on all the AI Engines used. This option enables ECC Scrubbing when generating the AI Engine ELF CDO. (One performance counter per core is used.) ECC Scrubbing is turned ON (true) by default.
Table 3. Compiler Debug Options
Option Name Description
--adf-api-log-level=<value> ADF API log-level (0: errors, 1: level-0 + warnings, 2: level-1 + info messages, 3: level-2 + debug messages) (default: 2)
--kernel-linting Perform consistency checking between graphs and kernels. The default is false.
--known-tripcount Convert unknown trip counts to known trip counts.
--quiet Suppress the output of the AI Engine compiler.
--verbose Enable verbose output. The AI Engine compiler emits messages at various stages of compilation. These debug and tracing logs provide useful information about the compilation process.
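The --adf-api-log-level values in Table 3 are cumulative: each level adds one message category to the level below it. A minimal sketch of that relationship follows; the names are hypothetical and only mirror the wording of the table.

```python
# Illustrative model of the cumulative log levels; not a tool API.
LOG_LEVEL_CATEGORIES = ("errors", "warnings", "info messages", "debug messages")
DEFAULT_LOG_LEVEL = 2


def enabled_messages(level=DEFAULT_LOG_LEVEL):
    """Return the message categories reported at the given log level."""
    if level not in range(len(LOG_LEVEL_CATEGORIES)):
        raise ValueError("log level must be 0, 1, 2, or 3")
    return LOG_LEVEL_CATEGORIES[: level + 1]


print(enabled_messages())    # ('errors', 'warnings', 'info messages')
print(enabled_messages(0))   # ('errors',)
```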
Table 4. Execution Target Options
Option Name Description
--target=<hw|x86sim> The AI Engine compiler supports the following build targets (default: hw):
  • The hw target produces a libadf.a for use in the hardware device on a target platform.
  • The x86sim target compiles the code for use in the x86 simulator as described in x86 Functional Simulator.
Table 5. File Options
Option Name Description
--include=<string> This option can be used to include additional directories in the include path for the compiler front-end processing.

Specify one or more include directories.

--output=<string> Specifies an output.json file that is produced by the front end for an input data flow graph file. The output file is passed to the back-end for mapping and code generation of the AI Engine device. This is ignored for other types of input.
--platform=<string> This is a path to a Vitis platform file that defines the hardware and software components available when doing a hardware design and its RTL co-simulation.

--workdir=<string> By default, the compiler writes all outputs to a sub-directory of the current directory, called Work. Use this option to specify a different output directory.
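The target and file options are typically combined on one command line. The following sketch assembles such a command as an argument list. It rests on several assumptions that are not stated in the tables: the executable is named aiecompiler, the graph source file is passed as a trailing positional argument, and each include directory gets its own --include option. The function name and all paths are hypothetical.

```python
import shlex

VALID_TARGETS = ("hw", "x86sim")


def build_compile_command(graph_file, target="hw", include_dirs=(),
                          platform=None, workdir=None):
    """Return an argument list for a compiler invocation (illustrative only)."""
    if target not in VALID_TARGETS:
        raise ValueError(f"target must be one of {VALID_TARGETS}")
    # Assumption: the compiler executable is named 'aiecompiler'.
    args = ["aiecompiler", f"--target={target}"]
    # Assumption: one --include option per directory.
    for directory in include_dirs:
        args.append(f"--include={directory}")
    if platform is not None:
        args.append(f"--platform={platform}")
    if workdir is not None:
        args.append(f"--workdir={workdir}")
    # Assumption: the graph source is the final positional argument.
    args.append(graph_file)
    return args


cmd = build_compile_command(
    "graph.cpp",                 # hypothetical graph source file
    target="x86sim",
    include_dirs=["./src", "./kernels"],
    workdir="./Work_x86",
)
print(shlex.join(cmd))
```

Building the command as a list and quoting it only for display avoids shell-quoting mistakes when a path contains spaces.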

Table 6. Generic Options
Option Name Description
--help List the available AI Engine compiler options, sorted in the groups listed here.
--help-list Display an alphabetic list of AI Engine compiler options.
--version Display the version of the AI Engine compiler.
Table 7. Miscellaneous Options
Option Name Description
--no-init This option disables initialization of window buffers in AI Engine data memory. This option enables faster loading of the binary images into the SystemC-RTL co-simulation framework.
Tip: This does not affect the statically initialized lookup tables.
--nodot-graph By default, the AI Engine compiler produces .dot and .png files to visualize the user-specified graph and its partitioning onto the AI Engines. This option can be used to eliminate the dot graph output.
Table 8. Module Specific Options
Option Name Description
--Xchess=<string> Can be used to pass kernel-specific options to the CHESS compiler that is used to compile code for each AI Engine.

The option string is specified as <kernel-function>:<optionid>=<value>. This option string is included during compilation of generated source files on the AI Engine where the specified kernel function is mapped.
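The <kernel-function>:<optionid>=<value> format can be assembled programmatically. The sketch below only illustrates the string layout; the function name is hypothetical, and the kernel name and option ID in the example are placeholders, not real CHESS options.

```python
def xchess_option(kernel_function, option_id, value):
    """Format an --Xchess argument as <kernel-function>:<optionid>=<value>."""
    for label, part in (("kernel_function", kernel_function),
                        ("option_id", option_id)):
        if not part:
            raise ValueError(f"{label} must not be empty")
    return f"--Xchess={kernel_function}:{option_id}={value}"


# 'my_kernel' and 'some_option' are placeholders for illustration.
print(xchess_option("my_kernel", "some_option", "1"))
# --Xchess=my_kernel:some_option=1
```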

--Xelfgen=<string> Can be used to pass additional command-line options to the ELF generation phase of the compiler, which is currently run as a make command to build all AI Engine ELF files.

For example, to limit the number of parallel compilations to four, specify --Xelfgen="-j4".

Note: If you see bad_alloc errors in the log during compilation, or if the Vitis IDE crashes, the cause could be insufficient memory on your workstation. A possible workaround (other than increasing the available memory on your machine) is to limit the parallelism used by the compiler during the code generation phase. This can be specified in the GUI as the compiler CodeGen option -j1 or -j2, or on the command line as --Xelfgen=-j1 or --Xelfgen=-j2.
--Xmapper=<string> Can be used to pass additional command-line options to the mapper phase of the compiler. For example:
--Xmapper=DisableFloorplanning

Try these options when the design fails to converge in the mapping or routing phase, or when you are trying to achieve better performance by reducing memory bank conflicts.

See Mapper and Router Options for a list and description of options.

--Xpreproc=<string> Pass a general option to the preprocessor phase for all source code compilations (AIE/PS/PL/x86sim). For example:
--Xpreproc=-D<var>=<value>
--Xpslinker=<string> Pass a general option to the PS linker phase. For example:
--Xpslinker=-L<libpath> -l<libname>
--Xrouter=<string> Pass a general option to the router phase. For example:
--Xrouter=enableSplitAsBroadcast
--fast-floats Enable fast implementation for linear floating point scalar operations like add, sub, mul, and compare.
--fast-nonlinearfloats Enable fast implementation for non-linear floating point scalar operations like sine/cosine, sqrt, and inv.
Note: Only AI Engine kernels that have been modified are recompiled in subsequent compilations of the AI Engine graph. Unmodified kernels are not recompiled.
Table 9. Event Trace Options
Option Name Description
--event-trace=<value> Event trace configuration value, where <value> is one of the following:
  • functions: Function transition view without stalls.
  • functions_partial_stalls: Function transition view with stream/lock/cascade stalls.
  • functions_all_stalls: Function transition view with all stalls (stream/lock/cascade/memory).
  • runtime: Run-time event tracing configuration.
--event-trace-port=<value> Set the AI Engine event tracing port. The default value is plio; however, Xilinx recommends that you use gmio as the event-trace-port configuration. See Event Trace Build Flow for more information. <value> is one of the following:
  • plio: Set the AI Engine event tracing port to plio.
  • gmio: Set the AI Engine event tracing port to gmio.
--num-trace-streams=<int> Number of trace streams (default: 16).
--trace-plio-width=<int> PLIO width for trace streams (default: 64). Allowed values are 32 and 64.
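The allowed values in Table 9 lend themselves to a small validation step before the options reach the compiler. The following is a sketch under stated assumptions: the function name is hypothetical, options left as None are simply omitted so that the compiler defaults apply, and requiring a positive stream count is a choice made here because the table states only the default.

```python
EVENT_TRACE_VALUES = (
    "functions",
    "functions_partial_stalls",
    "functions_all_stalls",
    "runtime",
)
EVENT_TRACE_PORTS = ("plio", "gmio")
TRACE_PLIO_WIDTHS = (32, 64)


def event_trace_options(event_trace, port=None, num_streams=None, plio_width=None):
    """Validate event trace settings and return them as option strings."""
    if event_trace not in EVENT_TRACE_VALUES:
        raise ValueError(f"event-trace must be one of {EVENT_TRACE_VALUES}")
    options = [f"--event-trace={event_trace}"]
    if port is not None:
        if port not in EVENT_TRACE_PORTS:
            raise ValueError(f"event-trace-port must be one of {EVENT_TRACE_PORTS}")
        options.append(f"--event-trace-port={port}")
    if num_streams is not None:
        # Assumption: only a positive stream count makes sense.
        if num_streams < 1:
            raise ValueError("num-trace-streams must be positive")
        options.append(f"--num-trace-streams={num_streams}")
    if plio_width is not None:
        if plio_width not in TRACE_PLIO_WIDTHS:
            raise ValueError("trace-plio-width must be 32 or 64")
        options.append(f"--trace-plio-width={plio_width}")
    return options


print(event_trace_options("runtime", port="gmio", num_streams=8))
# ['--event-trace=runtime', '--event-trace-port=gmio', '--num-trace-streams=8']
```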
Table 10. Optimization Options
Option Name Description
--xlopt=<int> Enable a combination of kernel optimizations based on the opt level (allowed values are 0 to 2, default is 1).
  • xlopt=1
    • Automatic computation of heap size: Uses kernel analysis to automatically compute the heap requirements for each AI Engine, so you do not need to specify the heap size.
    • Guidance: Guidance is provided to highlight unaligned variables, global arrays that can potentially be mapper allocated, improper usage of restrict, and potential read-before-write conflicts.
  • xlopt=2
    • Automatic inline: Automatically inlines functions when it is practical and possible to do so, even if the functions are not declared as __inline or inline.
    • Loop peeling for unrolled loops: Makes the loop iteration count a multiple of the unrolling factor via peeling. Splits a loop into multiple loops based on its iteration count and profitability heuristics, and adds flattening pragmas to the split loops.
    • Pragma insertion: Automatically infers and inserts pragmas in kernel code.
      Note: Compiler optimization (xlopt > 0) reduces debug visibility.
--Xxloptstr=<string> Option string to enable/disable optimizations in xlopt level 2.
  • -xlinline-threshold=T: Set the automatic inlining threshold to T (default T = 5000).
  • -annotate-pragma: Automatically insert loop unrolling, pipelining, and flattening pragmas (default = true).