Simulating the Application with the Emulation Flow - 2023.1 English

Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393)


Developing a user application and hardware kernels that target an FPGA requires a phased approach. Because FPGAs, AMD Versal™ adaptive SoCs, and AMD Zynq™ UltraScale+™ MPSoCs are programmable devices, building the device binary for hardware takes some time. To enable quicker iterations without going through the full hardware compilation flow, the AMD Vitis™ tool provides software emulation targets to perform C-based simulation of the design, and hardware emulation targets to perform C-RTL co-simulation of the software application and PL kernels. Compiling for emulation targets is significantly faster than compiling for the actual hardware. Emulation targets also provide full visibility into the application or accelerator, which makes debugging easier. Once your design passes emulation, you can compile and run the application on the hardware platform in the late stages of development.

The Vitis tool provides two emulation targets:

Software emulation (sw_emu)
The software emulation build compiles and links quickly, and the host program runs either natively on an x86 processor or in the QEMU emulation environment. The PL kernels are natively compiled and run on the host machine. This build target lets you quickly iterate on both the host code and kernel logic.
Hardware emulation (hw_emu)
The host program runs as it does in sw_emu, natively on x86 or in QEMU, but the kernel code is compiled into an RTL behavioral model that runs in the AMD Vivado™ simulator or another supported third-party simulator. This build and run loop takes longer but provides a cycle-accurate view of kernel logic.

Compiling and linking for either of the emulation targets is seamlessly integrated into the Vitis command line and IDE flows. You can compile your host and kernel source code for either emulation target, without making any change to the source code. For your host code, you do not need to compile differently for emulation as the same host executable or PS application ELF binary can be used in emulation. Emulation targets support most of the features including XRT APIs, buffer transfer, platform memory SP tags, kernel-to-kernel connections, etc. The following sections detail the features and requirements of both the software and hardware emulation flows.
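As a rough illustration of how the build target is the only thing that changes between flows, the following Python sketch assembles v++ compile and link command lines for a given target and prepares the run environment. The platform, kernel, and file names are hypothetical placeholders, and setting XCL_EMULATION_MODE assumes an x86 host flow. The sketch only builds argument lists; it does not invoke the tools.

```python
import os

EMULATION_TARGETS = ("sw_emu", "hw_emu")


def vpp_commands(target, platform, kernel, source):
    """Return (compile_cmd, link_cmd) argument lists for one build target."""
    if target not in EMULATION_TARGETS + ("hw",):
        raise ValueError(f"unknown target: {target}")
    xo = f"{kernel}.{target}.xo"          # hypothetical output naming scheme
    xclbin = f"{kernel}.{target}.xclbin"  # hypothetical output naming scheme
    compile_cmd = ["v++", "-c", "-t", target, "--platform", platform,
                   "-k", kernel, "-o", xo, source]
    link_cmd = ["v++", "-l", "-t", target, "--platform", platform,
                "-o", xclbin, xo]
    return compile_cmd, link_cmd


def run_environment(target, base_env=None):
    """Copy the environment and set XCL_EMULATION_MODE for emulation runs."""
    env = dict(os.environ if base_env is None else base_env)
    if target in EMULATION_TARGETS:
        env["XCL_EMULATION_MODE"] = target
    else:
        # A hardware run must not inherit a stale emulation setting.
        env.pop("XCL_EMULATION_MODE", None)
    return env


# Placeholder names; substitute your own platform and kernel.
for cmd in vpp_commands("sw_emu", "my_platform", "vadd", "vadd.cpp"):
    print(" ".join(cmd))
```

The source files and the host executable stay the same for every target; only the -t value and the run environment differ.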

While running emulation you can specify a number of trace options as described in Enabling Profiling in Your Application to capture design data during runtime. Any reports generated during the run are collected into the xrt.run_summary file. This collection of reports can be viewed by opening the run_summary in Vitis analyzer, and includes a Summary report, System and Platform Diagrams to illustrate the hardware design, Run Guidance offering any suggestions for improving the performance of the system, and a Profile Summary and Timeline Trace when enabled in the xrt.ini file during runtime. Refer to Using the Vitis Analyzer for additional information.
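A minimal sketch of generating an xrt.ini file with tracing enabled is shown here. It assumes the [Debug] keys native_xrt_trace and device_trace; verify the key names and accepted values against the xrt.ini documentation for your release. The helper name make_xrt_ini is hypothetical.

```python
import configparser
import io


def make_xrt_ini(native_xrt_trace=True, device_trace="fine"):
    """Return the text of an xrt.ini file that enables trace capture."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep key names exactly as written
    cfg["Debug"] = {
        # Key names are assumptions; check them against your XRT release.
        "native_xrt_trace": str(native_xrt_trace).lower(),
        "device_trace": device_trace,
    }
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


print(make_xrt_ini())
```

Writing the returned text to a file named xrt.ini in the run directory is left to the caller, so that the sketch has no side effects.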

Software emulation is an abstract model and does not use any of the PetaLinux drivers, such as Zynq OpenCL (ZOCL), the interrupt controller, or the Device Tree Binary (DTB). Hence, the overhead of creating sd_card.img and booting PetaLinux on a full QEMU machine can be avoided for software emulation. This makes sw_emu faster, because QEMU is slow and requires PetaLinux. With this approach, you are not required to provide fields such as sysroot, rootfs, and the SD card image.

Note: If you source the environment setup script "xilinx-versal-common-v2022.2/environment-setup-cortexa72-cortexa53-xilinx-linux", you might see a warning or error asking you to unset LD_LIBRARY_PATH (if it is already set) before the embedded XRT can run. The environment setup script sets the Arm GCC toolchain path along with the additional environment variables that are required.

Installing the x86 XRT automatically sets the LD_LIBRARY_PATH variable to point to the XRT libraries. To run both the embedded XRT and the x86 XRT in the same terminal, you must specify the arm-gcc and SYSROOT paths for embedded systems.
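One way to avoid the LD_LIBRARY_PATH conflict when scripting builds is to launch the embedded toolchain in a child process whose environment has the variable removed. The following Python sketch shows only the environment handling; the helper name split_library_path is hypothetical.

```python
import os


def split_library_path(env):
    """Return (cleaned_env, saved_value) with LD_LIBRARY_PATH removed."""
    cleaned = dict(env)
    saved = cleaned.pop("LD_LIBRARY_PATH", None)
    return cleaned, saved


cleaned, saved = split_library_path(os.environ)
if saved is not None:
    print("LD_LIBRARY_PATH was set; it is removed in the cleaned environment")
```

The cleaned dictionary can be passed as the env argument of subprocess.run when starting the shell that sources the embedded setup script, while the parent terminal keeps the x86 XRT setting.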

Limitations of the Software Emulation Flow

The following are not supported in sw_emu but are supported by the hw_emu and hw targets:

hls_stream<ap_uint> and hls_stream<ap_int>
ap_uint and ap_int are supported as primitive datatypes, but not as an hls_stream datatype.
Array or Vector of hls_streams
If the HLS kernel expects to read and write data through an array of hls_stream objects, sw_emu does not elaborate the array into N streams and process it.
Runtime Data generation based on memory connectivity
HBM memory size for each bank is limited to 256MB; proper RTD generation based on memory connectivity is not supported.
HBM bus notation is not supported when host code is written in XRT Native APIs.
Kernels with RTL
Only HLS and AIE-1 Kernels are supported.
AIE-2 Ex buf descriptors
Not supported.
PL Controller based designs
Not supported.
Presynth PDI is mandatory for sw_emu
PS QEMU designs will not work if the base platform does not have a Presynth PDI.
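The first two limitations in this list can be caught early with a simple source scan. The following Python sketch is a heuristic only: it assumes hls_stream refers to the hls::stream class template, uses regular expressions rather than a C++ parser, and can therefore miss cases or report false positives. The function name sw_emu_warnings is hypothetical.

```python
import re

# hls::stream whose element type is ap_int<N> or ap_uint<N>
AP_STREAM = re.compile(r"hls::stream\s*<\s*ap_u?int\s*<")
# hls::stream declared as an array, for example: hls::stream<float> s[4]
STREAM_ARRAY = re.compile(r"hls::stream\s*<[^;(){}]*>\s*&?\s*\w+\s*\[")


def sw_emu_warnings(source):
    """Return (line_number, message) pairs for constructs sw_emu may reject."""
    warnings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if AP_STREAM.search(line):
            warnings.append((lineno, "stream of ap_int/ap_uint"))
        if STREAM_ARRAY.search(line):
            warnings.append((lineno, "array of streams"))
    return warnings


# Hypothetical kernel fragment used only to exercise the scan.
kernel_src = """\
hls::stream<ap_uint<32> > in;
hls::stream<float> lanes[4];
hls::stream<float> out;
"""
for lineno, message in sw_emu_warnings(kernel_src):
    print(f"line {lineno}: {message}")
```

A design that triggers either warning can still be validated with the hw_emu target, which supports both constructs.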