Using Hardware Emulation to Analyze System Performance - 2021.2 English

Versal ACAP System Integration and Validation Methodology Guide (UG1388)

Versal ACAP systems that include AI Engines, PS, and PL can be integrated using the Vitis linker and then simulated together using Vitis hardware emulation. Hardware emulation allows you to observe and measure the combined effects of AI Engine, PS, and PL interactions on system performance. In hardware emulation, the PL kernels run as RTL, the AI Engine kernels run in the aiesimulator, and the PS code runs in the Xilinx Quick Emulator (QEMU). Some infrastructure blocks are abstracted using transaction-level models (TLM) to improve simulation speed. As a result, hardware emulation is nearly but not fully cycle accurate. It still provides a valuable representation for analyzing, debugging, and validating major system performance considerations prior to implementation.

Hardware emulation runs automatically and generates various performance-related reports based on user settings, such as the profile summary and the application timeline trace. You can view these reports in the Vitis Analyzer for useful insights into performance, such as data transfer size and efficiency, kernel run times, stall information, and more. In addition to these reports, Vitis Analyzer provides detailed activity waveforms, allowing you to conduct custom, fine-grained analysis of specific parts of the system.

For more information on hardware emulation and Vitis Analyzer, see the Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393) and Versal ACAP AI Engine Programming Environment User Guide (UG1076).

Additional Hardware

For a synthesizable way of measuring throughput, you can design RTL IP to count the cycles elapsed for a particular transaction (e.g., the time taken to transmit a payload of "B" beats of data from source to destination). This can be done by starting a counter on the first TVALID and stopping it on TLAST. Alternatively, you can use the AXI Performance Monitor (APM) IP provided in the Vivado IP Catalog to count the events.
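The counting scheme can be sketched as a small behavioral model in Python. This is an illustration only, not synthesizable RTL and not the APM IP. The names `count_transaction`, `TransactionCount`, and `throughput_bytes_per_sec` are hypothetical, and the choices to count the TLAST cycle inclusively and to count a beat only when TVALID and TREADY are both high are assumptions rather than something this guide specifies.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class TransactionCount:
    cycles: int  # clock cycles from the first TVALID through the TLAST beat (inclusive, assumed)
    beats: int   # beats actually transferred (TVALID and TREADY both high)


def count_transaction(
    samples: Iterable[Tuple[int, int, int]]
) -> Optional[TransactionCount]:
    """Model the counter over per-cycle (tvalid, tready, tlast) samples.

    Returns None if no completed TLAST beat is observed.
    """
    cycles = 0
    beats = 0
    started = False
    for tvalid, tready, tlast in samples:
        if not started and tvalid:
            started = True  # the first TVALID starts the counter
        if started:
            cycles += 1
            if tvalid and tready:
                beats += 1  # a beat transfers only on a handshake
                if tlast:
                    return TransactionCount(cycles, beats)
    return None


def throughput_bytes_per_sec(
    count: TransactionCount, bytes_per_beat: int, clock_hz: float
) -> float:
    """Convert counted beats and cycles into bytes per second."""
    return count.beats * bytes_per_beat * clock_hz / count.cycles


# Hypothetical trace: one idle cycle, then 3 beats with one stall and one bubble.
trace = [(0, 1, 0), (1, 1, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (1, 1, 1)]
result = count_transaction(trace)
if result is not None:
    # Assumed 128-bit (16-byte) stream clocked at 300 MHz.
    rate = throughput_bytes_per_sec(result, bytes_per_beat=16, clock_hz=300e6)
    print(f"{result.beats} beats in {result.cycles} cycles: {rate / 1e9:.2f} GB/s")
```

In this trace, the stalled cycle and the bubble still count toward elapsed cycles, so the measured throughput reflects back-pressure on the stream rather than the peak rate of the interface.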


You must develop a lightweight PS application to run the entire system (e.g., to trigger the PL IP and start traffic). What this application does is application dependent. For example, some applications might rely on PL reset deassertion and not require any PS application. In this case, a simple PS application that prints Hello World can be used to trigger the system emulation.

Bare-metal applications are sufficient. Linux-based hardware emulation is also possible but does not provide additional benefits at this stage.