APU - 2020.2 English

Versal ACAP Design Guide (UG1273)

Document ID: UG1273
Release Date: 2021-03-26
Version: 2020.2 English

The application processing unit (APU) includes a dual-core Arm® Cortex®-A72 processor attached to a 1 MB unified L2 cache. The APU is designed for system control and compute-intensive applications that do not need real-time performance. The higher processing performance of the Versal ACAP places greater demands on the memory subsystem. To help meet these demands, the Versal ACAP increases the L1 instruction cache size from 32 KB to 48 KB and includes multiple DDR memory controllers (DDRMCs) and the network on chip (NoC), which improve main memory performance.

The following table shows the differences between the Cortex-A53 processors in Zynq® UltraScale+™ MPSoCs and the Cortex-A72 processors in Versal ACAPs.

Table 1. Cortex-A53 and Cortex-A72 Comparison
| Cortex-A53 | Cortex-A72 | Versal ACAP Benefits |
|---|---|---|
| Armv8-A architecture (64-bit and 32-bit operations) | Armv8-A architecture (64-bit and 32-bit operations) | No application code changes required |
| EL0-EL3 exception levels | EL0-EL3 exception levels | |
| Secure/non-secure operation | Secure/non-secure operation | |
| Advanced SIMD NEON floating-point unit | Advanced SIMD NEON floating-point unit | |
| Integrated memory manager | Integrated memory manager | |
| Power island control | Power island control | |
| Up to 1500 MHz | Up to 1700 MHz | Higher frequency |
| 2.23 DMIPS per MHz | 5.74 DMIPS per MHz | 2 times higher raw performance (per Arm benchmarks) |
| 3.65 SPEC2006int | 6.84 SPEC2006int | |
| 2-way superscalar | 3-way superscalar | More efficient instruction cycle |
| In-order execution | Out-of-order execution | Higher performance and fewer memory stalls |
| Power efficient | Improved power efficiency | 20% lower power |
| 8-stage pipeline | 15-stage pipeline | More instructions queued and executed |
| Conditional branch prediction | Two-level branch prediction | More cache hits and fewer memory fetches |
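
The performance rows in Table 1 can be turned into ratios to see how the individual figures relate to the "2 times higher raw performance" benefit. The following Python sketch is illustrative only: the `CoreSpec` class, the `compare` helper, and the "peak DMIPS" product (DMIPS per MHz multiplied by maximum frequency) are assumptions introduced here, not definitions from this guide. The numeric inputs are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSpec:
    """Hypothetical container for the per-core figures listed in Table 1."""
    name: str
    max_mhz: float
    dmips_per_mhz: float
    spec2006int: float

    def peak_dmips(self) -> float:
        # Assumption: peak throughput scales linearly with clock frequency.
        return self.max_mhz * self.dmips_per_mhz


# Values copied from Table 1.
A53 = CoreSpec("Cortex-A53", max_mhz=1500, dmips_per_mhz=2.23, spec2006int=3.65)
A72 = CoreSpec("Cortex-A72", max_mhz=1700, dmips_per_mhz=5.74, spec2006int=6.84)


def compare(old: CoreSpec, new: CoreSpec) -> dict:
    """Return new/old ratios for each performance metric."""
    return {
        "frequency": new.max_mhz / old.max_mhz,
        "dmips_per_mhz": new.dmips_per_mhz / old.dmips_per_mhz,
        "peak_dmips": new.peak_dmips() / old.peak_dmips(),
        "spec2006int": new.spec2006int / old.spec2006int,
    }


if __name__ == "__main__":
    for metric, ratio in compare(A53, A72).items():
        print(f"{metric}: {ratio:.2f}x")
```

Under these assumptions, the SPEC2006int ratio comes out at roughly 1.87, which is close to the factor of 2 quoted in the table, while the DMIPS-based ratios are higher. Actual application speedup depends on the workload and the memory subsystem, so these ratios are only a rough guide.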