The application processing unit (APU) includes a dual-core Arm® Cortex®-A72 processor attached to a 1 MB unified L2 cache. The APU is designed for system control and compute-intensive applications that do not need real-time performance. The higher processing performance of the Versal ACAP places greater demands on the memory subsystem. To help meet these demands, the Versal ACAP increases the L1 instruction cache size from 32 KB to 48 KB and adds multiple DDRMCs and the NoC, which improve main-memory performance.
The following table compares the Cortex-A53 processors in Zynq® UltraScale+™ MPSoCs with the Cortex-A72 processors in Versal ACAPs.
| Cortex-A53 | Cortex-A72 | Versal ACAP Benefits |
|---|---|---|
| Armv8-A architecture (64-bit and 32-bit operations) | Armv8-A architecture (64-bit and 32-bit operations) | No application code changes required |
| EL0-EL3 exception levels | EL0-EL3 exception levels | |
| Advanced SIMD NEON floating-point unit | Advanced SIMD NEON floating-point unit | |
| Integrated memory manager | Integrated memory manager | |
| Power island control | Power island control | |
| Up to 1500 MHz | Up to 1700 MHz | Higher frequency |
| 2.23 DMIPS per MHz | 5.74 DMIPS per MHz | 2 times higher raw performance (per Arm benchmarks) |
| 3.65 SPEC2006int | 6.84 SPEC2006int | |
| 2-way superscalar | 3-way superscalar | More efficient instruction cycle |
| In-order execution | Out-of-order execution | Higher performance and fewer memory stalls |
| Power efficient | Improved power efficiency | 20% lower power |
| 8-stage pipeline | 15-stage pipeline | More instructions queued and executed |
| Conditional branch prediction | Two-level branch prediction | More cache hits and fewer memory fetches |
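
The frequency, DMIPS per MHz, and SPEC2006int rows can be combined into a rough per-core comparison. The following Python sketch is illustrative only: the figures are copied from the table, while the `CORES` dictionary and the `peak_dmips` and `ratio` helpers are hypothetical names introduced here. It assumes that peak Dhrystone throughput is simply DMIPS per MHz multiplied by the maximum clock frequency, which gives a theoretical ceiling of roughly 3,345 DMIPS for the Cortex-A53 and 9,758 DMIPS for the Cortex-A72, not a measured application result.

```python
# Figures copied from the comparison table above. The dictionary layout and
# helper names are illustrative assumptions, not part of any vendor API.
CORES = {
    "Cortex-A53": {"dmips_per_mhz": 2.23, "max_mhz": 1500, "spec2006int": 3.65},
    "Cortex-A72": {"dmips_per_mhz": 5.74, "max_mhz": 1700, "spec2006int": 6.84},
}


def peak_dmips(core):
    """Theoretical single-core Dhrystone throughput at maximum frequency."""
    spec = CORES[core]
    return spec["dmips_per_mhz"] * spec["max_mhz"]


def ratio(metric):
    """Cortex-A72 value divided by the Cortex-A53 value for one table metric."""
    return CORES["Cortex-A72"][metric] / CORES["Cortex-A53"][metric]


if __name__ == "__main__":
    for name in CORES:
        print(f"{name}: {peak_dmips(name):.0f} DMIPS peak per core")
    print(f"Frequency ratio:   {ratio('max_mhz'):.2f}x")
    print(f"DMIPS/MHz ratio:   {ratio('dmips_per_mhz'):.2f}x")
    print(f"SPEC2006int ratio: {ratio('spec2006int'):.2f}x")
    a72_vs_a53 = peak_dmips("Cortex-A72") / peak_dmips("Cortex-A53")
    print(f"Peak DMIPS ratio:  {a72_vs_a53:.2f}x")
```

The SPEC2006int ratio is the one closest to the "2 times higher raw performance" entry in the table. The Dhrystone-based ratios come out higher because Dhrystone is a small synthetic benchmark, so real workloads are likely to land nearer the SPEC figure.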