This section provides an overview of the Versal ACAP hardware view components.
Key Hardware Components
The following list describes the largest hardware view components:
- AI Engine
- The AI Engine contains a scalar unit, a vector unit, load units, and a memory interface. The scalar unit contains a 32-bit scalar RISC processor with register files for general purpose, pointer, configuration, and backup registers, and a 32x32-bit scalar multiplier. The AI Engine also supports non-linear functions including sine/cosine, square root, and inverse square root. Three address generator units (AGUs) are available: two dedicated as load units, and one dedicated as a store unit. The vector unit contains a 512-bit fixed-point/integer vector unit. Devices with AI Engines contain a single-precision floating-point vector unit. Devices with an AIE-ML contain a fixed-point vector unit also used for Bfloat16 and FP32 support. The vector units in both the AI Engine and AIE-ML support concurrent operation on multiple vector lanes.
Each AI Engine has a dedicated, single-port, 16 KB program memory that is 128 bits wide and 1k deep. The program memory supports instruction compression and has ECC protection and reporting.
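The program memory geometry stated above (128 bits wide, 1k deep) can be checked against the 16 KB figure with a short calculation. This is a minimal sketch: the function and constant names are illustrative only, and "1k deep" is taken here to mean 1024 entries.

```python
# Figures from the text: the program memory is 128 bits wide and 1k deep.
WIDTH_BITS = 128
DEPTH_ENTRIES = 1024  # assumption: "1k" means 1024 entries


def program_memory_bytes(width_bits: int, depth_entries: int) -> int:
    """Return the capacity in bytes of a memory with the given geometry."""
    return (width_bits // 8) * depth_entries


capacity = program_memory_bytes(WIDTH_BITS, DEPTH_ENTRIES)
print(capacity // 1024, "KB")
```

Each entry holds 16 bytes, so 1024 entries give the 16 KB quoted for the program memory.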
- The application processing unit (APU) consists of Cortex-A72 processor cores, L1/L2 caches, and related functionality. The Cortex-A72 cores and caches are part of the Arm MPCore. Versal ACAP uses a dual-core Cortex-A72 processor system with a 1 MB L2 cache. The Cortex-A72 cores implement the Armv8 64-bit architecture. The Cortex-A72 MPCore does not have an integrated generic interrupt controller (GIC), so an external GIC IP is used. For more information, refer to "APU Processor Features" in Versal ACAP Technical Reference Manual (AM011).
- AXI Interconnect
- The advanced eXtensible interface (AXI) interconnect connects one or more memory-mapped AXI master devices to one or more memory-mapped peripheral devices. The AXI interfaces conform to the AMBA® AXI version 4 specifications from Arm, including the AXI4-Lite control register interface subset.
- The interconnect for cache coherent interconnect for accelerators (CCIX) and PCIe® (CPM) module is the primary PCIe interface for the processing system. There are two integrated blocks for PCIe in the CPM, supporting up to Gen4 x16. You can configure both of the integrated blocks for PCIe as endpoints. Furthermore, you can configure each integrated block as a root port that contains a direct memory access (DMA) controller. The CPM CCIX functionality allows a PL accelerator to act as a CCIX-compliant accelerator.
- The programmable logic (PL) is a scalable structure that includes adaptable engines and intelligent engines that can be used to construct accelerators, processors, or almost any other complex functionality. It is configured using the Vivado® tools. The architect determines the components to be available in the PL design. For example, the MicroBlaze processor is an IP core, so you can optionally add MicroBlaze processors to the design. For more information on the PL, see MicroBlaze Processor Reference Guide (UG984).
- The platform management controller (PMC) handles device management control functions such as device reset sequencing, initialization, boot, configuration, security, power management, dynamic function eXchange (DFX), health-monitoring, and error management. You can boot the device in either secure or non-secure mode. For more information, refer to "Platform Management Controller" in Versal ACAP Technical Reference Manual (AM011).
- NoC Interconnect
- The NoC is the main interconnect and contains a vertical
component (VNoC) and a horizontal component (HNoC).
- HNoC is integrated in the horizontal super row/region (HSR). The HSR includes blocks such as XPIO, hard DDR memory controller, PLL, HBM, and AI Engine.
- VNoC integration includes the global-clk-column. In SSI technology, VNoCs are connected across super logic region (SLR) boundaries. Microbumps and buffers for this reside in the Thin-HNoC. Configuration data between SSI technology master and slaves travels over the NoC.
- The real-time processing unit (RPU) is a dual-core Cortex-R5F processor, based on the Armv7-R architecture with a floating point unit, which can run as either two independent cores or in a lock-step configuration. For more information, refer to Platform Management in Versal ACAP Technical Reference Manual (AM011).
- System Memory Management Unit
- The system memory management unit (SMMU) supports memory virtualization for peripherals. The main functions of the SMMU include logical memory protection by performing address translation, transaction security state control, and blocking peripherals if configured to do so. These functions are performed with a combination of the seven translation buffer units (TBU 0 to 6). Four of these are in the path of incoming AXI interfaces outside of the FPD to the CCI. The translation and protection tables that are cached in the TBU are updated by the SMMU translation control unit (TCU).
For more information on the SMMU, see Chapter 43 in the Versal ACAP Technical Reference Manual (AM011).
- Cache Coherent Interconnect
- The cache coherent interconnect (CCI) is based on the Arm
CCI-500 with its snoop filter (SF) table feature. It provides tight memory
coherency between the APU L2 cache and a PL system cache using the ACE
interface protocol to support multiple heterogeneous processing
environments. It is part of the FPD interconnect.
For more information on the CCI, see Chapter 44 in the Versal ACAP Technical Reference Manual (AM011).
Additional Hardware Components
- Peripheral Controllers
- The input/output peripherals are present in the low power domain (LPD) and the PMC domain (PPD). The flash memory controllers (FMC) are located in the PMC. Their I/O signals are routed to device pins via the PMC MIO.
For more information, refer to the I/O Peripherals and FMC sections in Versal ACAP Technical Reference Manual (AM011).
- Interconnects and Buses
Versal ACAP has the following additional interconnects and buses:
- The NoC programming interface, a 32-bit
programming interface to the NoC and several attached units.
For more information, refer to Versal ACAP Programmable Network on Chip and Integrated Memory Controller LogiCORE IP Product Guide (PG313).
- The advanced peripheral bus (APB) is a 32-bit
single-word read/write programming interface. This bus is used
to access control registers in the functional units, i.e.,
subsystem units. These control registers are used to program the
functional units. The APB switch is used as the interconnect
switch in the following four areas:
- The configuration frame interface (CFI) transports PL and integrated hardware configuration information contained in the boot image from the PMC to its destination within the Versal device. CFI provides a dedicated high-bandwidth 128-bit bus to PL for configuration and readback. For more information, refer to the Programming Interfaces chapter in Versal ACAP Technical Reference Manual (AM011).
- System Watchdog Timer
- The system watchdog (SWDT) timer is used to detect and recover from various malfunctions. The watchdog timer can be used to prevent system lockup (when the software becomes trapped in a deadlock). For more information, refer to "System Watchdog Timer" in Versal ACAP Technical Reference Manual (AM011).
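The watchdog behavior described above, where healthy software restarts a countdown and a hung system lets it expire, can be illustrated with a small software model. This is only a sketch of the general technique: `WatchdogModel`, `kick`, `tick`, and `run` are hypothetical names, and the model does not represent the SWDT register interface or its actual timing.

```python
class WatchdogModel:
    """Toy model of a watchdog countdown (not the SWDT hardware interface)."""

    def __init__(self, timeout_ticks: int) -> None:
        if timeout_ticks <= 0:
            raise ValueError("timeout_ticks must be positive")
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def kick(self) -> None:
        """Healthy software calls this periodically to restart the countdown."""
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self) -> None:
        """Advance time by one tick and flag expiry when the count hits zero."""
        if self.expired:
            return
        self.remaining -= 1
        if self.remaining == 0:
            self.expired = True


def run(wdt: WatchdogModel, kicks: list) -> bool:
    """Replay a sequence of booleans (True means software kicked that tick)."""
    for kicked in kicks:
        if kicked:
            wdt.kick()
        wdt.tick()
    return wdt.expired


# Software that keeps kicking never lets the countdown expire.
healthy = run(WatchdogModel(3), [True] * 10)
# Software that stops kicking (for example, stuck in a deadlock) lets it expire.
hung = run(WatchdogModel(3), [True, True, False, False, False])
print(healthy, hung)
```

In a real system the expiry would trigger a recovery action such as a reset or an interrupt; the model only records that the timeout occurred.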
Versal ACAP has the following clocks:
- PMC and PS clocks
- CPM clocks
- NoC, AI Engine, and DDR memory controller clocks
- PL clocks: The PL includes its own clock arrays that are programmed when blocks are instantiated. The PL also includes programmable clock modules that can be driven by clocks from input pins and other sources.
The Versal device has the following memories:
- DDR memory
- Up to 4096 GB of RAM is supported. This DDR memory is external to the device.
- On-chip memory (OCM) in the PS
- This memory is 256 KB in size, and is accessible to the RPU and APU processors via the LPD OCM interconnect switch.
- Accelerator RAM
- The 4 MB accelerator RAM (XRAM) is available in the AI Core series. The XRAM is divided into four separate memory banks with four system interfaces: an AXI port from the LPD PS and three PL AXI ports. The XRAM supports simultaneous access by each port to its associated bank. It also allows full cross-bank access from any port to any bank. For details, refer to the XRAM Memory chapter in the Versal ACAP Technical Reference Manual (AM011).
- Tightly coupled memory (TCM) in the RPU
- This memory is 256 KB and is mainly used by the RPU but can be accessed by the APU.
- Battery-backed RAM (BBRAM)
- This memory can store the advanced encryption standard (AES) 256-bit key.
- Contains user memory to store multiple keys and security configuration settings.
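The on-chip memory sizes quoted in the list above can be gathered into a small table to make the arithmetic explicit. This is a sketch only: the dictionary and function names are illustrative, and the per-bank XRAM size assumes the four banks are equal in size, which the text does not state.

```python
KB = 1024
MB = 1024 * KB

# Sizes as stated in the list above.
ON_CHIP_MEMORIES = {
    "OCM": 256 * KB,   # on-chip memory in the PS
    "TCM": 256 * KB,   # tightly coupled memory in the RPU
    "XRAM": 4 * MB,    # accelerator RAM
}
XRAM_BANKS = 4


def xram_bank_bytes() -> int:
    """Per-bank XRAM size, assuming four equally sized banks."""
    return ON_CHIP_MEMORIES["XRAM"] // XRAM_BANKS


for name, size in ON_CHIP_MEMORIES.items():
    print(f"{name}: {size // KB} KB")
print(f"XRAM bank (assumed equal split): {xram_bank_bytes() // KB} KB")
```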
Versal ACAP has several layers of resets with overlapping effects. The highest-level resets are generally aligned with power domains, followed by power-island resets, and finally the individual functional unit resets. In some cases, functional units have local resets that affect part of the block. The reset layers are:
- Subsystem resets (power domains)
- Power-island resets
- Functional unit (block) resets
- Partial resets of a block (some cases)
For more information, refer to the "Resets" chapter in Versal ACAP Technical Reference Manual (AM011).
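The layered resets described above can be pictured as a tree in which asserting a reset at one layer also covers everything beneath it. The following is a simplified sketch under that assumption: `ResetNode` and all node names (`domain0`, `island0`, and so on) are hypothetical and do not correspond to actual Versal reset signals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResetNode:
    """One reset in a hypothetical hierarchy; children sit one layer below."""

    name: str
    layer: str
    children: list[ResetNode] = field(default_factory=list)

    def affected(self) -> list[str]:
        """Names of every node covered when this node's reset is asserted."""
        names = [self.name]
        for child in self.children:
            names.extend(child.affected())
        return names


# Illustrative hierarchy following the four layers listed above.
part = ResetNode("unit0.partA", "partial")
unit0 = ResetNode("unit0", "functional unit", [part])
unit1 = ResetNode("unit1", "functional unit")
island = ResetNode("island0", "power island", [unit0, unit1])
domain = ResetNode("domain0", "subsystem", [island])

print(domain.affected())  # a subsystem reset reaches every node below it
print(unit0.affected())   # a block reset reaches only that block and its parts
```

A strict tree is a simplification, since the text notes that resets have overlapping effects; the sketch only shows how a higher layer covers the layers below it.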
- The Versal device
includes the following hardware components for virtualization:
- CPU virtualization
- Memory virtualization
For more information, refer to "Memory Virtualization" in Versal ACAP Technical Reference Manual (AM011).
- Security and Safety
- The Versal device has the following
security management and safety features:
- Secure key storage and management
- Tamper monitoring and response
- User access to Xilinx hardware cryptographic accelerators
- Xilinx memory protection unit (XMPU) and Xilinx peripheral protection unit (XPPU) provide hardware-enforced isolation.
For more information, refer to "Platform Management Controller" in Versal ACAP Technical Reference Manual (AM011), Security, and Versal ACAP Security Manual (UG1508). This manual requires an active NDA to download from the Design Security Lounge.
For XMPU and XPPU, refer to "Memory Protection" in Versal ACAP Technical Reference Manual (AM011).