AIE-ML Array Overview

Versal ACAP AIE-ML Architecture Manual (AM020)

1.0 English

The following figure shows the high-level block diagram of a Versal adaptive compute acceleration platform (ACAP) with an AIE-ML array. The device consists of the processing system (PS), programmable logic (PL), and the AIE-ML array.

Figure 1. Versal Device (with AIE-ML) Top-Level Block Diagram

The AIE-ML array is the top-level hierarchy of the AIE-ML architecture. It integrates a two-dimensional array of AIE-ML tiles. Each AIE-ML tile integrates a very long instruction word (VLIW) processor, integrated memory, and interconnects for streaming, configuration, and debug. The AIE-ML array also introduces a separate functional block, the memory tile, which significantly reduces the PL resources (LUTs and URAMs) needed for ML applications. Each memory tile has 512 KB of data memory, 12 DMA channels (eight of which can access neighboring memory tiles), and stream interfaces. Depending on the device, there can be one or two rows of memory tiles. The AIE-ML array interface enables the AIE-ML array to communicate with the rest of the Versal device through the NoC or directly with the PL. The AIE-ML array also interfaces to the processing system (PS) and platform management controller (PMC) through the NoC.
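As a rough illustration of how the memory tile figures quoted above add up, the following Python sketch totals the shared memory and DMA channels for a given memory tile layout. The per-tile constants come from the text; the class name `MemTileLayout`, its fields, and the column count in the usage note are hypothetical and do not describe any particular device.

```python
from dataclasses import dataclass

# Per-memory-tile figures taken from the overview text.
MEM_TILE_DATA_MEMORY_KB = 512
MEM_TILE_DMA_CHANNELS = 12
MEM_TILE_NEIGHBOR_DMA_CHANNELS = 8  # channels that can access neighboring memory tiles


@dataclass(frozen=True)
class MemTileLayout:
    """Hypothetical description of the memory tile rows in an AIE-ML array."""

    columns: int  # assumption: number of memory tiles per row (device specific)
    rows: int     # the text states one or two rows, depending on the device

    def __post_init__(self) -> None:
        if self.columns < 1:
            raise ValueError("columns must be a positive integer")
        if self.rows not in (1, 2):
            raise ValueError("the overview describes one or two rows of memory tiles")

    @property
    def tile_count(self) -> int:
        return self.columns * self.rows

    @property
    def total_data_memory_kb(self) -> int:
        # Aggregate shared data memory across all memory tiles.
        return self.tile_count * MEM_TILE_DATA_MEMORY_KB

    @property
    def total_dma_channels(self) -> int:
        return self.tile_count * MEM_TILE_DMA_CHANNELS

    @property
    def total_neighbor_dma_channels(self) -> int:
        return self.tile_count * MEM_TILE_NEIGHBOR_DMA_CHANNELS
```

For example, `MemTileLayout(columns=4, rows=2)` (an invented layout) describes 8 memory tiles with 4096 KB of shared data memory in total. Actual column counts vary by device and should be taken from the device data sheet.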

Versal® ACAP devices that integrate AIE-ML tiles have access to the following types of memory:

  • External DDR memory (via NoC)
  • On-chip PL memory resources (URAM/block RAM)
  • On-chip shared memory in AIE-ML memory tiles
  • On-chip local data memory in AIE-ML tiles

Depending on the use case, the data and weights move through the memory hierarchy in different ways.
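One way to reason about such data movement is to treat the four memory types listed above as ordered levels, from farthest to nearest the AIE-ML processor. The Python sketch below does this under two assumptions that are not stated in the text: that the list order reflects distance from the processor, and that an inbound transfer may skip levels. The names `MemoryLevel` and `moves_toward_compute`, and the example paths, are hypothetical.

```python
from enum import IntEnum
from typing import Sequence


class MemoryLevel(IntEnum):
    """Memory types from the overview, ordered as listed (assumed far to near)."""

    EXTERNAL_DDR = 0   # external DDR memory, reached via the NoC
    PL_MEMORY = 1      # on-chip PL memory (URAM/block RAM)
    MEMORY_TILE = 2    # on-chip shared memory in AIE-ML memory tiles
    TILE_LOCAL = 3     # on-chip local data memory in AIE-ML tiles


def moves_toward_compute(path: Sequence[MemoryLevel]) -> bool:
    """Return True if every hop in the path moves strictly closer to the tile.

    Levels may be skipped; a path with fewer than two levels has no hops
    and is trivially accepted.
    """
    return all(src < dst for src, dst in zip(path, path[1:]))


# Illustrative paths only; real data and weight movement depends on the use case.
weights_path = [MemoryLevel.EXTERNAL_DDR, MemoryLevel.MEMORY_TILE, MemoryLevel.TILE_LOCAL]
streamed_path = [MemoryLevel.PL_MEMORY, MemoryLevel.TILE_LOCAL]
```

Both example paths satisfy `moves_toward_compute`, while a reversed path (results written back toward DDR) does not. Such outbound paths are equally valid in practice and would need the opposite check.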