AIE-ML Tile Architecture

Versal ACAP AIE-ML Architecture Manual (AM020)

1.0 English

The following figure shows the top-level block diagram of the AIE-ML tile, including its key building blocks and connectivity.

Figure 1. AIE-ML Tile Block Diagram

The AIE-ML tile consists of the following high-level modules:

  • Tile interconnect
  • AIE-ML
  • AIE-ML memory module

The tile interconnect module handles AXI4-Stream and memory-mapped AXI4 input/output traffic. Both interconnects are described further in the following sections.

The AIE-ML memory module has 64 KB of data memory divided into eight memory banks, a memory interface, DMA, and locks. There is a DMA in both the incoming and outgoing directions, and each memory module contains a Locks block. The AIE-ML can access memory modules in all four directions as one contiguous block of memory. The memory interface routes each access to the memory module in the correct direction based on the address generated by the AIE-ML.

The AIE-ML has a scalar datapath, a vector datapath, three address generators, and 16 KB of program memory. It also has cascade stream access for forwarding accumulator output to the next AIE-ML tile. The AIE-ML is described in more detail in AIE-ML Architecture.

Both the AIE-ML and the AIE-ML memory module have control, debug, and trace units. Some of these units are described later in this chapter:

  • Control and status registers
  • Events, event broadcast, and event actions
  • Performance counters for profiling and timers
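
The memory organization described above (64 KB per memory module, eight banks per module, and up to four modules seen by the AIE-ML as one contiguous block) can be illustrated with a small address-decoding sketch. The sizes come from the text. The order of the modules inside the contiguous window, the non-interleaved bank layout, and the names `DIRECTION_ORDER` and `decode_address` are assumptions made for illustration only, not the documented address map.

```python
# Illustrative model of the data-memory organization described in the text.
# Sizes are taken from the text; the module order and the contiguous
# (non-interleaved) bank layout are ASSUMPTIONS for this sketch.

MODULE_BYTES = 64 * 1024                        # 64 KB of data memory per module
BANKS_PER_MODULE = 8                            # eight memory banks per module
BANK_BYTES = MODULE_BYTES // BANKS_PER_MODULE   # 8 KB per bank

# Hypothetical ordering of the four modules inside the contiguous window.
DIRECTION_ORDER = ("south", "west", "north", "own")
WINDOW_BYTES = MODULE_BYTES * len(DIRECTION_ORDER)  # 256 KB in total


def decode_address(addr):
    """Split a byte address into (module direction, bank index, bank offset)."""
    if not 0 <= addr < WINDOW_BYTES:
        raise ValueError(f"address {addr:#x} is outside the {WINDOW_BYTES}-byte window")
    # The upper part of the address selects the memory module (direction).
    module_index, module_offset = divmod(addr, MODULE_BYTES)
    # The remainder selects a bank and a byte offset inside that bank.
    bank_index, bank_offset = divmod(module_offset, BANK_BYTES)
    return DIRECTION_ORDER[module_index], bank_index, bank_offset
```

Under these assumptions, `decode_address(0x10000)` selects bank 0 of the second module in the assumed order. The point of the sketch is only that a single address carries enough information for the memory interface to choose a direction and a bank.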

The following figure shows the AIE-ML array with the AIE-ML tiles and the dedicated interconnect units arrayed together. Sharing data through local memory between neighboring AIE-MLs is the main mechanism for data movement within the AIE-ML array. Each AIE-ML can access up to four memory modules:

  • Its own
  • The module on the north
  • The module on the south
  • The module on the west

AIE-MLs on the edges of the array have access to one or two fewer memory modules.
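
The neighbor rule above can be sketched as a small helper that lists the memory modules reachable from a given tile. The coordinate convention (column index increasing eastward, row index increasing northward) and the function name `accessible_memory_modules` are assumptions made for this sketch. It models only AIE-ML tiles and ignores any other tile types at the array boundary.

```python
def accessible_memory_modules(col, row, num_cols, num_rows):
    """Return {label: (col, row)} for the memory modules a tile's AIE-ML can access.

    Assumed convention: col grows toward the east, row grows toward the north.
    """
    if not (0 <= col < num_cols and 0 <= row < num_rows):
        raise ValueError("tile coordinates fall outside the array")
    # Candidates from the text: own module plus north, south, and west neighbors.
    candidates = {
        "own": (col, row),
        "north": (col, row + 1),
        "south": (col, row - 1),
        "west": (col - 1, row),
    }
    # Neighbors that fall outside the array do not exist, so edge tiles see fewer.
    return {
        label: (c, r)
        for label, (c, r) in candidates.items()
        if 0 <= c < num_cols and 0 <= r < num_rows
    }
```

In a hypothetical 4x4 array, an interior tile such as (1, 1) reaches four modules, a west-edge tile such as (0, 1) reaches three, and the south-west corner (0, 0) reaches two.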

Figure 2. AIE-ML Array

Together with the flexible and dedicated interconnects, the AIE-ML array provides deterministic performance, low latency, and high bandwidth. Because the architecture is modular and scalable, compute power grows as more tiles are added to the array.

The AIE-ML has both horizontal and vertical cascade connections, directed from north to south and from west to east. The cascade start points and end points are tied off at the array edges.
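
The cascade directions described above can be sketched as a helper that returns the tiles a cascade output can feed. The coordinate convention (column index increasing eastward, row index increasing northward) and the name `cascade_successors` are assumptions made for this sketch. A tied-off end point is modeled simply as the absence of a successor.

```python
def cascade_successors(col, row, num_cols, num_rows):
    """Return [(direction, (col, row))] for the tiles fed by this tile's cascade output.

    Assumed convention: col grows toward the east, row grows toward the north.
    """
    if not (0 <= col < num_cols and 0 <= row < num_rows):
        raise ValueError("tile coordinates fall outside the array")
    successors = []
    # Horizontal cascade runs from west to east.
    if col + 1 < num_cols:
        successors.append(("east", (col + 1, row)))
    # Vertical cascade runs from north to south.
    if row - 1 >= 0:
        successors.append(("south", (col, row - 1)))
    # At the east and south edges the cascade is tied off: nothing is appended.
    return successors
```

In a hypothetical 4x4 array, tile (1, 2) has both an east and a south successor, while the south-east corner tile (3, 0) has none.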