AI Engine Array Interface

Versal Adaptive SoC AI Engine Architecture Manual (AM009)

Document ID: AM009
Release Date: 2023-08-18
Revision: 1.3 English

The AI Engine array interface consists of PL and NoC interface tiles. There is also one configuration interface tile per device. The following figure shows the array interface connectivity that the AI Engine array uses to communicate with other blocks in the Versal architecture. The figure also specifies the number of streams in the AXI4-Stream interconnect that interface with the PL, NoC, or AI Engine tiles, and the number of streams between the AXI4-Stream switches.

Tip: The exact number of PL and NoC interface tiles is device specific. The Versal Architecture and Product Data Sheet: Overview (DS950) lists the size of the AI Engine array.

Figure 1. AI Engine Array Interface Topology

Note: The AI Engine FMAX is 1 GHz for -1L speed grade devices. Set the PL clock to half that speed, 500 MHz. There is also a clock domain crossing at the NoC interface tile between the AI Engine clock and the NoC clock.
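
The clock relationship in the note can be checked with a little arithmetic. The following is a minimal sketch, not vendor code: it assumes one transfer per clock cycle and takes the stream widths (64 bits on the PL side, 32 bits on the AI Engine side) from Table 1 in this section. The function name `stream_bandwidth_gbytes` is hypothetical.

```python
def stream_bandwidth_gbytes(width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in GB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 / 1e9


# PL-side stream: 64 bits wide at the 500 MHz PL clock (-1L speed grade).
pl_side = stream_bandwidth_gbytes(64, 500)

# AI Engine-side stream: 32 bits wide at the 1 GHz AI Engine clock.
aie_side = stream_bandwidth_gbytes(32, 1000)

print(f"PL side:        {pl_side:.1f} GB/s per stream")
print(f"AI Engine side: {aie_side:.1f} GB/s per stream")
```

Both values come out to 4 GB/s, which illustrates why a PL clock at half the AI Engine clock pairs with a stream that is twice as wide.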

The types of interfaces to the PL and NoC are:

  • Memory-mapped AXI4 interface: the communication channel runs from the NoC slave unit (NSU) to the AI Engine, which acts as the slave
  • AXI4-Stream interconnect has three types of interfaces:
    • Bi-directional connection to the PL streaming interface
    • Connection to the array interface DMA that generates traffic into the NoC using a memory-mapped AXI4 interface
    • Direct connection to the NoC streaming interfaces (NSU and NMU)

The AI Engine array interface tiles manage two high-performance interfaces:

  • AI Engine to PL
  • AI Engine to NoC

The following tables summarize the bandwidth of the AI Engine array interface with the PL, the NoC, and the AI Engine tile. The bandwidth is specified per AI Engine column for -1L speed grade devices. The number of connections per column is lower from the AXI4-Stream switch to the AI Engine tile (6) than from the PL to the AI Engine array interface (8). The reduction supports the horizontally connected stream switches, which provide additional horizontal routing capability. The total bandwidth for the various devices across speed grades can be found in the Versal AI Core Series Data Sheet: DC and AC Switching Characteristics (DS957).

Table 1. AI Engine Array Interface to PL Interface Bandwidth Performance

| Connection Type | Number of Connections | Data Width (bits) | Clock Domain | Bandwidth per Connection (GB/s) | Aggregate Bandwidth (GB/s) |
|---|---|---|---|---|---|
| PL to AI Engine array interface | 8 | 64 (note 1) | PL (500 MHz) | 4 | 32 |
| AI Engine array interface to PL | 6 | 64 | PL (500 MHz) | 4 | 24 |
| AI Engine array interface to AXI4-Stream switch | 8 | 32 | AI Engine (1 GHz) | 4 | 32 |
| AXI4-Stream switch to AI Engine array interface | 6 | 32 | AI Engine (1 GHz) | 4 | 24 |
| Horizontal interface between AXI4-Stream switches (note 2) | 4 | 32 | AI Engine (1 GHz) | 4 | 16 |

  1. All streams to and from the PL are 64 bits wide on the PL side. The streams can be converted to 32 bits wide, but then only one 32-bit word is valid in each 64-bit transfer. Two 64-bit streams can be combined to form a 128-bit stream, but the number of connections is halved.
  2. The aggregate bandwidth shown is in the east/west direction. There are two sets of connections going in and out of each AXI4-Stream switch.
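
The effect of the width options in note 1 on per-column bandwidth can be sketched as follows. This is an illustrative calculation under stated assumptions, not a vendor formula: it assumes the PL clock stays at 500 MHz, that every connection transfers one beat per cycle, and it uses the eight PL to AI Engine connections from Table 1. The names `PL_CLOCK_MHZ`, `PL_TO_AIE_CONNECTIONS`, and `pl_to_aie_aggregate_gbytes` are hypothetical.

```python
PL_CLOCK_MHZ = 500         # assumed PL clock for the -1L speed grade
PL_TO_AIE_CONNECTIONS = 8  # per column, from Table 1


def pl_to_aie_aggregate_gbytes(mode_bits: int) -> float:
    """Peak PL to AI Engine bandwidth per column for a given stream width mode."""
    if mode_bits == 32:
        # Only one 32-bit word is valid in each 64-bit transfer.
        connections, valid_bits = PL_TO_AIE_CONNECTIONS, 32
    elif mode_bits == 64:
        connections, valid_bits = PL_TO_AIE_CONNECTIONS, 64
    elif mode_bits == 128:
        # Two 64-bit streams are combined, so the connection count is halved.
        connections, valid_bits = PL_TO_AIE_CONNECTIONS // 2, 128
    else:
        raise ValueError("mode_bits must be 32, 64, or 128")
    per_connection = valid_bits / 8 * PL_CLOCK_MHZ / 1000  # GB/s
    return connections * per_connection


for mode in (32, 64, 128):
    print(f"{mode:>3}-bit mode: {pl_to_aie_aggregate_gbytes(mode):.1f} GB/s per column")
```

Under these assumptions, the 64-bit and 128-bit modes both reach the 32 GB/s aggregate listed in Table 1, while the 32-bit mode reaches half of it because half of each 64-bit transfer carries no valid data.
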

Table 2. AI Engine Array Interface to NoC Interface Bandwidth Performance

| Connection Type | Number of Connections | Data Width (bits) | Clock Domain | Bandwidth per Connection (GB/s) | Aggregate Bandwidth (GB/s) |
|---|---|---|---|---|---|
| AI Engine to NoC (NoC side) | 1 | 128 | NoC interface (960 MHz) (note 1) | 16 | 16 |
| AI Engine to NoC (AI Engine side) | 4 | 32 | AI Engine (1 GHz) | 4 | 16 |
| NoC to AI Engine (NoC side) | 1 | 128 | NoC interface (960 MHz) (note 1) | 16 | 16 |
| NoC to AI Engine (AI Engine side) | 4 | 32 | AI Engine (1 GHz) | 4 | 16 |

  1. The frequency is based on the -1L speed grade.
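
The two sides of the NoC clock domain crossing can be compared with the same width-times-clock arithmetic. This is a hedged sketch that assumes one transfer per cycle on every connection; the helper name `peak_gbytes_per_s` is hypothetical. The raw product on the NoC side comes out slightly below the 16 GB/s listed in Table 2, so the table value is presumably a rounded or nominal figure, which is an inference here rather than something the manual states.

```python
def peak_gbytes_per_s(connections: int, width_bits: int, clock_ghz: float) -> float:
    """Raw peak bandwidth in GB/s, assuming one transfer per cycle per connection."""
    return connections * (width_bits / 8) * clock_ghz


# AI Engine side: four 32-bit streams in the 1 GHz AI Engine clock domain.
aie_side = peak_gbytes_per_s(4, 32, 1.0)

# NoC side: one 128-bit interface in the 960 MHz NoC interface clock domain.
noc_side = peak_gbytes_per_s(1, 128, 0.96)

print(f"AI Engine side: {aie_side:.2f} GB/s")
print(f"NoC side:       {noc_side:.2f} GB/s")
```
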

Table 3. AI Engine Array Interface to AI Engine Tile Bandwidth Performance

| Connection Type | Number of Connections | Data Width (bits) | Clock Domain | Bandwidth per Connection (GB/s) | Aggregate Bandwidth (GB/s) |
|---|---|---|---|---|---|
| AXI4-Stream switch to AI Engine tile | 6 | 32 | AI Engine (1 GHz) | 4 | 24 |
| AI Engine tile to AXI4-Stream switch | 4 | 32 | AI Engine (1 GHz) | 4 | 16 |
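
In all three tables, the aggregate bandwidth equals the number of connections multiplied by the bandwidth per connection. The sketch below records the table rows as data and checks that relationship. It is only an illustration: the `Row` type and the `inconsistent_rows` helper are hypothetical names, and the values are copied from Tables 1 to 3 of this section.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    connections: int
    per_connection_gbytes: int
    aggregate_gbytes: int


ROWS = [
    # Table 1: array interface to PL
    Row("PL to AI Engine array interface", 8, 4, 32),
    Row("AI Engine array interface to PL", 6, 4, 24),
    Row("AI Engine array interface to AXI4-Stream switch", 8, 4, 32),
    Row("AXI4-Stream switch to AI Engine array interface", 6, 4, 24),
    Row("Horizontal interface between AXI4-Stream switches", 4, 4, 16),
    # Table 2: array interface to NoC
    Row("AI Engine to NoC (NoC side)", 1, 16, 16),
    Row("AI Engine to NoC (AI Engine side)", 4, 4, 16),
    Row("NoC to AI Engine (NoC side)", 1, 16, 16),
    Row("NoC to AI Engine (AI Engine side)", 4, 4, 16),
    # Table 3: array interface to AI Engine tile
    Row("AXI4-Stream switch to AI Engine tile", 6, 4, 24),
    Row("AI Engine tile to AXI4-Stream switch", 4, 4, 16),
]


def inconsistent_rows(rows: list[Row]) -> list[str]:
    """Return the names of rows whose aggregate is not connections x per-connection."""
    return [
        r.name
        for r in rows
        if r.connections * r.per_connection_gbytes != r.aggregate_gbytes
    ]


print(inconsistent_rows(ROWS))
```

An empty list means every row is internally consistent. The same data also makes the asymmetry visible: each column has more bandwidth into the AI Engine tile (24 GB/s) than out of it (16 GB/s).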

The following sections contain additional AI Engine array interface descriptions. The AI Engine tiles are described in the AI Engine Tile Architecture chapter.