Register Files

Versal ACAP AIE-ML Architecture Manual (AM020)

1.0 English

The AIE-ML has several types of registers. Some of the registers are used in different functional units. This section describes the various types of registers.

Scalar Registers

Scalar registers include configuration registers. See the following table for register descriptions.

Table 1. Scalar Registers

Syntax     Number of bits  Description
r0..r31    32 bits         General-purpose registers
m0..m7     20 bits         Modifier registers
p0..p7     20 bits         Pointer registers

Special Registers

Table 2. Special Registers

Syntax     Number of bits  Description
dn0..dn7   20 bits         AGU dimension size register
dj0..dj7   20 bits         AGU dimension stride (jump) register
dc0..dc7   20 bits         AGU dimension count register
s0..s3     6 bits          Shift control
sp         20 bits         Stack pointer
lr         20 bits         Link register
pc         20 bits         Program counter
fc         20 bits         Fetch counter
           32 bits         Status register (see note 1)
           32 bits         Mode control register (see note 1)
ls         20 bits         Loop start
le         20 bits         Loop end
lc         32 bits         Loop count
lci        32 bits         Loop count (PCU)
S          8 bits          Shift control

  1. The status and control registers each consist of separate registers of only a few bits. Only the debug interface accesses these individual registers together, as one 32-bit wide SR register and one 32-bit wide CR register.

Vector Registers

Vector registers are wide so that they can serve as operand storage for SIMD instructions. The 256-bit registers are prefixed with W. There are 24 x 256-bit registers: wln and whn, n = 0..11. Two W registers can be grouped to form a 512-bit register prefixed with X. Two X registers can in turn be grouped to form a 1024-bit register prefixed with Y; Y2 … Y5 are aliases for X4 … X11.

Table 3. Vector Registers

256-bit  512-bit  1024-bit
wl0      x0
wl1      x1
wl2      x2
wl3      x3
wl4      x4       y2
wl5      x5
wl6      x6       y3
wl7      x7
wl8      x8       y4
wl9      x9
wl10     x10      y5
wl11     x11
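
The grouping described above can be sketched as a small name-expansion helper. This is only an illustration of the aliasing stated in the text (xn covers wln and whn; y2 to y5 cover x4 to x11 in pairs). The function name w_members is hypothetical, and the ordering of the returned list (wl before wh) is an assumption, not something the manual states here.

```python
import re

NUM_X = 12  # x0..x11, one per wl/wh pair (24 W registers in total)


def w_members(name: str) -> list[str]:
    """Return the 256-bit W registers covered by a vector register name.

    Hypothetical helper: it only encodes the grouping rules from the text.
    """
    m = re.fullmatch(r"(wl|wh|x|y)(\d+)", name.lower())
    if m is None:
        raise ValueError(f"not a vector register name: {name!r}")
    kind, idx = m.group(1), int(m.group(2))

    if kind in ("wl", "wh"):
        if idx >= NUM_X:
            raise ValueError(f"{name}: index must be 0..{NUM_X - 1}")
        return [f"{kind}{idx}"]

    if kind == "x":
        if idx >= NUM_X:
            raise ValueError(f"{name}: index must be 0..{NUM_X - 1}")
        # Assumption: wl is listed first; the text does not fix an order.
        return [f"wl{idx}", f"wh{idx}"]

    # kind == "y": the text only names y2..y5, aliased onto x4..x11.
    if not 2 <= idx <= 5:
        raise ValueError(f"{name}: only y2..y5 are described in the text")
    return w_members(f"x{2 * idx}") + w_members(f"x{2 * idx + 1}")
```

For example, w_members("y2") expands to the four W registers behind x4 and x5, which matches the 1024-bit width (4 x 256 bits) given for Y registers.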

Mask Registers

In addition to the vector registers, there are 4 x 128-bit mask registers (Q0 to Q3) used for sparsity.

Accumulator Registers

Accumulator registers store the results of the vector data path. Each is 256 bits wide and can be viewed as eight lanes of 32-bit data or four lanes of 64-bit data. The accumulator registers are prefixed with am. Two am registers are aliased to form a 512-bit register prefixed with bm, and two bm registers can be aliased to form a 1024-bit register prefixed with cm.

Table 4. Accumulator Registers

256-bit  512-bit  1024-bit
amll0    bml0     cm0
amhl1    bmh0
...      ...      ...
...      ...
amll8    bml8     cm8
amhl8    bmh8
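
The two lane views of a 256-bit accumulator can be illustrated with plain integer arithmetic. The sketch below models the register contents as a Python integer and splits it into equal-width lanes. The function name split_lanes is hypothetical, and placing lane 0 in the least-significant bits is an assumption made for the example; the text above does not specify the lane ordering.

```python
def split_lanes(value: int, lane_bits: int, total_bits: int = 256) -> list[int]:
    """Split a register value into unsigned lanes of lane_bits each.

    Assumption: lane 0 occupies the least-significant bits.
    """
    if lane_bits <= 0 or total_bits % lane_bits != 0:
        raise ValueError("lane width must evenly divide the register width")
    if not 0 <= value < (1 << total_bits):
        raise ValueError("value does not fit in the register")
    mask = (1 << lane_bits) - 1
    return [(value >> (i * lane_bits)) & mask
            for i in range(total_bits // lane_bits)]


# Build a 256-bit value whose 32-bit lanes hold 1, 2, ..., 8.
acc = sum((i + 1) << (32 * i) for i in range(8))

lanes32 = split_lanes(acc, 32)  # eight lanes of 32-bit data
lanes64 = split_lanes(acc, 64)  # four lanes of 64-bit data
```

Under this assumption, each 64-bit lane is simply two adjacent 32-bit lanes read as one value, so the same 256 bits yield eight entries in lanes32 and four in lanes64.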