This section covers the register move capabilities of the AIE-ML (refer to the Register Files section for a description of the register type naming).

- Scalar to scalar:
  - Move scalar values between R, M, P, and special registers.
  - Move immediate values to R, M, P, and special registers.
  - Move a scalar value to/from an AXI4-Stream.

- Vector to vector: Move one 128-bit V-register to an arbitrary V-register in one cycle. The same applies to the 256-bit W-registers and the 512-bit X-registers. In all cases, the source and destination registers must be the same size.
- Accumulator to accumulator: Move one 512-bit accumulator (AM) register to another AM-register in one cycle. There is also a BM-register to BM-register accumulator move (1024 bits).
- Vector to accumulator: There are three possibilities:
  - The upshift path takes 16-bit or 32-bit vector values and writes them into an accumulator.
  - The normal multiplication datapath multiplies each value by a constant value of 1.
  - A move between BM and X registers.

- Accumulator to vector: The shift-round-saturate datapath moves the accumulator to a vector register. There is also a direct register move from an accumulator to a vector register.
- Accumulator to cascade stream and cascade to accumulator: The cascade stream connects the AIE-MLs in the array in a chain and allows each AIE-ML to transfer a 512-bit accumulator register to the next one. A small two-deep, 512-bit wide FIFO on both the input and output streams allows up to four values to be stored in the FIFOs between the AIE-MLs.
- Scalar to vector: Moves a scalar value from an R-register to a vector register. Unlike the AI Engine, where most operations had 128-bit granularity (the shift element operation being the exception), the AIE-ML only allows operations on 512-bit registers.
- Vector to scalar: Extracts an arbitrary 8-bit, 16-bit, or 32-bit value from a 512-bit vector register and writes the result into a scalar R-register.
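
The vector to accumulator, accumulator to vector, and vector to scalar moves listed above can be illustrated with a small behavioral sketch. This is not AIE-ML code and does not use any AMD API: plain Python integers stand in for hardware lanes, and every function name (`upshift`, `shift_round_saturate`, `extract`, `pack`) is hypothetical. The accumulator lane width, the overflow handling in the upshift, the round-half-up rounding mode, and the lane ordering (lane 0 in the low bits) are all assumptions chosen to keep the example short.

```python
# Behavioral sketch only: Python integers model register lanes.
# Lane widths, rounding mode, overflow handling, and lane order are assumptions.

def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits`-wide two's-complement lane."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def upshift(vector, shift, acc_bits=32):
    """Vector to accumulator: left-shift each lane into a wider accumulator lane."""
    return [saturate(v << shift, acc_bits) for v in vector]


def shift_round_saturate(acc, shift, out_bits=16):
    """Accumulator to vector: round, shift right, then saturate each lane."""
    half = (1 << (shift - 1)) if shift > 0 else 0  # round half up (assumed mode)
    return [saturate((a + half) >> shift, out_bits) for a in acc]


def extract(register, lane, lane_bits, reg_bits=512):
    """Vector to scalar: read one signed lane from a register held as an integer."""
    assert lane_bits in (8, 16, 32)
    assert 0 <= lane < reg_bits // lane_bits
    raw = (register >> (lane * lane_bits)) & ((1 << lane_bits) - 1)
    # Reinterpret the unsigned field as a two's-complement value.
    return raw - (1 << lane_bits) if raw >> (lane_bits - 1) else raw


def pack(values, lane_bits):
    """Helper: pack signed lane values into one integer, lane 0 in the low bits."""
    mask = (1 << lane_bits) - 1
    return sum((v & mask) << (i * lane_bits) for i, v in enumerate(values))
```

As a usage example, `shift_round_saturate(upshift([1, -2, 3], 4), 4)` returns the original lanes, because the down shift undoes the up shift exactly. `extract(pack([5, -1, 300], 16), 1, 16)` reads lane 1 back as `-1`, while reading the same register with `lane_bits=8` or `lane_bits=32` reinterprets the same bits at a different granularity.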