Differences from Previous Generations

Versal ACAP DSP Engine Architecture Manual (AM004)

Document ID: AM004
Release Date: 2022-09-11
Revision: 1.2.1 English

DSP58 is the sixth version of the Xilinx DSP. It is a superset of the UltraScale™ architecture DSP48E2 and is fully backward compatible with it. In addition, the Versal® architecture DSP supports floating-point operations and includes logic that pairs two back-to-back DSP58s in a tile as an 18-bit complex multiplier.

DSP58 INT8 Vector Dot Product Mode

  • The INT8 multiplier mode implements a dot product unit: the multiplier is split into three smaller multipliers, and their products are summed to feed the post-adder. The output of each smaller multiplier can be negated.
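
The arithmetic described above can be sketched in Python as a behavioral model only. The function name `int8_dot3` and the `negate` argument are hypothetical, and the sketch assumes both operands of every smaller multiplier are signed 8-bit values; this section does not give the exact operand widths or the names of the control signals.

```python
# Behavioral sketch of a three-element INT8 dot product with per-product negation.
# Assumption: every operand is a signed 8-bit integer (-128..127).
INT8_MIN, INT8_MAX = -128, 127


def int8_dot3(a, b, negate=(False, False, False)):
    """Sum three INT8 products, optionally negating each one before the sum."""
    if not (len(a) == len(b) == len(negate) == 3):
        raise ValueError("expected exactly three elements per operand")
    total = 0
    for ai, bi, neg in zip(a, b, negate):
        for value in (ai, bi):
            if not INT8_MIN <= value <= INT8_MAX:
                raise ValueError(f"{value} is outside the signed 8-bit range")
        product = ai * bi              # one of the three smaller multipliers
        total += -product if neg else product
    return total                       # value that would feed the post-adder


print(int8_dot3([1, -2, 3], [4, 5, -6]))                               # -24
print(int8_dot3([1, -2, 3], [4, 5, -6], negate=(False, True, False)))  # -4
```

Negating the second product turns 4 - 10 - 18 into 4 + 10 - 18, which is why the second call prints -4.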

DSP58

  • 27 × 24 multiplier:
    • B operand is increased from 18-bit to 24-bit.
  • 58-bit logic unit:
    • C operand is increased from 48-bit to 58-bit.
  • 116-bit wide XOR function (increased from 96-bit):
    • Wide XOR selectable for XOR12, XOR22 (new), XOR24, XOR34 (new), XOR58 (new), and XOR116 (new).
      Note: XOR48 and XOR96 are supported when migrating from the UltraScale architecture.
  • The A input is a 34-bit bus. The lower 27 bits feed the A input of the multiplier and the entire 34-bit input forms the upper 34 bits of the 58-bit A:B concatenate internal bus.
  • The built-in right-shift is increased to 23 bits.
    Note: The 17-bit right-shift is supported when migrating from the UltraScale architecture.
  • The sign of the multiplier output (X and Y together) can be changed by the negate pins.
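
The bit-level relationship between the 34-bit A input, the 27-bit multiplier operand, and the 58-bit A:B concatenation can be illustrated with a short Python sketch. The helper names are hypothetical, and interpreting the 27-bit multiplier operand as a two's complement signed value is an assumption made for illustration.

```python
# Sketch of how the A and B inputs map onto the multiplier and the A:B bus.
A_WIDTH = 34        # full A input bus
B_WIDTH = 24        # B operand
MULT_A_WIDTH = 27   # lower A bits that feed the multiplier


def mask(value, width):
    """Keep only the low `width` bits of `value`."""
    return value & ((1 << width) - 1)


def concat_ab(a, b):
    """Form the 58-bit A:B bus: A in the upper 34 bits, B in the lower 24."""
    return (mask(a, A_WIDTH) << B_WIDTH) | mask(b, B_WIDTH)


def multiplier_a(a):
    """Return the lower 27 bits of A, read as a signed value (assumption)."""
    low = mask(a, MULT_A_WIDTH)
    if low & (1 << (MULT_A_WIDTH - 1)):
        low -= 1 << MULT_A_WIDTH
    return low


print(hex(concat_ab(0x1, 0x2)))   # 0x1000002
print(multiplier_a(0x7FFFFFF))    # -1
```

The upper 7 bits of A appear only in the concatenated bus; `multiplier_a` discards them.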

DSPFP32 Mode

  • A single-precision floating-point multiplier and adder produce both a floating-point product and a floating-point sum.
    • Multiplier:
      • Input can be either FP32 or FP16 and the output is always FP32.
    • Adder:
      • The input and output are both in FP32 only.
Note: FP32 is the single-precision floating-point format and FP16 is the half-precision floating-point format.
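
The format rules above (FP16 or FP32 multiplier inputs, FP32 results) can be mimicked in software with Python's `struct` module. This is a numerical sketch, not a model of the hardware: the function names are hypothetical, both multiplier inputs are assumed to share one format, and round-to-nearest-even is assumed because this section does not state the rounding or exception behavior.

```python
import struct


def round_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fp_mul(a, b, fmt="fp32"):
    """Multiply two FP16 or FP32 inputs; the product is always FP32."""
    quantize = round_fp16 if fmt == "fp16" else round_fp32
    return round_fp32(quantize(a) * quantize(b))


def fp_add(x, y):
    """Add two FP32 inputs and return an FP32 sum."""
    return round_fp32(round_fp32(x) + round_fp32(y))


product = fp_mul(1.5, 2.25, fmt="fp16")
print(product)               # 3.375
print(fp_add(product, 1.0))  # 4.375
```

Note that `struct.pack` raises `OverflowError` for values too large for the target format, whereas real hardware would be expected to define its own overflow behavior.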

DSPCPLX Mode

  • Two back-to-back DSP58s in the same tile can be used together to implement an 18 × 18 complex multiply-accumulate.
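
For reference, a complex multiply-accumulate computes (a + jb)(c + jd) = (ac - bd) + j(ad + bc) and adds the result to a running total. The Python sketch below shows that arithmetic with 18-bit signed real and imaginary parts. The function name `cplx_mac` is hypothetical, the accumulator width is not modeled, and how the two DSP58s divide the work is not described in this section.

```python
# Behavioral sketch of an 18 x 18 complex multiply-accumulate.
# Assumption: real and imaginary parts are signed 18-bit integers.
MIN18, MAX18 = -(1 << 17), (1 << 17) - 1


def check18(value):
    """Reject values that do not fit in a signed 18-bit operand."""
    if not MIN18 <= value <= MAX18:
        raise ValueError(f"{value} is outside the signed 18-bit range")
    return value


def cplx_mac(acc, x, y):
    """Return acc + x * y, with each complex number given as (real, imag)."""
    xr, xi = (check18(v) for v in x)
    yr, yi = (check18(v) for v in y)
    real = xr * yr - xi * yi   # real part of the product
    imag = xr * yi + xi * yr   # imaginary part of the product
    return (acc[0] + real, acc[1] + imag)


acc = cplx_mac((0, 0), (1, 2), (3, 4))
print(acc)                             # (-5, 10)
print(cplx_mac(acc, (1, 2), (3, 4)))   # (-10, 20)
```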