fpmul - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID
XD100
Release Date
2024-03-05
Version
2023.2 English

The simple floating-point multiplier comes in many different flavors mixing or not float and cfloat vector data types. When two cfloat are involved, the intrinsic results in two microcode instructions that must be scheduled. The first buffer can be either 512 or 1024-bit long (vector<float,32>, vector<float,16>, vector<cfloat,16>, vector<cfloat,8>), the second buffer is always 256-bit long (vector<float,8>, vector<cfloat,4>). Any combination is allowed.

vector<float,8> fpmul(vector<float,32> xbuf, int xstart, unsigned int xoffs, vector<float,8> zbuf, int zstart, unsigned int zoffs)

Returns the multiplication result.

Parameter Comment
xbuf First multiplication input buffer.
xstart Starting offset for all lanes of X.
xoffs 4 bits per lane, additional lane-dependent offset for X.
zbuf Second multiplication input buffer.
zstart Starting offset for all lanes of Z. This must be a compile time constant.
zoffs 4 bits per lane, additional lane-dependent offset for Z.
for (i = 0 ; i < 8 ; i++)
  ret[i] =  xbuf[xstart + xoffs[i]] * zbuf[zstart + zoffs[i]]