fpmul_conf, fpmac_conf - 2023.2 English

Vitis Tutorials: AI Engine (XD100)

Document ID: XD100
Release Date: 2024-03-05
Version: 2023.2 English

These are the fully configurable versions of the fpmul and fpmac functions. The output can always be treated as eight float values, because the real and imaginary parts of each complex float are handled separately. For a vector<cfloat,4>, the loop iterates over the lanes in the order real0, imag0, real1, imag1, … This capability adds flexibility and makes it possible to implement operations on conjugates.
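The lane ordering can be illustrated with a small scalar sketch in standard C++. It does not use the AI Engine API: `to_lanes` and `negate_lanes` are hypothetical helpers introduced only for this illustration, and the per-lane sign mask is a simplified stand-in for the idea of negating selected lanes. It is not a model of the exact addmode/addmask behavior of the intrinsic.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>

// Hypothetical helper (not an AI Engine intrinsic): flatten four complex
// floats into eight float lanes in the order real0, imag0, real1, imag1, ...
std::array<float, 8> to_lanes(const std::array<std::complex<float>, 4>& v) {
    std::array<float, 8> lanes{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        lanes[2 * i]     = v[i].real();  // even lanes hold real parts
        lanes[2 * i + 1] = v[i].imag();  // odd lanes hold imaginary parts
    }
    return lanes;
}

// Hypothetical helper: negate every lane whose bit is set in an 8-bit mask.
// Assumption for this sketch: bit i of the mask corresponds to lane i.
std::array<float, 8> negate_lanes(std::array<float, 8> lanes, unsigned mask) {
    assert(mask <= 0xFFu);
    for (std::size_t i = 0; i < lanes.size(); ++i) {
        if ((mask >> i) & 1u) {
            lanes[i] = -lanes[i];
        }
    }
    return lanes;
}
```

In this sketch, a mask of 0xAA (bits 1, 3, 5, and 7) flips only the imaginary lanes, which yields the complex conjugate of all four values. This shows why treating the real and imaginary parts as separate lanes makes conjugate operations straightforward.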

vector<float,8> fpmac_conf(vector<float,8> acc, vector<float,32> xbuf, int xstart, unsigned int xoffs, vector<float,8> zbuf, int zstart, unsigned int zoffs, bool ones, bool abs, unsigned int addmode, unsigned int addmask, unsigned int cmpmode, unsigned int & cmp)

Returns the result of the multiplication (fpmul_conf) or of the multiply-accumulate (fpmac_conf).

Parameter  Comment
acc        Current accumulator value. This parameter does not exist for fpmul_conf.
xbuf       First multiplication input buffer.
xstart     Starting offset for all lanes of X.
xoffs      4 bits per lane: additional lane-dependent offset for X.
zbuf       Optional. Second multiplication input buffer. If zbuf is not specified, xbuf is taken as the second buffer.
zstart     Starting offset for all lanes of Z. This must be a compile-time constant.
zoffs      4 bits per lane: additional lane-dependent offset for Z.
ones       If true, all lanes from Z are replaced with 1.0.
abs        If true, the absolute value is taken before accumulation.
addmode    Selects one of fpadd_add (all add), fpadd_sub (all sub), fpadd_mixadd, or fpadd_mixsub (add-sub or sub-add pairs). This must be a compile-time constant.
addmask    8 x 1 LSB bits: the corresponding lane is negated if the bit is set (depending on addmode).
cmpmode    Use fpcmp_lt to select the minimum of the accumulator and the multiplication result per lane, fpcmp_ge to select the maximum, and fpcmp_nrm for the usual sum.
cmp        Optional. 8 x 1 LSB bits: when cmpmode is fpcmp_ge or fpcmp_lt, a bit is set for each lane in which the accumulator was chosen.
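The interaction of the start, offset, ones, and cmpmode parameters can be sketched as a scalar reference model in standard C++. This is not the AI Engine implementation. `fpmac_model`, `lane_offset`, `MacResult`, and `CmpMode` are hypothetical names, and several details are assumptions made for illustration: lane 0 uses the least significant 4 bits of the offset word, the selected index is the start plus the lane offset wrapped to the buffer size, and ties between the accumulator and the product count as "accumulator chosen" only in the maximum mode. The abs, addmode, and addmask parameters are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical modes for this model only; the real fpcmp_nrm, fpcmp_lt and
// fpcmp_ge constants are provided by the AI Engine headers.
enum class CmpMode { Nrm, Lt, Ge };

// Extract the 4-bit offset of one lane from a packed 32-bit word.
// Assumption: lane 0 occupies the least significant nibble.
unsigned lane_offset(unsigned offs, std::size_t lane) {
    assert(lane < 8);
    return (offs >> (4 * lane)) & 0xFu;
}

struct MacResult {
    std::array<float, 8> acc;  // per-lane result
    unsigned cmp;              // bit i set if the accumulator was chosen in lane i
};

MacResult fpmac_model(const std::array<float, 8>& acc,
                      const std::array<float, 32>& xbuf, int xstart, unsigned xoffs,
                      const std::array<float, 8>& zbuf, int zstart, unsigned zoffs,
                      bool ones, CmpMode mode) {
    assert(xstart >= 0 && zstart >= 0);
    MacResult r{acc, 0u};
    for (std::size_t i = 0; i < 8; ++i) {
        // Assumption: index = start + per-lane offset, wrapped to the buffer size.
        std::size_t xi = (static_cast<unsigned>(xstart) + lane_offset(xoffs, i)) % 32;
        std::size_t zi = (static_cast<unsigned>(zstart) + lane_offset(zoffs, i)) % 8;
        float z = ones ? 1.0f : zbuf[zi];  // "ones" replaces every Z lane with 1.0
        float prod = xbuf[xi] * z;
        switch (mode) {
        case CmpMode::Nrm:  // usual sum
            r.acc[i] = acc[i] + prod;
            break;
        case CmpMode::Lt:   // keep the minimum of accumulator and product
            if (acc[i] < prod) r.cmp |= 1u << i; else r.acc[i] = prod;
            break;
        case CmpMode::Ge:   // keep the maximum of accumulator and product
            if (acc[i] >= prod) r.cmp |= 1u << i; else r.acc[i] = prod;
            break;
        }
    }
    return r;
}
```

Under these assumptions, calling the model with xstart = 4 and xoffs = 0x76543210 makes lane i read xbuf[4 + i], while zoffs = 0 broadcasts zbuf[zstart] to all lanes. Consult the intrinsics documentation for the exact index computation and tie-breaking rules before relying on this behavior in a kernel.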