DPUCZDX8G for Zynq UltraScale+ MPSoCs Product Guide (PG338)

This option enables a hardware implementation of the softmax operator. The hardware softmax accelerator is packaged inside the DPU IP wrapper, but it is a separate accelerator with its own interface and runtime. It accepts int8 input data and produces floating-point output data. The hardware implementation of softmax can be up to 160 times faster than a software implementation on MPSoC devices. Users can enable this option if their networks or models include a softmax layer and they want to improve throughput.
Note: The hardware softmax supports up to 1023 classes. If the number of classes is greater than 1023, consider a software implementation of softmax instead. For more information, refer to the Vitis AI Library User Guide (UG1354).
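
The class limit and the int8-in, floating-point-out data formats can be illustrated with a minimal software sketch. This is not the Vitis AI Library API or the accelerator's actual arithmetic: the function names, the use of NumPy, and the `scale` dequantization factor are assumptions made for illustration only. The sketch shows a reference softmax that could serve as the software fallback, plus a check for whether a layer fits within the 1023-class hardware limit.

```python
import numpy as np

# Limit stated in the note above.
HW_SOFTMAX_MAX_CLASSES = 1023


def softmax_int8_reference(x_int8, scale):
    """Software softmax with int8 input and float32 output.

    `scale` is a hypothetical dequantization factor
    (real value = int8 value * scale); it is an assumption,
    not a parameter documented for the hardware module.
    """
    x = np.asarray(x_int8, dtype=np.int8).astype(np.float32) * np.float32(scale)
    # Subtract the per-row maximum so exp() cannot overflow.
    x -= x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def can_use_hw_softmax(num_classes):
    """Return True if the class count is within the hardware limit."""
    return 1 <= num_classes <= HW_SOFTMAX_MAX_CLASSES
```

A caller could test `can_use_hw_softmax(num_classes)` once per model and route layers that exceed the limit to a software path such as `softmax_int8_reference`.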

When hardware softmax is enabled, an AXI master interface named SFM_M_AXI and an interrupt port named sfm_interrupt are added to the DPU IP wrapper. The softmax module uses m_axi_dpu_aclk as the source clock for SFM_M_AXI as well as for computation.

The additional resources utilized when hardware softmax acceleration is enabled are listed in the following table.

Table 1. Extra Resources with Softmax

IP Name    Extra LUTs    Extra FFs    Extra BRAMs    Extra DSPs
Softmax    9580          8019         4              14
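
As a rough planning aid, the overhead in Table 1 can be compared against the resources still free in a design. The sketch below is only an illustration: the function names and the `free` dictionary are hypothetical, and the free-resource counts must come from your own utilization report.

```python
# Extra resources for hardware softmax, taken from Table 1.
SOFTMAX_EXTRA = {"LUT": 9580, "FF": 8019, "BRAM": 4, "DSP": 14}


def softmax_fits(free):
    """Return True if every free resource count covers the softmax overhead."""
    return all(free.get(name, 0) >= need for name, need in SOFTMAX_EXTRA.items())


def remaining_after_softmax(free):
    """Return the free resource counts left after adding hardware softmax."""
    return {name: free.get(name, 0) - need for name, need in SOFTMAX_EXTRA.items()}
```

For example, with a hypothetical budget of `{"LUT": 10000, "FF": 10000, "BRAM": 4, "DSP": 14}`, `softmax_fits` returns `True` and `remaining_after_softmax` reports 420 LUTs, 1981 FFs, and no BRAMs or DSPs left.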