The following figure shows the Advanced tab of the DPU configuration.
Figure 1. DPU Configuration – Advanced Tab
- S-AXI Clock Mode
- s_axi_aclk is the S-AXI interface clock. When Common with M-AXI Clock is selected, s_axi_aclk shares the same clock as m_axi_aclk and the s_axi_aclk port is hidden. When Independent is selected, a clock different from m_axi_aclk must be provided.
- dpu_2x Clock Gating
- dpu_2x clock gating is an option for reducing the power consumption of the DPU. When the option is enabled, a port named dpu_2x_clk_ce appears for each DPU core. The dpu_2x_clk_ce port should be connected to the clk_dsp_ce port in the dpu_clk_wiz IP. The dpu_2x_clk_ce signal can shut down dpu_2x_clk when the computing engine in the DPU is idle. To generate the clk_dsp_ce port in the dpu_clk_wiz IP, the clocking wizard IP should be configured with specific options. For more information, see the Reference Clock Generation section. Note that dpu_2x clock gating is not supported in Zynq®-7000 devices.
- DSP Cascade
- The maximum length of the DSP48E slice cascade chain can be set. Longer cascade lengths typically use fewer logic resources but might have worse timing. Shorter cascade lengths might not be suitable for small devices because they require more hardware resources. Xilinx recommends selecting the mid-value, which is four, in the first iteration and adjusting the value if timing is not met.
- DSP Usage
- This option selects whether DSP48E slices are used for accumulation in the DPU convolution module. When low DSP usage is selected, the DPU IP uses DSP slices only for multiplication in the convolution. In high DSP usage mode, the DSP slices are used for both multiplication and accumulation. Thus, high DSP usage consumes more DSP slices and fewer LUTs. The logic utilization for high and low DSP usage is shown in the following table. The data is based on the DPU in the Xilinx ZCU102 platform without the Depthwise Convolution, Average Pooling, Channel Augmentation, and Leaky ReLU features.
Note: DSP Cascade is not supported in Zynq-7000 devices and it is locked to 1.
Table 1. Resources for Different DSP Usage

| Arch | High DSP Usage: LUT | Register | BRAM | DSP | Low DSP Usage: LUT | Register | BRAM | DSP |
|---|---|---|---|---|---|---|---|---|
| B512 | 20055 | 28849 | 69.5 | 98 | 21171 | 33572 | 69.5 | 66 |
| B800 | 21490 | 34561 | 87 | 142 | 22900 | 33752 | 87 | 102 |
| B1024 | 24349 | 46241 | 101.5 | 194 | 26341 | 49823 | 101.5 | 130 |
| B1152 | 23527 | 46906 | 117.5 | 194 | 25250 | 49588 | 117.5 | 146 |
| B1600 | 26728 | 56267 | 123 | 282 | 29270 | 60739 | 123 | 202 |
| B2304 | 39562 | 67481 | 161.5 | 386 | 32684 | 72850 | 161.5 | 290 |
| B3136 | 32190 | 79867 | 203.5 | 506 | 35797 | 86132 | 203.5 | 394 |
| B4096 | 37266 | 92630 | 249.5 | 642 | 41412 | 99791 | 249.5 | 514 |

- UltraRAM
- There are two kinds of on-chip memory resources in Zynq® UltraScale+™ devices: block RAM and UltraRAM. The available amount of each memory type is device-dependent. Each block RAM block consists of two block RAM 18K slices, which can be configured as 9b*4096, 18b*2048, or 36b*1024. UltraRAM has a fixed configuration of 72b*4096. A memory unit in the DPU has a width of ICP*8 bits and a depth of 2048. For the B1024 architecture, the ICP is 8, so the width of a memory unit is 8*8 = 64 bits. This fits within the 72-bit UltraRAM width, so each memory unit can be instantiated with one UltraRAM block. When the ICP is greater than 8, each memory unit in the DPU needs at least two UltraRAM blocks.
The DPU uses block RAM as the memory unit by default. For a target device with both block RAM and UltraRAM, set the number of UltraRAMs to determine how many UltraRAM blocks replace block RAMs. The number of UltraRAMs should be a multiple of the number of UltraRAM blocks required for one memory unit in the DPU. An example of block RAM and UltraRAM utilization is shown in the Summary tab section.
- Timestamp
- When enabled, the DPU records the time that the DPU project was synthesized. When disabled, the timestamp keeps the value at the moment of the last IP update. The timestamp information can be obtained using the Vitis™ AI tools.
Note: Most of the DPU configuration settings can be accessed by the Vitis AI tools. The following figure shows the information read by the Vitis AI tools.
Figure 2. Timestamp Example
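
The UltraRAM sizing rule described under the UltraRAM option can be sketched in a few lines of Python. This is only an illustrative calculation, not part of the DPU IP or the Vitis AI tools: the function names are hypothetical, and the sketch assumes that a memory unit is tiled across UltraRAM blocks by width only (the 2048-deep memory unit fits within the 4096-deep UltraRAM). The ICP value of 16 used in the comments is an example of an ICP greater than 8, not a value taken from this section.

```python
import math

# Fixed UltraRAM configuration stated in this section: 72b*4096.
URAM_WIDTH_BITS = 72


def urams_per_memory_unit(icp: int) -> int:
    """Estimate the UltraRAM blocks needed for one DPU memory unit.

    A memory unit is ICP*8 bits wide. Assumption: blocks are combined
    side by side to cover the width, and depth never needs a second block.
    """
    width_bits = icp * 8
    return math.ceil(width_bits / URAM_WIDTH_BITS)


def is_valid_uram_count(count: int, icp: int) -> bool:
    """Check that an UltraRAM count is a whole number of memory units."""
    return count >= 0 and count % urams_per_memory_unit(icp) == 0


if __name__ == "__main__":
    # ICP = 8 (B1024): 64 bits wide, one UltraRAM block per memory unit.
    print(urams_per_memory_unit(8))
    # ICP = 16 (illustrative): 128 bits wide, two blocks per memory unit.
    print(urams_per_memory_unit(16))
    # With two blocks per unit, an odd UltraRAM count would be rejected.
    print(is_valid_uram_count(3, 16))
```

Under these assumptions, an UltraRAM count entered in the configuration would be checked against the per-unit block count before synthesis, which mirrors the "multiple of the number required for a memory unit" rule above.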