Advanced Tab - 4.1 English

DPUCZDX8G for Zynq UltraScale+ MPSoCs Product Guide (PG338)

Document ID: PG338
Release Date: 2023-01-23
Version: 4.1 English

The following figure shows the Advanced tab of the DPUCZDX8G configuration.

Figure 1. DPUCZDX8G Configuration – Advanced Tab
S-AXI Clock Mode
s_axi_aclk is the S-AXI interface clock. When Common with M-AXI Clock is selected, s_axi_aclk shares the same clock as m_axi_aclk and the s_axi_aclk port is hidden. When Independent is selected, a clock different from m_axi_aclk must be provided.
dpu_2x Clock Gating
dpu_2x clock gating is an option for reducing the power consumption of the DPUCZDX8G. When the option is enabled, a port named dpu_2x_clk_ce appears for each DPUCZDX8G core. The dpu_2x_clk_ce port should be connected to the clk_dsp_ce port in the dpu_clk_wiz IP. The dpu_2x_clk_ce signal can shut down the dpu_2x_clk when the computing engine in the DPUCZDX8G is idle. To generate the clk_dsp_ce port in the dpu_clk_wiz IP, the clocking wizard IP should be configured with specific options. For more information, see the Reference Clock Generation section.
DSP Cascade
This option sets the maximum length of the DSP48E slice cascade chain. Longer cascade lengths typically use fewer logic resources but might have worse timing. Shorter cascade lengths might not be suitable for small devices because they require more hardware resources. Xilinx recommends selecting the mid-value, which is four, in the first iteration and adjusting the value if timing is not met.
DSP Usage
This option selects whether DSP48E slices are used for accumulation in the DPUCZDX8G convolution module. When low DSP usage is selected, the DPUCZDX8G IP uses DSP slices only for multiplication in the convolution module. In high DSP usage mode, DSP slices are used for both multiplication and accumulation. Thus, high DSP usage consumes more DSP slices and fewer LUTs. The change in resource utilization when Low DSP usage is selected instead of High DSP usage is shown in the following table:
Table 1. Extra Resource Utilization of Low DSP Compared to High DSP Usage
DPUCZDX8G Architecture   Extra LUTs   Extra Registers   Extra DSPs (1)
B512                     1418         1903              -32
B800                     1445         2550              -40
B1024                    1978         3457              -64
B1152                    1661         2525              -48
B1600                    2515         4652              -80
B2304                    3069         4762              -96
B3136                    3520         6219              -112
B4096                    3900         7359              -128
  1. Negative numbers imply a relative decrease.
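As a rough illustration of how the deltas in Table 1 are applied, the following Python sketch adds them to the utilization of the configuration the table is measured against. The function name and the baseline figures in the usage line are hypothetical and are not taken from this guide; only the delta values come from Table 1.

```python
# Deltas copied from Table 1: (extra LUTs, extra registers, extra DSPs).
TABLE_1_DELTAS = {
    "B512": (1418, 1903, -32),
    "B800": (1445, 2550, -40),
    "B1024": (1978, 3457, -64),
    "B1152": (1661, 2525, -48),
    "B1600": (2515, 4652, -80),
    "B2304": (3069, 4762, -96),
    "B3136": (3520, 6219, -112),
    "B4096": (3900, 7359, -128),
}


def apply_delta(arch, luts, registers, dsps):
    """Return (LUTs, registers, DSPs) after adding the Table 1 deltas for arch.

    Hypothetical helper for illustration; a negative delta is a decrease.
    """
    d_lut, d_reg, d_dsp = TABLE_1_DELTAS[arch]
    return luts + d_lut, registers + d_reg, dsps + d_dsp


# Baseline numbers below are made up for illustration only.
print(apply_delta("B1024", 30000, 45000, 200))  # (31978, 48457, 136)
```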
UltraRAM
There are two kinds of on-chip memory resources in Zynq® UltraScale+™ devices: block RAM and UltraRAM. The available amount of each memory type is device-dependent. Each block RAM consists of two 18K slices, which can be configured as 9b*4096, 18b*2048, or 36b*1024. UltraRAM has a fixed configuration of 72b*4096. A memory unit in the DPUCZDX8G has a width of ICP*8 bits and a depth of 2048. For the B1024 architecture, the ICP is 8, so the width of a memory unit is 8*8 = 64 bits, which fits within the 72-bit width of a single UltraRAM block. When the ICP is greater than eight, the memory unit is wider than 72 bits, so each memory unit in the DPUCZDX8G needs at least two UltraRAM blocks.
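The width arithmetic above can be sketched in a few lines of Python. This is a minimal estimate that assumes a memory unit is packed by width only (its depth of 2048 fits within the UltraRAM depth of 4096). The function names are hypothetical, and the ICP values 12 and 16 are used only as example inputs greater than eight.

```python
URAM_WIDTH_BITS = 72    # fixed UltraRAM width stated in the text
URAM_DEPTH = 4096       # fixed UltraRAM depth stated in the text
MEM_UNIT_DEPTH = 2048   # depth of one DPUCZDX8G memory unit


def memory_unit_width_bits(icp):
    """Width of one memory unit: ICP * 8 bits."""
    return icp * 8


def urams_per_memory_unit(icp):
    """Lower-bound estimate of UltraRAM blocks needed for one memory unit.

    Assumption: only width matters, because the unit depth fits in one block.
    """
    assert MEM_UNIT_DEPTH <= URAM_DEPTH
    width = memory_unit_width_bits(icp)
    # Integer ceiling division of the unit width by the UltraRAM width.
    return -(-width // URAM_WIDTH_BITS)


for icp in (8, 12, 16):
    print(icp, memory_unit_width_bits(icp), urams_per_memory_unit(icp))
# 8 64 1
# 12 96 2
# 16 128 2
```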

The DPUCZDX8G uses block RAM for its memory units by default. For a target device with both block RAM and UltraRAM, set the number of UltraRAMs to determine how many UltraRAM blocks are used in place of block RAMs. Set this number to a multiple of the number of UltraRAM blocks required for one memory unit in the DPUCZDX8G. An example of block RAM and UltraRAM usage is shown in the Summary tab section.
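The multiple-of rule above can be checked before entering a value in the configuration. The helper below is a hypothetical sketch, not part of the DPUCZDX8G IP or any Xilinx tool. It assumes you already know how many UltraRAM blocks one memory unit needs for your architecture.

```python
def memory_units_in_uram(uram_count, urams_per_unit):
    """Return how many memory units the UltraRAM setting would cover.

    Raises ValueError if uram_count is not a multiple of urams_per_unit.
    Both argument names are illustrative assumptions.
    """
    if urams_per_unit <= 0:
        raise ValueError("urams_per_unit must be positive")
    if uram_count < 0:
        raise ValueError("uram_count must not be negative")
    if uram_count % urams_per_unit != 0:
        raise ValueError(
            f"{uram_count} is not a multiple of {urams_per_unit}"
        )
    return uram_count // urams_per_unit


print(memory_units_in_uram(6, 2))  # 3
```

A setting of 5 with two UltraRAM blocks per memory unit would raise `ValueError` in this sketch, because one block would be left without a partner.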

Timestamp
When enabled, the DPUCZDX8G records the time at which the DPUCZDX8G project was synthesized. When disabled, the timestamp retains the value from the last IP update.