Configuration Options - 1.3 English

DPUCVDX8G for Versal ACAPs Product Guide (PG389)

Document ID: PG389
Release Date: 2023-01-23
Version: 1.3 English

The DPU can be configured with several predefined options, including the DPUCVDX8G architecture, the batch number, the number of compute units, and UltraRAM usage. These options let you control DSP slice and LUT utilization as well as block RAM and UltraRAM usage.

CPB_N

The CPB_N parameter sets the number of AI Engines used by each batch handler and determines the peak performance of the DPUCVDX8G. CPB_N can have a value of 32 or 64. For example, when CPB_N is 32, each batch handler in the DPUCVDX8G uses 32 AI Engines.

BATCH_N

The BATCH_N parameter determines the number of batch handlers integrated in the DPUCVDX8G IP. It supports values from 1 to 6 when CPB_N is 32 (C32) and from 1 to 5 when CPB_N is 64 (C64). More batch handlers give higher performance, but they use more AI Engines and PL resources and require more DDR memory I/O bandwidth. You can balance performance, DDR memory I/O, and resource usage according to your application.

CU_N

The CU_N parameter determines the number of compute units. It supports values from 1 to 3, and only for the C32B1 and C64B1 architectures.
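The ranges given for CPB_N, BATCH_N, and CU_N can be collected into a small consistency check. The following is a minimal sketch, not part of any DPUCVDX8G tool flow: the function names are hypothetical, and it assumes that "C32B1 and C64B1 only" means a CU_N greater than 1 requires BATCH_N to be 1. The AI Engine count covers only the engines attributed to batch handlers in the text above.

```python
# Hypothetical helper names; only the numeric limits come from the text.
BATCH_N_MAX = {32: 6, 64: 5}  # CPB_N -> largest supported BATCH_N
CU_N_MAX = 3


def check_dpu_config(cpb_n, batch_n, cu_n=1):
    """Return a list of violated constraints (empty list means consistent)."""
    errors = []
    if cpb_n not in BATCH_N_MAX:
        errors.append(f"CPB_N must be 32 or 64, got {cpb_n}")
    elif not 1 <= batch_n <= BATCH_N_MAX[cpb_n]:
        errors.append(
            f"BATCH_N must be 1..{BATCH_N_MAX[cpb_n]} for C{cpb_n}, got {batch_n}"
        )
    if not 1 <= cu_n <= CU_N_MAX:
        errors.append(f"CU_N must be 1..{CU_N_MAX}, got {cu_n}")
    elif cu_n > 1 and batch_n != 1:
        # Assumption: multiple compute units require a B1 architecture.
        errors.append("CU_N > 1 is only supported with BATCH_N = 1")
    return errors


def batch_handler_ai_engines(cpb_n, batch_n):
    """AI Engines used by the batch handlers: CPB_N per handler."""
    return cpb_n * batch_n
```

For example, `check_dpu_config(64, 6)` reports a BATCH_N violation, while `check_dpu_config(32, 1, cu_n=3)` returns an empty list.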

UBANK_IMG_N

There are two kinds of on-chip memory resources in Versal devices: block RAM and UltraRAM. Each block RAM has a capacity of 36 Kb and each UltraRAM has a capacity of 288 Kb. The number of available RAMs is device-dependent.

There are 16 IMG BANKs (128 KB per bank) in each DPUCVDX8G batch handler. Each IMG BANK can be composed of block RAM or UltraRAM. The parameter UBANK_IMG_N determines how many IMG BANKs are composed of UltraRAM. The remaining banks will be composed of block RAM. This parameter is designed to flexibly use the on-chip memory resources.

UBANK_WGT_N

There are 17 WGT BANKs (256 KB per bank) in the DPUCVDX8G irrespective of the number of batch handlers. Each WGT BANK can be composed of block RAM or UltraRAM. The parameter UBANK_WGT_N determines how many WGT BANKs are composed of UltraRAM. The remaining banks will be composed of block RAM. This parameter is designed to flexibly use the on-chip memory resources.

UBANK_BIAS

There are two BIAS BANKs (32 KB per bank) in the DPUCVDX8G irrespective of the number of batch handlers. Each BIAS BANK can be composed of block RAM or UltraRAM. The parameter UBANK_BIAS determines whether the BIAS BANKs are composed of UltraRAM or block RAM. This parameter is designed to flexibly use the on-chip memory resources.
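The bank counts and sizes above are enough to estimate how much buffer capacity a given UBANK_* setting places in each RAM type. The sketch below is illustrative only: the function name is hypothetical, it assumes UBANK_IMG_N applies identically to every batch handler, and it treats UBANK_BIAS as a yes/no switch for both BIAS BANKs. It reports logical capacity in KB, not the number of RAM primitives, because the mapping of banks to primitives is not described here.

```python
# Bank geometry taken from the text; everything else is an assumption.
IMG_BANKS_PER_BATCH, IMG_BANK_KB = 16, 128  # per batch handler
WGT_BANKS, WGT_BANK_KB = 17, 256            # shared by all batch handlers
BIAS_BANKS, BIAS_BANK_KB = 2, 32            # shared by all batch handlers


def onchip_buffer_kb(batch_n, ubank_img_n, ubank_wgt_n, ubank_bias):
    """Split the logical buffer capacity (KB) between UltraRAM and block RAM."""
    if not 0 <= ubank_img_n <= IMG_BANKS_PER_BATCH:
        raise ValueError("UBANK_IMG_N must be between 0 and 16")
    if not 0 <= ubank_wgt_n <= WGT_BANKS:
        raise ValueError("UBANK_WGT_N must be between 0 and 17")

    # IMG banks are replicated per batch handler.
    img_uram = ubank_img_n * IMG_BANK_KB * batch_n
    img_bram = (IMG_BANKS_PER_BATCH - ubank_img_n) * IMG_BANK_KB * batch_n

    # WGT and BIAS banks exist once, regardless of BATCH_N.
    wgt_uram = ubank_wgt_n * WGT_BANK_KB
    wgt_bram = (WGT_BANKS - ubank_wgt_n) * WGT_BANK_KB
    bias_kb = BIAS_BANKS * BIAS_BANK_KB
    bias_uram = bias_kb if ubank_bias else 0
    bias_bram = 0 if ubank_bias else bias_kb

    return {
        "uram_kb": img_uram + wgt_uram + bias_uram,
        "bram_kb": img_bram + wgt_bram + bias_bram,
    }
```

With one batch handler and every bank in UltraRAM, `onchip_buffer_kb(1, 16, 17, True)` gives 6464 KB in UltraRAM and 0 KB in block RAM.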

Table 1. Comparison of RAM Utilization between Max-URAM and Max-BRAM on VC1902

Architecture    Max. URAM                 Max. BRAM
CxxB1CU1        URAM: 204, BRAM: 0        URAM: 76, BRAM: 960
CxxB2CU1        URAM: 268, BRAM: 0        URAM: 140, BRAM: 960
CxxB3CU1        URAM: 332, BRAM: 0        URAM: 204, BRAM: 960
CxxB4CU1        URAM: 396, BRAM: 0        URAM: 268, BRAM: 960
CxxB5CU1        URAM: 460, BRAM: 0        URAM: 332, BRAM: 960
CxxB6CU1        URAM: 411, BRAM: 644      URAM: N/A, BRAM: N/A
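The figures in Table 1 follow a regular pattern that is easy to check. The sketch below copies the B1 to B5 rows of the table into a dictionary (the B6 row is left out because its Max-BRAM entries are N/A) and computes two differences: the UltraRAM added by each extra batch handler, and the UltraRAM freed by choosing Max-BRAM over Max-URAM. The dictionary layout and function names are hypothetical, and the result is only an observation about these VC1902 numbers, not a general sizing rule.

```python
# (URAM, BRAM) counts copied from Table 1, keyed by BATCH_N.
TABLE_1 = {
    1: {"max_uram": (204, 0), "max_bram": (76, 960)},
    2: {"max_uram": (268, 0), "max_bram": (140, 960)},
    3: {"max_uram": (332, 0), "max_bram": (204, 960)},
    4: {"max_uram": (396, 0), "max_bram": (268, 960)},
    5: {"max_uram": (460, 0), "max_bram": (332, 960)},
}


def uram_steps(column):
    """UltraRAM increase between consecutive BATCH_N values in one column."""
    counts = [TABLE_1[b][column][0] for b in sorted(TABLE_1)]
    return [nxt - cur for cur, nxt in zip(counts, counts[1:])]


def uram_freed_by_max_bram(batch_n):
    """UltraRAMs saved by the Max-BRAM configuration for a given BATCH_N."""
    row = TABLE_1[batch_n]
    return row["max_uram"][0] - row["max_bram"][0]


print(uram_steps("max_uram"))
print(uram_steps("max_bram"))
print([uram_freed_by_max_bram(b) for b in sorted(TABLE_1)])
```

In these rows each additional batch handler adds 64 UltraRAMs in both configurations, and Max-BRAM replaces 128 UltraRAMs with 960 block RAMs at every batch size.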

LOAD_PARALLEL_IMG

The LOAD_PARALLEL_IMG parameter sets the level of parallelism for loading images in each DPUCVDX8G batch handler. Each parallel load channel uses one AXI4 interface for data transmission, so the number of M_IMG_AXI ports of the DPUCVDX8G depends on LOAD_PARALLEL_IMG. In this release, the supported value for this parameter is two. Higher parallelism gives higher throughput when loading an image, but it requires more bandwidth and more PL resources.

SAVE_PARALLEL_IMG

The SAVE_PARALLEL_IMG parameter sets the level of parallelism for saving images in each DPUCVDX8G batch handler. Each parallel save channel uses one AXI4 interface for data transmission. The save module uses the write channel of the AXI4 interface and the load module uses the read channel of the AXI4 interface.

In this release, the supported value for this parameter is two. Higher parallelism gives higher throughput when saving an image, but it requires more bandwidth and more PL resources.

Note: SAVE_PARALLEL_IMG cannot be set larger than LOAD_PARALLEL_IMG.
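The two parallelism parameters and the note above amount to three simple rules, which can be checked as follows. This is a minimal sketch with hypothetical names. The set of supported values reflects only the release described here and would need updating if other values become available.

```python
# Supported values for this release, as stated in the text.
SUPPORTED_PARALLELISM = {2}


def check_img_parallelism(load_parallel_img, save_parallel_img):
    """Return a list of violated rules (empty list means consistent)."""
    errors = []
    if load_parallel_img not in SUPPORTED_PARALLELISM:
        errors.append(f"LOAD_PARALLEL_IMG={load_parallel_img} is not supported")
    if save_parallel_img not in SUPPORTED_PARALLELISM:
        errors.append(f"SAVE_PARALLEL_IMG={save_parallel_img} is not supported")
    if save_parallel_img > load_parallel_img:
        errors.append("SAVE_PARALLEL_IMG cannot be larger than LOAD_PARALLEL_IMG")
    return errors
```

In this release only `check_img_parallelism(2, 2)` returns an empty list. The ordering rule is kept as a separate check so that it still applies if more values are supported later.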