DPU-V1 (previously known as xDNN) IP cores are high-performance, general-purpose CNN processing engines (PEs).
Figure 1. DPU-V1 Architecture
The key features of this engine are:
- 96×16 DSP Systolic Array operating at 700 MHz
- Instruction-based programming model, providing the simplicity and flexibility to represent a variety of custom neural network graphs
- 9 MB on-chip Tensor Memory composed of UltraRAM
- Distributed on-chip filter cache
- Utilizes external DDR memory for storing filter and tensor data
- Pipelined Scale, ReLU, and Pooling Blocks for maximum efficiency
- Standalone Pooling/Eltwise execution block for parallel processing with Convolution layers
- Hardware-assisted Tiling Engine that sub-divides tensors to fit in on-chip Tensor Memory, with pipelined instruction scheduling
- Standard AXI-MM and AXI4-Lite top-level interfaces for simplified system-level integration
- Optional pipelined RGB tensor Convolution engine for efficiency boost
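To make the tiling feature concrete, the sketch below shows one simple way a compiler-side pass might partition a feature map into row bands that each fit within the 9 MB on-chip Tensor Memory budget. This is an illustrative approximation only; the function name, the row-band strategy, and all parameters are assumptions, not the actual DPU-V1 tiling algorithm.

```python
def plan_tiles(height, width, channels, bytes_per_elem=1,
               budget_bytes=9 * 1024 * 1024):
    """Split a (height x width x channels) feature map into row bands
    that each fit within an on-chip memory budget.

    Hypothetical sketch: the real Tiling Engine is hardware-assisted
    and its policy is not described here.
    """
    row_bytes = width * channels * bytes_per_elem
    if row_bytes > budget_bytes:
        raise ValueError("a single row exceeds the on-chip budget")
    rows_per_tile = budget_bytes // row_bytes
    tiles = []
    start = 0
    while start < height:
        end = min(start + rows_per_tile, height)
        tiles.append((start, end))  # half-open row range [start, end)
        start = end
    return tiles


# Example: a 224x224x256 INT8 feature map at 512-wide rows needs
# several bands to stay under 9 MB.
tiles = plan_tiles(height=224, width=512, channels=256)
```

In practice a real tiler must also account for convolution halo rows (overlap between tiles so border outputs are computed correctly), which this sketch omits for brevity.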