Core Overview - 1.0 English

The Xilinx® DPUCVDX8H DPU is a programmable engine optimized for convolutional neural networks, aimed mainly at high-performance applications. It can be used for deep learning inference tasks such as image classification, object detection, and semantic segmentation. It is designed specifically for the Versal® platform, whose architecture is shown in the following figure. The Versal ACAP integrates more powerful Arm® cores and, more importantly, a new programmable computing array called the AI Engine. The AI Engine can be used as a large DSP array to accelerate general high-density computing tasks, such as 5G applications. It is also optimized for machine learning: compared with programmable logic (PL), most computing tasks achieve much better performance on the AI Engine. The DPUCVDX8H places most of the computation on the AI Engine and keeps only the control logic and a small amount of computation in the PL.

Figure 1. Versal SoC FPGA

The DPU includes a high-performance scheduler module, a hybrid computing array module, an instruction fetch unit module, a control and memory access module, and a memory pool module. It uses a specialized instruction set that supports efficient mapping of many convolutional neural networks. Examples of networks that have been deployed include VGG, ResNet, GoogLeNet, YOLO, SSD, and FPN.

The convolutional computing unit of the DPU IP is implemented on the AI Engine; the control and memory access unit and the memory pool are implemented in the programmable logic. The DPU IP connects to the NoC through a standard AXI interface to access DRAM and receive external control commands. You can operate the DPU with the Xilinx shell or with self-defined logic, which covers configuration, network instruction injection, interrupt handling, and data movement. To simplify development, Xilinx provides a platform, a shell, and related tools that help you integrate the DPU into your design.
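The four host-side responsibilities named above (configuration, network instruction injection, data movement, and interrupt handling) can be illustrated with a small software mock. This is only a sketch: `MockDpu`, its methods, and the order of the calls in `run_once` are hypothetical names and assumptions made for illustration. They do not correspond to the Xilinx shell, to any register map, or to any Xilinx API, and the mock "computes" by simply copying its input buffer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MockDpu:
    """Hypothetical software stand-in for a DPU. Not a Xilinx API."""

    configured: bool = False
    instructions: List[bytes] = field(default_factory=list)
    dram: Dict[str, bytes] = field(default_factory=dict)
    interrupt_pending: bool = False
    log: List[str] = field(default_factory=list)

    def configure(self) -> None:
        # Configuration step (assumed to come first).
        self.configured = True
        self.log.append("configure")

    def inject_instructions(self, program: List[bytes]) -> None:
        # Network instruction injection.
        if not self.configured:
            raise RuntimeError("configure the DPU before loading instructions")
        self.instructions = list(program)
        self.log.append("inject_instructions")

    def write_dram(self, name: str, data: bytes) -> None:
        # Data movement: host to DRAM.
        self.dram[name] = data
        self.log.append(f"write_dram:{name}")

    def start(self) -> None:
        if not self.instructions:
            raise RuntimeError("no instructions loaded")
        # The mock copies input to output, then signals completion.
        self.dram["output"] = self.dram.get("input", b"")
        self.interrupt_pending = True
        self.log.append("start")

    def handle_interrupt(self) -> bool:
        # Interrupt handling: acknowledge and clear the pending flag.
        was_pending = self.interrupt_pending
        self.interrupt_pending = False
        if was_pending:
            self.log.append("handle_interrupt")
        return was_pending

    def read_dram(self, name: str) -> bytes:
        # Data movement: DRAM to host.
        self.log.append(f"read_dram:{name}")
        return self.dram[name]


def run_once(dpu: MockDpu, program: List[bytes], input_data: bytes) -> bytes:
    """Run one inference on the mock, in an assumed order of operations."""
    dpu.configure()
    dpu.inject_instructions(program)
    dpu.write_dram("input", input_data)
    dpu.start()
    if not dpu.handle_interrupt():
        raise RuntimeError("DPU did not signal completion")
    return dpu.read_dram("output")
```

Calling `run_once(MockDpu(), [b"\x00"], b"abc")` walks through each responsibility once and records the sequence in `log`. In a real design, these steps would be carried out by the shell or by self-defined logic over the AXI interface.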

The following image shows the top-level block diagram of the DPU.

Figure 2. DPU Top-Level Block Diagram
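
As a plain-text companion to the block diagram, the placement stated in this overview can be written down as a lookup table. This is a minimal sketch: the `Fabric` enum, the `PLACEMENT` table, and `modules_on` are illustrative names, not part of any Xilinx tool. Modules whose placement this overview does not state are marked as such instead of being guessed.

```python
from enum import Enum
from typing import List


class Fabric(Enum):
    AI_ENGINE = "AI Engine"
    PL = "programmable logic"
    NOT_STATED = "not stated in this overview"


# Placement as described in the text; unspecified modules are not guessed.
PLACEMENT = {
    "convolutional computing unit": Fabric.AI_ENGINE,
    "control and memory access unit": Fabric.PL,
    "memory pool": Fabric.PL,
    "scheduler": Fabric.NOT_STATED,
    "instruction fetch unit": Fabric.NOT_STATED,
}


def modules_on(fabric: Fabric) -> List[str]:
    """Return the module names placed on the given fabric, sorted by name."""
    return sorted(name for name, where in PLACEMENT.items() if where is fabric)


if __name__ == "__main__":
    for fabric in Fabric:
        print(f"{fabric.value}: {', '.join(modules_on(fabric))}")
```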