The Xilinx® DPUCZDX8G is a programmable engine optimized for convolutional neural networks. It is composed of a high-performance scheduler module, a hybrid computing array module, an instruction fetch unit module, and a global memory pool module. The DPUCZDX8G is a microcoded engine with a specialized instruction set that allows many convolutional neural networks to be implemented efficiently. Networks that have been deployed include VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, and FPN, among others.
The DPUCZDX8G IP is implemented in the programmable logic (PL) of the selected Zynq® UltraScale+™ MPSoC device with direct connections to the processing system (PS). The DPUCZDX8G executes compiled microcode generated from a neural network graph, and requires accessible memory locations for input images as well as temporary and output data. A program running on the application processing unit (APU) is also required to service interrupts and coordinate data transfers.
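The division of work described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the DPUCZDX8G driver or runtime interface: `ToyDpu`, `run_on_host`, the memory region names, and the two "instructions" are all hypothetical and exist only to show the flow of staging input data, executing microcode against shared memory, and servicing the completion interrupt on the host side.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical toy model. None of these names come from the DPUCZDX8G IP,
# its drivers, or its tools; they only mirror the roles described in the text.
Memory = Dict[str, List[int]]
Instruction = Callable[[Memory], None]


@dataclass
class ToyDpu:
    """Stand-in for the PL-side engine: runs 'microcode' over shared memory."""
    memory: Memory
    irq_pending: bool = False

    def start(self, microcode: List[Instruction]) -> None:
        # Each "instruction" reads and writes named regions of shared memory.
        for instruction in microcode:
            instruction(self.memory)
        # Signal completion to the host program.
        self.irq_pending = True


def scale_input(mem: Memory) -> None:
    # Writes an intermediate result to the temporary region.
    mem["temp"] = [2 * x for x in mem["input"]]


def relu_to_output(mem: Memory) -> None:
    # Reads the temporary region and writes the final result.
    mem["output"] = [max(0, x) for x in mem["temp"]]


def run_on_host(image: List[int]) -> List[int]:
    """Stand-in for the APU program: stage data, start the engine,
    service the interrupt, and collect the output."""
    memory: Memory = {"input": list(image), "temp": [], "output": []}
    dpu = ToyDpu(memory)
    dpu.start([scale_input, relu_to_output])
    if not dpu.irq_pending:
        raise RuntimeError("engine did not signal completion")
    dpu.irq_pending = False  # acknowledge the interrupt
    return memory["output"]
```

In this model, `run_on_host([-1, 2, 3])` returns `[0, 4, 6]`. The real engine runs asynchronously and raises a hardware interrupt; the synchronous call here is a simplification chosen to keep the sketch short.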
The top-level block diagram of the DPUCZDX8G is shown in the following figure.
- APU - Application Processing Unit
- PE - Processing Engine
- DPU - Deep Learning Processing Unit