The DPUCAHX8L IP is a new general-purpose CNN accelerator optimized for HBM cards, such as the Alveo U50, U50LV, and U280 cards, and designed for low-latency applications. It has a new low-latency DPU micro-architecture with an HBM memory subsystem and supports MAC arrays from 4 TOPS to 5.3 TOPS. It uses back-to-back convolution and depthwise convolution engines to increase computing parallelism, and a hierarchical memory system of UltraRAM and HBM to maximize data-movement efficiency. For this low-latency DPU IP, the Vitis AI compiler supports the super layer interface and many new compilation strategies for kernel fusion and graph partitioning.
Figure 1. DPUCAHX8L Architecture
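
The convolution and depthwise convolution engines mentioned above handle layer types with very different arithmetic cost. The following minimal sketch counts multiply-accumulate (MAC) operations for each layer type. It is generic arithmetic, not a model of the DPUCAHX8L engines; the function names and the 56x56x64 layer shape are hypothetical and chosen only for illustration.

```python
def conv_macs(h_out, w_out, c_in, c_out, k):
    """MACs for a standard k x k convolution.

    Every output element (h_out * w_out * c_out of them) sums over
    all c_in input channels and a k x k window.
    """
    return h_out * w_out * c_out * c_in * k * k


def depthwise_macs(h_out, w_out, channels, k):
    """MACs for a k x k depthwise convolution.

    Each channel is filtered independently, so there is no sum
    across input channels.
    """
    return h_out * w_out * channels * k * k


# Hypothetical layer shape (assumption, not taken from the DPU documentation).
std = conv_macs(h_out=56, w_out=56, c_in=64, c_out=64, k=3)
dw = depthwise_macs(h_out=56, w_out=56, channels=64, k=3)

print(f"standard conv MACs:  {std}")
print(f"depthwise conv MACs: {dw}")
print(f"ratio: {std // dw}x")
```

For equal input and output channel counts, the standard convolution needs `c_in` times as many MACs as the depthwise convolution over the same output size.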