Unique Memory Model - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

For each DPU task in this mode, all of its boundary input and output tensors, together with its intermediate feature maps, reside within one physically contiguous memory buffer. This buffer is allocated automatically when dpuCreateTask() is called to instantiate a DPU task from a DPU kernel. The DPU memory buffer can be cached to optimize memory access from the Arm® CPU. Because cache flushing and invalidation are handled by N2Cube, you do not need to manage DPU memory or manipulate the cache yourself. Deploying models with the unique memory model is therefore straightforward, and this is the approach used by most of the Vitis™ AI samples.
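The lifecycle described above can be sketched as follows. This is a minimal sketch, not a complete application: it assumes the legacy N2Cube/DNNDK C API, and the kernel name "resnet50" is a hypothetical example. The physically contiguous buffer holding the boundary tensors and intermediate feature maps is allocated inside dpuCreateTask().

```c
#include <dnndk/dnndk.h>   /* N2Cube / DNNDK API header */

int main(void) {
    /* Attach to the DPU device and driver. */
    dpuOpen();

    /* Load a DPU kernel; "resnet50" is a hypothetical kernel name. */
    DPUKernel *kernel = dpuLoadKernel("resnet50");

    /* dpuCreateTask() allocates the single physically contiguous
     * memory buffer that holds this task's boundary input/output
     * tensors and all of its intermediate feature maps. */
    DPUTask *task = dpuCreateTask(kernel, T_MODE_NORMAL);

    /* ...copy pre-processed Int8 data into the boundary input
     * tensor(s), then launch the task... */
    dpuRunTask(task);

    /* N2Cube handles cache flushing and invalidation internally,
     * so no explicit cache maintenance is needed here. */

    dpuDestroyTask(task);
    dpuDestroyKernel(kernel);
    dpuClose();
    return 0;
}
```

Note that this sketch requires the DPU runtime library and a DPU-enabled platform to build and run.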

With the unique memory model, you must copy the pre-processed Int8 input data into the boundary input tensors of the DPU task's memory buffer before launching the task. This copy can introduce additional overhead, because in some situations the pre-processed Int8 data already resides in a physically contiguous memory buffer that the DPU could access directly. One example is a camera-based deep learning application: the pre-processing of each input image from the camera sensor, such as image scaling, normalization, and Float32-to-Int8 quantization, can be accelerated by FPGA logic, and the resulting data is written into a physically contiguous memory buffer. With the unique memory model, this data must nevertheless be copied into the DPU input memory buffer again.
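The copy step described above can be sketched as follows, assuming the N2Cube C API. The input node name "data" and the helper function name are hypothetical; the pre-processed buffer would come from your own pre-processing pipeline (for example, FPGA logic).

```c
#include <stdint.h>
#include <string.h>
#include <dnndk/dnndk.h>

/* Copy pre-processed Int8 data into the boundary input tensor of a
 * DPU task. "data" is a hypothetical input node name; feed_input is
 * a hypothetical helper, not part of the N2Cube API. */
void feed_input(DPUTask *task, const int8_t *preprocessed) {
    /* Locate the input tensor inside the task's single physically
     * contiguous DPU memory buffer. */
    int8_t *in = dpuGetInputTensorAddress(task, "data");
    int size   = dpuGetInputTensorSize(task, "data");

    /* Even if 'preprocessed' already resides in a physically
     * contiguous buffer that the DPU could reach directly, the
     * unique memory model still requires this extra copy into the
     * DPU-owned buffer before dpuRunTask() is called. */
    memcpy(in, preprocessed, (size_t)size);
}
```

This extra memcpy is exactly the overhead the paragraph above refers to; the split-IO memory model exists to avoid it.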