Using Advanced APIs - 1.1 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2020-03-23
Version: 1.1 English

For the edge DPU, you can use the Vitis AI unified APIs to develop deep learning applications. Alternatively, you can adopt the advanced low-level APIs to flexibly meet the requirements of various scenarios. Note that using these advanced APIs requires the legacy DNNDK N2Cube runtime. For more details on advanced API usage, see Advanced Programming Interface.

To use the Vitis AI advanced low-level APIs, perform the following operations:

  1. Call APIs to manage DPU kernels and tasks.
    • DPU kernel creation and destruction
    • DPU task creation and destruction
    • DPU input and output tensor manipulation
  2. Deploy layers/operators that the DPU does not support on the CPU side.
  3. Implement pre-processing to feed input data to the DPU and implement post-processing to consume output data from the DPU.
    /* DNNDK N2Cube runtime header (path as used by the DNNDK samples) */
    #include <dnndk/dnndk.h>

    /* Runs image classification on the task; defined elsewhere */
    void runResnet50(DPUTask* task);

    int main(void) {
        /* DPU Kernel/Task for running ResNet-50 */
        DPUKernel* kernel;
        DPUTask* task;

        /* Attach to DPU device and prepare for running */
        dpuOpen();

        /* Create DPU Kernel for ResNet-50 */
        kernel = dpuLoadKernel("resnet50");

        /* Create DPU Task for ResNet-50 */
        task = dpuCreateTask(kernel, 0);

        /* Run DPU Task for ResNet-50 */
        runResnet50(task);

        /* Destroy DPU Task & release resources */
        dpuDestroyTask(task);

        /* Destroy DPU Kernel & release resources */
        dpuDestroyKernel(kernel);

        /* Detach DPU device & release resources */
        dpuClose();

        return 0;
    }
    

Using ResNet-50 as an example, the code snippet above shows how the DPU kernel and task are managed within the main() function. The operations inside main() include:

  • Call dpuOpen() to open the DPU device.
  • Call dpuLoadKernel() to load the DPU resnet50 kernel.
  • Call dpuCreateTask() to create a task for the DPU kernel.
  • Call dpuDestroyTask() and dpuDestroyKernel(), in that order, to destroy the DPU task and kernel and release their resources.
  • Call dpuClose() to close the DPU device.

The image classification takes place within the runResnet50() function, which performs the following operations:

  1. Fetch an image using the OpenCV function imread() and set it as the input to the DPU kernel resnet50 by calling dpuSetInputImage2() for a Caffe model. For a TensorFlow model, implement the pre-processing yourself (instead of directly using dpuSetInputImage2()) to feed the input image into the DPU.
  2. Call dpuRunTask() to run the task for the ResNet-50 model.
  3. Perform the softmax calculation on the Arm® CPU with the output data from the DPU.
  4. Calculate the top-5 classification category and the corresponding probability.
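    /* Load the image and set it as the DPU input (Caffe model) */
    Mat image = imread(baseImagePath + imageName);
    dpuSetInputImage2(task, INPUT_NODE, image);

    /* Run the ResNet-50 task on the DPU */
    dpuRunTask(task);

    /* Get FC result and convert from INT8 to FP32 format */
    dpuGetOutputTensorInHWCFP32(task, FC_OUTPUT_NODE, FCResult, channel);

    /* Compute softmax on the CPU, then select the top-5 classes */
    CPUCalcSoftmax(FCResult, channel, softmax);
    TopK(softmax, channel, 5, kinds);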
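
The helpers CPUCalcSoftmax() and TopK() called in the snippet are not listed in this section. As a rough illustration of the CPU-side post-processing they stand for, the following is a minimal, self-contained C++ sketch. The function names dequantize(), softmax(), and topK(), their signatures, and the use of std::vector are assumptions made for this example and are not part of the DNNDK API; the scale argument is likewise assumed to be the scaling factor of the DPU output tensor.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not a DNNDK call): convert INT8 fixed-point values
// to FP32 by multiplying with an assumed output-tensor scaling factor.
std::vector<float> dequantize(const std::vector<std::int8_t>& raw, float scale) {
    assert(scale > 0.0f);
    std::vector<float> out(raw.size());
    std::transform(raw.begin(), raw.end(), out.begin(),
                   [scale](std::int8_t v) { return static_cast<float>(v) * scale; });
    return out;
}

// Hypothetical helper: numerically stable softmax. The maximum logit is
// subtracted before exponentiation so that exp() cannot overflow.
std::vector<float> softmax(const std::vector<float>& logits) {
    std::vector<float> probs(logits.size());
    if (logits.empty()) return probs;
    const float maxLogit = *std::max_element(logits.begin(), logits.end());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - maxLogit);
        sum += probs[i];
    }
    for (float& p : probs) p /= sum;
    return probs;
}

// Hypothetical helper: return (class index, probability) pairs for the
// k most probable classes, highest probability first.
std::vector<std::pair<std::size_t, float>> topK(const std::vector<float>& probs,
                                                std::size_t k) {
    std::vector<std::size_t> idx(probs.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    k = std::min(k, idx.size());
    // Only the first k positions need to be ordered.
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&probs](std::size_t a, std::size_t b) {
                          return probs[a] > probs[b];
                      });
    std::vector<std::pair<std::size_t, float>> result;
    for (std::size_t i = 0; i < k; ++i) {
        result.emplace_back(idx[i], probs[idx[i]]);
    }
    return result;
}
```

In the snippet above, dpuGetOutputTensorInHWCFP32() already returns FP32 data, so dequantize() would only be needed if the raw INT8 output were read instead. With FP32 results in hand, a call such as topK(softmax(values), 5) would yield the five most probable class indices together with their probabilities.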