C++ APIs - 1.4 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-07-22
Version: 1.4 English

The following Vitis AI advanced low-level C++ programming APIs are briefly summarized. A usage sketch follows the routine list.

Name

libn2cube.so

Description

DPU runtime library

Routines

dpuOpen()
Open and initialize the DPU device
dpuClose()
Close and finalize the DPU device
dpuLoadKernel()
Load a DPU Kernel and allocate DPU memory space for its Code/Weight/Bias segments
dpuDestroyKernel()
Destroy a DPU Kernel and release its associated resources
dpuCreateTask()
Instantiate a DPU Task from a DPU Kernel, allocate its private working memory buffer, and prepare its execution context
dpuRunTask()
Launch the execution of a DPU Task
dpuDestroyTask()
Remove a DPU Task, release its working memory buffer, and destroy its associated execution context
dpuSetTaskPriority()
Dynamically set a DPU Task's priority to a specified value at runtime. Priorities range from 0 (the highest priority) to 15 (the lowest priority). If not specified, the priority of a DPU Task is 15 by default.
dpuGetTaskPriority()
Retrieve a DPU Task's priority.
dpuSetTaskAffinity()
Dynamically set a DPU Task's affinity over DPU cores at runtime. If not specified, a DPU Task can run over all the available DPU cores by default.
dpuGetTaskAffinity()
Retrieve a DPU Task's affinity over DPU cores.
dpuEnableTaskDebug()
Enable the dump facility of a DPU Task while it is running, for debugging purposes
dpuEnableTaskProfile()
Enable the profiling facility of a DPU Task while it is running to collect its performance metrics
dpuGetTaskProfile()
Get the execution time of a DPU Task
dpuGetNodeProfile()
Get the execution time of a DPU Node
dpuGetInputTensorCnt()
Get the total number of input Tensors of one DPU Task
dpuGetInputTensor()
Get input Tensor of one DPU Task
dpuGetInputTensorAddress()
Get the start address of one DPU Task’s input Tensor
dpuGetInputTensorSize()
Get the size (in bytes) of one DPU Task’s input Tensor
dpuGetInputTensorScale()
Get the scale value of one DPU Task’s input Tensor
dpuGetInputTensorHeight()
Get the height dimension of one DPU Task’s input Tensor
dpuGetInputTensorWidth()
Get the width dimension of one DPU Task’s input Tensor
dpuGetInputTensorChannel()
Get the channel dimension of one DPU Task’s input Tensor
dpuGetOutputTensorCnt()
Get the total number of output Tensors of one DPU Task
dpuGetOutputTensor()
Get output Tensor of one DPU Task
dpuGetOutputTensorAddress()
Get the start address of one DPU Task’s output Tensor
dpuGetOutputTensorSize()
Get the size (in bytes) of one DPU Task’s output Tensor
dpuGetOutputTensorScale()
Get the scale value of one DPU Task’s output Tensor
dpuGetOutputTensorHeight()
Get the height dimension of one DPU Task’s output Tensor
dpuGetOutputTensorWidth()
Get the width dimension of one DPU Task’s output Tensor
dpuGetOutputTensorChannel()
Get the channel dimension of one DPU Task’s output Tensor
dpuGetTensorSize()
Get the size of one DPU Tensor
dpuGetTensorAddress()
Get the start address of one DPU Tensor
dpuGetTensorScale()
Get the scale value of one DPU Tensor
dpuGetTensorHeight()
Get the height dimension of one DPU Tensor
dpuGetTensorWidth()
Get the width dimension of one DPU Tensor
dpuGetTensorChannel()
Get the channel dimension of one DPU Tensor
dpuSetInputTensorInCHWInt8()
Set DPU Task’s input Tensor with data stored under Caffe order (channel/height/width) in INT8 format
dpuSetInputTensorInCHWFP32()
Set DPU Task’s input Tensor with data stored under Caffe order (channel/height/width) in FP32 format
dpuSetInputTensorInHWCInt8()
Set DPU Task’s input Tensor with data stored under DPU order (height/width/channel) in INT8 format
dpuSetInputTensorInHWCFP32()
Set DPU Task’s input Tensor with data stored under DPU order (height/width/channel) in FP32 format
dpuGetOutputTensorInCHWInt8()
Get DPU Task’s output Tensor and store the data under Caffe order (channel/height/width) in INT8 format
dpuGetOutputTensorInCHWFP32()
Get DPU Task’s output Tensor and store the data under Caffe order (channel/height/width) in FP32 format
dpuGetOutputTensorInHWCInt8()
Get DPU Task’s output Tensor and store the data under DPU order (height/width/channel) in INT8 format
dpuGetOutputTensorInHWCFP32()
Get DPU Task’s output Tensor and store the data under DPU order (height/width/channel) in FP32 format
dpuRunSoftmax()
Perform softmax calculation on the input elements and save the results to the output memory buffer.
dpuSetExceptionMode()
Set the exception handling mode for edge DPU runtime N2Cube.
dpuGetExceptionMode()
Get the exception handling mode for runtime N2Cube.
dpuGetExceptionMessage()
Get the error message from an error code (always a negative value) returned by N2Cube APIs.
dpuGetInputTotalSize()
Get the total size (in bytes) of a DPU Task’s input memory buffer, which includes all the boundary input tensors.
dpuGetOutputTotalSize()
Get the total size (in bytes) of a DPU Task’s output memory buffer, which includes all the boundary output tensors.
dpuGetBoundaryIOTensor()
Get a DPU Task’s boundary input or output tensor from the specified tensor name. Tensor names are listed by the VAI_C compiler after model compilation.
dpuBindInputTensorBaseAddress()
Bind the specified base physical and virtual addresses of the input memory buffer to a DPU Task. Note that it can only be used for DPU Kernels compiled by VAI_C under split IO mode.
dpuBindOutputTensorBaseAddress()
Bind the specified base physical and virtual addresses of the output memory buffer to a DPU Task. Note that it can only be used for DPU Kernels compiled by VAI_C under split IO mode.
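
These routines are typically used together in a fixed open/load/run/close sequence. The following minimal sketch illustrates that flow; the kernel name "resnet50", the node names "conv1" and "fc1000", and the class count are hypothetical placeholders, and exact argument lists should be verified against n2cube.h for the Vitis AI release in use.

#include <cstdio>
#include <cstdint>
#include <vector>
#include <n2cube.h>

int main() {
    dpuOpen();                                        // open and initialize the DPU device
    DPUKernel *kernel = dpuLoadKernel("resnet50");    // hypothetical kernel name produced by VAI_C
    DPUTask *task = dpuCreateTask(kernel, T_MODE_NORMAL);

    // Feed pre-processed FP32 data in DPU (height/width/channel) order into the input node.
    // For the INT8 input Tensor, the byte size equals the element count.
    std::vector<float> input(dpuGetInputTensorSize(task, "conv1"));
    /* ... fill 'input' with normalized pixel data ... */
    dpuSetInputTensorInHWCFP32(task, "conv1", input.data(), input.size());

    dpuRunTask(task);                                 // launch DPU execution

    // Read back INT8 results and run softmax on the CPU using the output scale.
    int size = dpuGetOutputTensorSize(task, "fc1000");
    int8_t *logits = dpuGetOutputTensorAddress(task, "fc1000");
    float scale = dpuGetOutputTensorScale(task, "fc1000");
    std::vector<float> prob(size);
    dpuRunSoftmax(logits, prob.data(), size, 1, scale);
    std::printf("probability of class 0: %f\n", prob[0]);

    dpuDestroyTask(task);                             // release the Task and its working buffers
    dpuDestroyKernel(kernel);                         // release the Kernel and its resources
    dpuClose();                                       // finalize the DPU device
    return 0;
}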

Include File

n2cube.h
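
All of the routines above are declared in n2cube.h. As a further illustration, the sketch below queries an input Tensor's geometry and quantization scale through the DPUTensor accessors, for example to size a pre-processing buffer; the node name "conv1" is again a hypothetical placeholder.

#include <cstdio>
#include <n2cube.h>

// Print the geometry and scale of a Task's input Tensor.
void printInputGeometry(DPUTask *task) {
    DPUTensor *in = dpuGetInputTensor(task, "conv1");  // handle to the input Tensor
    int height = dpuGetTensorHeight(in);                // height dimension
    int width = dpuGetTensorWidth(in);                  // width dimension
    int channel = dpuGetTensorChannel(in);              // channel dimension
    int size = dpuGetTensorSize(in);                    // size in bytes of the INT8 data
    float scale = dpuGetTensorScale(in);                 // quantization scale factor
    std::printf("input: %dx%dx%d, %d bytes, scale %f\n", height, width, channel, size, scale);
}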