C++ APIs - 1.4 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-07-22
Version: 1.4 English

The following Vitis AI advanced low-level C++ programming APIs are briefly summarized. A usage sketch follows the routine list.

Name

libn2cube.so

Description

DPU runtime library

Routines

dpuOpen()
Open and initialize the DPU device
dpuClose()
Close and finalize the DPU device
dpuLoadKernel()
Load a DPU Kernel and allocate DPU memory space for its Code/Weight/Bias segments
dpuDestroyKernel()
Destroy a DPU Kernel and release its associated resources
dpuCreateTask()
Instantiate a DPU Task from a DPU Kernel, allocate its private working memory buffer, and prepare its execution context
dpuRunTask()
Launch the execution of a DPU Task
dpuDestroyTask()
Remove a DPU Task, release its working memory buffer, and destroy its associated execution context
dpuSetTaskPriority()
Dynamically set a DPU Task's priority to a specified value at runtime. Priorities range from 0 (the highest priority) to 15 (the lowest priority). If not specified, the priority of a DPU Task is 15 by default.
dpuGetTaskPriority()
Retrieve a DPU Task's priority.
dpuSetTaskAffinity()
Dynamically set a DPU Task's affinity over DPU cores at runtime. If not specified, a DPU Task can run over all the available DPU cores by default.
dpuGetTaskAffinity()
Retrieve a DPU Task's affinity over DPU cores.
dpuEnableTaskDebug()
Enable the dump facility of a DPU Task while it is running, for debugging purposes
dpuEnableTaskProfile()
Enable the profiling facility of a DPU Task while it is running to collect its performance metrics
dpuGetTaskProfile()
Get the execution time of a DPU Task
dpuGetNodeProfile()
Get the execution time of a DPU Node
dpuGetInputTensorCnt()
Get the total number of input Tensors of one DPU Task
dpuGetInputTensor()
Get input Tensor of one DPU Task
dpuGetInputTensorAddress()
Get the start address of one DPU Task’s input Tensor
dpuGetInputTensorSize()
Get the size (in bytes) of one DPU Task’s input Tensor
dpuGetInputTensorScale()
Get the scale value of one DPU Task’s input Tensor
dpuGetInputTensorHeight()
Get the height dimension of one DPU Task’s input Tensor
dpuGetInputTensorWidth()
Get the width dimension of one DPU Task’s input Tensor
dpuGetInputTensorChannel()
Get the channel dimension of one DPU Task’s input Tensor
dpuGetOutputTensorCnt()
Get the total number of output Tensors of one DPU Task
dpuGetOutputTensor()
Get output Tensor of one DPU Task
dpuGetOutputTensorAddress()
Get the start address of one DPU Task’s output Tensor
dpuGetOutputTensorSize()
Get the size (in bytes) of one DPU Task’s output Tensor
dpuGetOutputTensorScale()
Get the scale value of one DPU Task’s output Tensor
dpuGetOutputTensorHeight()
Get the height dimension of one DPU Task’s output Tensor
dpuGetOutputTensorWidth()
Get the width dimension of one DPU Task’s output Tensor
dpuGetOutputTensorChannel()
Get the channel dimension of one DPU Task’s output Tensor
dpuGetTensorSize()
Get the size of one DPU Tensor
dpuGetTensorAddress()
Get the start address of one DPU Tensor
dpuGetTensorScale()
Get the scale value of one DPU Tensor
dpuGetTensorHeight()
Get the height dimension of one DPU Tensor
dpuGetTensorWidth()
Get the width dimension of one DPU Tensor
dpuGetTensorChannel()
Get the channel dimension of one DPU Tensor
dpuSetInputTensorInCHWInt8()
Set DPU Task’s input Tensor with data stored under Caffe order (channel/height/width) in INT8 format
dpuSetInputTensorInCHWFP32()
Set DPU Task’s input Tensor with data stored under Caffe order (channel/height/width) in FP32 format
dpuSetInputTensorInHWCInt8()
Set DPU Task’s input Tensor with data stored under DPU order (height/width/channel) in INT8 format
dpuSetInputTensorInHWCFP32()
Set DPU Task’s input Tensor with data stored under DPU order (height/width/channel) in FP32 format
dpuGetOutputTensorInCHWInt8()
Get DPU Task’s output Tensor and store the data under Caffe order (channel/height/width) in INT8 format
dpuGetOutputTensorInCHWFP32()
Get DPU Task’s output Tensor and store the data under Caffe order (channel/height/width) in FP32 format
dpuGetOutputTensorInHWCInt8()
Get DPU Task’s output Tensor and store the data under DPU order (height/width/channel) in INT8 format
dpuGetOutputTensorInHWCFP32()
Get DPU Task’s output Tensor and store the data under DPU order (height/width/channel) in FP32 format
dpuRunSoftmax()
Perform softmax calculation on the input elements and save the results to the output memory buffer.
dpuSetExceptionMode()
Set the exception handling mode for edge DPU runtime N2Cube.
dpuGetExceptionMode()
Get the exception handling mode for runtime N2Cube.
dpuGetExceptionMessage()
Get the error message from an error code (always a negative value) returned by N2Cube APIs.
dpuGetInputTotalSize()
Get the total size (in bytes) of a DPU Task’s input memory buffer, which includes all the boundary input tensors.
dpuGetOutputTotalSize()
Get the total size (in bytes) of a DPU Task’s output memory buffer, which includes all the boundary output tensors.
dpuGetBoundaryIOTensor()
Get a DPU Task’s boundary input or output tensor from the specified tensor name. Tensor names are listed by the VAI_C compiler after model compilation.
dpuBindInputTensorBaseAddress()
Bind the specified base physical and virtual addresses of the input memory buffer to a DPU Task. Note that it can only be used for DPU Kernels compiled by VAI_C under split IO mode.
dpuBindOutputTensorBaseAddress()
Bind the specified base physical and virtual addresses of the output memory buffer to a DPU Task. Note that it can only be used for DPU Kernels compiled by VAI_C under split IO mode.
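
These routines are typically used together in a fixed open/load/run/close sequence. The following minimal sketch illustrates that flow; the kernel name "resnet50", the node names "conv1" and "fc1000", and the class count are hypothetical placeholders, and exact argument lists should be verified against n2cube.h for the Vitis AI release in use.

#include <cstdio>
#include <cstdint>
#include <vector>
#include <n2cube.h>

int main() {
    dpuOpen();                                        // open and initialize the DPU device
    DPUKernel *kernel = dpuLoadKernel("resnet50");    // hypothetical kernel name produced by VAI_C
    DPUTask *task = dpuCreateTask(kernel, T_MODE_NORMAL);

    // Feed pre-processed FP32 data in DPU (height/width/channel) order into the input node.
    // For the INT8 input Tensor, the byte size equals the element count.
    std::vector<float> input(dpuGetInputTensorSize(task, "conv1"));
    /* ... fill 'input' with normalized pixel data ... */
    dpuSetInputTensorInHWCFP32(task, "conv1", input.data(), input.size());

    dpuRunTask(task);                                 // launch DPU execution

    // Read back INT8 results and run softmax on the CPU using the output scale.
    int size = dpuGetOutputTensorSize(task, "fc1000");
    int8_t *logits = dpuGetOutputTensorAddress(task, "fc1000");
    float scale = dpuGetOutputTensorScale(task, "fc1000");
    std::vector<float> prob(size);
    dpuRunSoftmax(logits, prob.data(), size, 1, scale);
    std::printf("probability of class 0: %f\n", prob[0]);

    dpuDestroyTask(task);                             // release the Task and its working buffers
    dpuDestroyKernel(kernel);                         // release the Kernel and its resources
    dpuClose();                                       // finalize the DPU device
    return 0;
}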

Include File

n2cube.h
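
All of the routines above are declared in n2cube.h. As a further illustration, the sketch below queries an input Tensor's geometry and quantization scale through the DPUTensor accessors, for example to size a pre-processing buffer; the node name "conv1" is again a hypothetical placeholder.

#include <cstdio>
#include <n2cube.h>

// Print the geometry and scale of a Task's input Tensor.
void printInputGeometry(DPUTask *task) {
    DPUTensor *in = dpuGetInputTensor(task, "conv1");  // handle to the input Tensor
    int height = dpuGetTensorHeight(in);                // height dimension
    int width = dpuGetTensorWidth(in);                  // width dimension
    int channel = dpuGetTensorChannel(in);              // channel dimension
    int size = dpuGetTensorSize(in);                    // size in bytes of the INT8 data
    float scale = dpuGetTensorScale(in);                 // quantization scale factor
    std::printf("input: %dx%dx%d, %d bytes, scale %f\n", height, width, channel, size, scale);
}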