Starting with Vitis AI 2.5, PyTorch and TensorFlow 2 models with custom OPs are supported. The basic custom OP workflow is shown below.
Figure 1. Custom Op Workflow
The following are the steps in the workflow:
- Define the unsupported OP as a custom OP that is unknown to XIR, then quantize the model.
- Compile the quantized model.
- Register and implement the custom OP.
- Deploy the model with graph_runner APIs.
Note: To implement an accelerated (PL or AI Engine) function for a custom OP, register it as a CPU OP, but place the PL/AI Engine invocation code inside that CPU OP's implementation.
In step 4, the graph_runner APIs support both C++ and Python. When the graph_runner API is used to deploy a model with custom OPs, the runtime is optimized with zero-copy technology between DPU OPs and CPU OPs: adjacent layers share buffer addresses instead of copying data between them.
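The deployment sequence follows a create runner / fill input buffers / execute / wait pattern. Because a real run needs a compiled .xmodel and DPU hardware, the stub class below only mimics that calling sequence; it is a sketch of the pattern, not the vitis_ai_library implementation, and its buffer handling is simplified to plain Python lists.

```python
# Illustrative stand-in for a graph runner. The method names mirror the
# calling pattern of the graph_runner API (get inputs/outputs, execute
# asynchronously, wait for the job); the internals are purely for demo.
class StubGraphRunner:
    def __init__(self):
        self._inputs = [[0.0] * 4]   # one input tensor buffer
        self._outputs = [[0.0] * 4]  # one output tensor buffer

    def get_inputs(self):
        return self._inputs

    def get_outputs(self):
        return self._outputs

    def execute_async(self, inputs, outputs):
        # A real runner would dispatch DPU subgraphs and registered CPU
        # OPs; here we just copy input to output to exercise the flow.
        outputs[0][:] = inputs[0]
        return 0  # job id

    def wait(self, job_id):
        return 0  # status: ok


def run_once(runner, data):
    """Drive one inference through the runner's calling sequence."""
    inputs = runner.get_inputs()
    outputs = runner.get_outputs()
    inputs[0][:] = data                      # fill the input buffer
    job_id = runner.execute_async(inputs, outputs)
    runner.wait(job_id)                      # block until the job completes
    return outputs[0]
```

With a real runner, only the construction line changes; the fill/execute/wait loop stays the same in both C++ and Python.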
The following model structures support zero copy:

Type | Producer OP(s) | Consumer OP(s) | Zero Copy |
---|---|---|---|
a | Single DPU OP | Single CPU OP | Yes |
b | Single CPU OP | Single DPU OP | Yes |
c | Single CPU OP | Single CPU OP | Yes |
d | Single DPU OP | Multiple CPU OPs | Yes |
e | Multiple CPU OPs and multiple DPU OPs | Single CPU OP | Yes |
Note: Model structure types a-e are shown in the figure below.
Figure 2. Model Structure Types
Note: Whether zero copy applies to other model structures depends on the specific graph.
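What "address sharing between layers without data copying" means can be shown with two views over one buffer. This is only an analogy in plain Python, not the runtime's actual buffer management: the producer OP's output buffer and the consumer OP's input buffer are the same memory, so handing data across the boundary costs nothing.

```python
# Zero-copy analogy: one shared buffer, two views. The "DPU OP" writes
# through its output view; the "CPU OP" reads the same bytes through its
# input view. No data is copied between the two OPs.
shared = bytearray(8)               # buffer registered for both OPs
producer_out = memoryview(shared)   # output view of the upstream OP
consumer_in = memoryview(shared)    # input view of the downstream OP

producer_out[0] = 42                # upstream OP writes its result
# the downstream OP observes the value immediately through its own view,
# because both views reference the same underlying memory
```

By contrast, a non-zero-copy handoff would allocate a second buffer and copy the producer's output into it before the consumer runs.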
The following sections give one example for each framework:
- MNIST model based on TensorFlow 2
- PointPillars model based on PyTorch