In the VAI 2.5 release, PyTorch and TensorFlow 2 models with custom ops are supported. The basic workflow for a custom op is shown below.
- Define the OP as a custom OP that is unknown to XIR, then quantize the model.
- Compile the quantized model.
- Register and implement the custom OP.
- Deploy the model with the graph_runner APIs.
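The register-and-implement step can be illustrated conceptually. The sketch below is a minimal, self-contained Python illustration of the idea only: a registry maps custom-OP type names to CPU implementations, and the runner dispatches ops that the compiler could not map to the DPU. The names (`OP_REGISTRY`, `register_op`, `run_graph`, `hard_sigmoid`) are illustrative assumptions, not the actual Vitis AI registration API.

```python
import numpy as np

# Hypothetical registry mapping custom-OP type names to CPU implementations.
# This mirrors the idea of registering a custom OP so the runtime can
# dispatch to it; it is NOT the real Vitis AI registration mechanism.
OP_REGISTRY = {}

def register_op(op_type):
    """Decorator registering a CPU implementation for a custom-OP type."""
    def wrap(fn):
        OP_REGISTRY[op_type] = fn
        return fn
    return wrap

@register_op("hard_sigmoid")  # example custom OP unknown to the compiler
def hard_sigmoid(x):
    return np.clip(x / 6.0 + 0.5, 0.0, 1.0)

def run_graph(op_types, x):
    """Run a linear chain of ops; each type is looked up in the registry."""
    for op_type in op_types:
        x = OP_REGISTRY[op_type](x)
    return x

result = run_graph(["hard_sigmoid"], np.array([-6.0, 0.0, 6.0]))
print(result)  # [0.  0.5 1. ]
```

In the real flow the registered implementation receives the op's input and output tensor buffers from the runner instead of plain arrays, but the dispatch-by-op-type pattern is the same.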
For step 4, the graph_runner APIs support both C++ and Python. When deploying a model with custom OPs through the graph_runner API, the runtime is optimized with zero-copy between DPU OPs and CPU OPs: adjacent layers share buffer addresses instead of copying data.
The following model structures support zero-copy.
|Type|Output of OP|Input of OP|Using zero-copy|
|---|---|---|---|
|a|Single DPU OP|Single CPU OP|Yes|
|b|Single CPU OP|Single DPU OP|Yes|
|c|Single CPU OP|Single CPU OP|Yes|
|d|Single DPU OP|Multiple CPU OPs|Yes|
|e|Multiple CPU OPs and multiple DPU OPs|Single CPU OP|Yes|
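As a conceptual illustration of zero-copy (not the Vitis AI runtime itself), the following NumPy sketch contrasts handing a downstream op a view of the producer's output buffer with handing it a copy. With a view, both ops address the same memory, so the producer's write is visible to the consumer without any data movement; variable names here are illustrative.

```python
import numpy as np

# Output buffer produced by an upstream (e.g. DPU) op.
producer_out = np.zeros(8, dtype=np.float32)

# Zero-copy hand-off: the consumer op reads the same buffer through a view.
consumer_in_view = producer_out[:]       # shares memory, no data copied
# Copy-based hand-off: the consumer gets its own separate buffer.
consumer_in_copy = producer_out.copy()

producer_out[0] = 1.0  # producer writes its result

print(np.shares_memory(producer_out, consumer_in_view))  # True
print(consumer_in_view[0])  # 1.0 (seen through the shared buffer)
print(consumer_in_copy[0])  # 0.0 (stale copy made before the write)
```

The copy-based hand-off costs an extra buffer and a memcpy per layer boundary, which is exactly what the zero-copy optimization between DPU and CPU OPs avoids.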
The following example models demonstrate the custom-op flow:
- MNIST model based on TensorFlow 2
- PointPillars model based on PyTorch