Compiling for DPU - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

The XIR-based compiler takes the quantized TensorFlow or Caffe model as its input. First, it transforms the input model into the XIR format, which serves as the foundation for all subsequent processing; most of the variation among the different frameworks is eliminated and mapped onto this unified representation. The compiler then applies various optimizations to the graph and partitions it into several subgraphs according to whether each op can be executed on the DPU. Further architecture-aware optimizations are applied to each subgraph as required. For each DPU subgraph, the compiler generates the instruction stream and attaches it to the subgraph. Finally, the optimized graph, together with the information and instructions VART needs, is serialized into a compiled xmodel file.
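Once a model is compiled, you can inspect this subgraph partitioning yourself. The following is a minimal sketch using the XIR Python API shipped with Vitis AI, assuming a compiled netname.xmodel in the current directory; the "device" attribute distinguishes DPU subgraphs from those that fall back to the CPU:

import xir

# Load the compiled xmodel and walk its child subgraphs in topological order.
graph = xir.Graph.deserialize("netname.xmodel")
root = graph.get_root_subgraph()
for sub in root.toposort_child_subgraph():
    device = sub.get_attr("device") if sub.has_attr("device") else "unknown"
    print(sub.get_name(), "->", device)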

The XIR-based compiler supports the DPUCZDX8G series on the Edge ZCU platforms, DPUCAHX8H on the Alveo HBM platform optimized for high-throughput applications, DPUCAHX8L on the Alveo HBM platform optimized for low-latency applications, DPUCVDX8G on the Versal Edge platform, and DPUCVDX8H on the Versal Cloud platform. You can find the arch.json files for these platforms in /opt/vitis_ai/compiler/arch.
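For example, inside the Vitis AI Docker container you can list the bundled arch.json files with a command such as the following (the exact set of subdirectories depends on your Vitis AI release):

find /opt/vitis_ai/compiler/arch -name arch.json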

The steps to compile Caffe or TensorFlow models with VAI_C are the same as for the previous DPUs. It is assumed that you have successfully installed the Vitis AI package, including VAI_C, and quantized your model with the corresponding vai_q_* quantizer.

Caffe

For Caffe, vai_q_caffe generates a PROTOTXT file (deploy.prototxt) and a MODEL file (deploy.caffemodel). Ensure that you specify the -keep_fixed_neuron option for vai_q_caffe, which is essential for the XIR-based compiler. Run the following command to get the compiled xmodel:

vai_c_caffe -p /PATH/TO/deploy.prototxt -c /PATH/TO/deploy.caffemodel -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
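For reference, the quantization step that produces these two deploy files might look like the following (a sketch based on the vai_q_caffe quantize command; the float model paths are placeholders):

vai_q_caffe quantize -model /PATH/TO/float.prototxt -weights /PATH/TO/float.caffemodel -keep_fixed_neuron -output_dir /PATH/TO/quantize_results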

The compiler creates three files in the OUTPUTPATH directory: netname_org.xmodel is the pre-compiled xmodel generated by the compiler front end, netname.xmodel is the compiled xmodel containing the instructions and other necessary information, and meta.json is used by the runtime.
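After a successful run, the output directory should therefore contain something like the following (an illustrative listing; the file names follow the -n netname argument):

ls /OUTPUTPATH
meta.json  netname.xmodel  netname_org.xmodel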

TensorFlow

For TensorFlow, vai_q_tensorflow generates two pb files. Of these, quantize_eval_model.pb is the input file for the XIR-based compiler. The compilation command is similar:

vai_c_tensorflow -f /PATH/TO/quantize_eval_model.pb -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname

The outputs are the same as for Caffe.

Sometimes the TensorFlow model does not contain input tensor shape information, which causes the compilation to fail. You can specify the input tensor shape with an extra option such as --options '{"input_shape": "1,224,224,3"}', as in the example below.
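The full command with an explicit input shape then looks like this (the shape string here is illustrative; use your own model's N,H,W,C input shape):

vai_c_tensorflow -f /PATH/TO/quantize_eval_model.pb -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname --options '{"input_shape": "1,224,224,3"}'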

TensorFlow 2.x

For TensorFlow 2.x, the quantizer generates the quantized model in HDF5 format. Compile it with the following command:

vai_c_tensorflow2 -m /PATH/TO/quantized.h5 -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname

Currently, vai_c_tensorflow2 supports only models built with the Keras functional API; the Keras Sequential API will be supported in future releases.
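For reference, a functional-API model is one built by calling layers on tensors and wrapping the result in tf.keras.Model, as in this minimal sketch (standard Keras code; the layer choices are illustrative only):

import tensorflow as tf

# Functional API: layers are applied to tensors, then wrapped in a Model.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

A Sequential model built with tf.keras.Sequential([...]) would need to be rewritten in this style before quantization and compilation.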

PyTorch

For PyTorch, the NNDCT quantizer outputs the quantized model in the XIR format directly. Use vai_c_xir to compile it:

vai_c_xir -x /PATH/TO/quantized.xmodel -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
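For context, the xmodel passed to vai_c_xir is produced by the NNDCT quantizer's export step. The following is a minimal sketch of that flow, assuming the pytorch_nndct package from the Vitis AI PyTorch quantizer; the API names reflect the 1.3-era quantizer and should be checked against your installed version:

import torch
from torchvision.models import resnet18  # illustrative float model
from pytorch_nndct.apis import torch_quantizer

model = resnet18().eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Calibration pass: run the quantized model over calibration data
# so the quantizer can collect activation statistics.
quantizer = torch_quantizer("calib", model, (dummy_input,))
quant_model = quantizer.quant_model
quant_model(dummy_input)  # replace with a loop over a real calibration set
quantizer.export_quant_config()

# Deployment pass: re-run in test mode, then export the XIR xmodel.
quantizer = torch_quantizer("test", model, (dummy_input,))
quant_model = quantizer.quant_model
quant_model(dummy_input)
quantizer.export_xmodel(output_dir="quantize_result")

The exported file (named something like ResNet_int.xmodel under the output directory) is what you then pass to vai_c_xir as shown above.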