The XIR-based compiler takes a quantized TensorFlow or Caffe model as input. First, it transforms the input model into the XIR format, which serves as the foundation for the subsequent steps. Most of the differences among frameworks are eliminated at this stage and mapped to a unified representation in XIR. The compiler then applies various optimizations to the graph and partitions it into several subgraphs according to whether each operation can be executed on the DPU. Architecture-aware optimizations are applied to each subgraph as required. For each DPU subgraph, the compiler generates the instruction stream and attaches it to that subgraph. Finally, the optimized graph, together with the information and instructions that VART needs, is serialized into a compiled xmodel file.
The XIR-based compiler supports the DPUCZDX8G series on the Edge Zynq UltraScale+ MPSoC platforms, DPUCADF8H on the Alveo platform, DPUCAHX8H on the Alveo HBM platform optimized for high-throughput applications, DPUCVDX8G on the Versal Edge platform, and DPUCVDX8H on the Versal Cloud platform. You can find the arch.json files for these platforms in /opt/vitis_ai/compiler/arch.
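To see which targets a given installation provides, the directory above can be searched for arch.json files. The following is a minimal sketch: the function name list_arch_files is hypothetical, and it assumes only that the files are named arch.json somewhere below the directory mentioned in the text.

```shell
# List every arch.json below an arch root directory (hypothetical helper).
# The default root is the path named in the text. The layout below it is
# not assumed, so find searches all nesting levels.
list_arch_files() {
    root="${1:-/opt/vitis_ai/compiler/arch}"
    if [ ! -d "$root" ]; then
        echo "arch directory not found: $root" >&2
        return 1
    fi
    find "$root" -type f -name arch.json | sort
}
```

Calling list_arch_files with no argument prints one path per line. Any of the printed paths can then be passed to the -a option of the compile commands.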
Steps to compile Caffe or TensorFlow models with VAI_C are the same as for the previous DPUs. It is assumed that you have successfully installed the Vitis AI package including VAI_C and compressed your model with the vai_quantizer.
For Caffe, vai_q_caffe generates a prototxt (deploy.prototxt) and a model (deploy.caffemodel). Ensure that you specify the -keep_fixed_neuron option for vai_q_caffe because it is essential for the XIR-based compiler. Run the following command to get the compiled xmodel.
vai_c_caffe -p /PATH/TO/deploy.prototxt -c /PATH/TO/deploy.caffemodel -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
The compiler creates three files in the OUTPUTPATH directory. netname_org.xmodel is the pre-compiled xmodel generated by the compiler. netname.xmodel is the compiled xmodel, which contains the instructions and other necessary information. meta.json is for the Vitis AI runtime.
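After a compile run, it can be useful to confirm that all three files are present before handing the result to the runtime. The sketch below checks only for the file names listed above. The function name check_compile_outputs is hypothetical and is not part of VAI_C.

```shell
# Verify that the three output files described above exist (hypothetical helper).
# $1 = output directory passed to -o, $2 = net name passed to -n
check_compile_outputs() {
    outdir="$1"
    netname="$2"
    status=0
    for f in "${netname}_org.xmodel" "${netname}.xmodel" "meta.json"; do
        if [ ! -f "$outdir/$f" ]; then
            echo "missing: $outdir/$f" >&2
            status=1
        fi
    done
    # 0 when all three files exist, 1 when at least one is missing
    return "$status"
}
```

For example, check_compile_outputs /OUTPUTPATH netname returns a non-zero status and names each missing file on standard error.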
For TensorFlow, vai_q_tensorflow generates two pb files. Of these, quantize_eval_model.pb is the input file for the XIR-based compiler. The compilation command is as follows.
vai_c_tensorflow -f /PATH/TO/quantize_eval_model.pb -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
The outputs are the same as for Caffe.
Sometimes, the TensorFlow model does not contain input tensor shape information, which might cause the compilation to fail. In that case, you can specify the input tensor shape with an extra option.
For TensorFlow 2.x, the quantizer generates the quantized model in the hdf5 format.
vai_c_tensorflow2 -m /PATH/TO/quantized.h5 -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
Currently, vai_c_tensorflow2 supports only the Keras functional API.
For PyTorch, the quantizer NNDCT outputs the quantized model in the XIR format directly. Use vai_c_xir to compile it.
vai_c_xir -x /PATH/TO/quantized.xmodel -a /PATH/TO/arch.json -o /OUTPUTPATH -n netname
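The four compile commands in this section differ only in the command name and the input-file options, so the choice can be made from the framework name. The following sketch only prints the command for review and does not run the compiler. The function name vai_c_command and the framework keywords (caffe, tensorflow, tensorflow2, pytorch) are hypothetical. The command names and options are the ones shown above.

```shell
# Print the compile command for a framework (hypothetical helper).
# $1 = framework, $2 = model file, $3 = arch.json, $4 = output dir, $5 = net name
# For Caffe, $2 is the prototxt and $6 is the caffemodel.
# Paths are printed unquoted, so paths containing spaces would need quoting.
vai_c_command() {
    fw="$1"; model="$2"; arch="$3"; out="$4"; name="$5"
    case "$fw" in
        caffe)
            if [ -z "${6:-}" ]; then
                echo "caffe needs a caffemodel as the 6th argument" >&2
                return 1
            fi
            echo "vai_c_caffe -p $model -c $6 -a $arch -o $out -n $name" ;;
        tensorflow)
            echo "vai_c_tensorflow -f $model -a $arch -o $out -n $name" ;;
        tensorflow2)
            echo "vai_c_tensorflow2 -m $model -a $arch -o $out -n $name" ;;
        pytorch)
            echo "vai_c_xir -x $model -a $arch -o $out -n $name" ;;
        *)
            echo "unknown framework: $fw" >&2
            return 1 ;;
    esac
}
```

For example, vai_c_command pytorch /PATH/TO/quantized.xmodel /PATH/TO/arch.json /OUTPUTPATH netname prints the vai_c_xir command shown above.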