This section describes how to deploy the quantized ONNX model on the Edge board.
In Vitis AI 3.0, the Vitis AI ONNX Runtime Engine (VOE) is supported, and a Vitis AI execution provider (EP) is provided to accelerate inference with the Xilinx DPU. The following figure gives an overview of the ONNX Runtime Engine in Vitis AI.
Figure 1. ONNX Runtime Overview

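Because VOE plugs into a standard ONNX Runtime build as an execution provider, application code can confirm that the Vitis AI EP is present before creating a session. The following is a minimal sketch; it assumes a VOE-enabled onnxruntime installation, and the provider string VitisAIExecutionProvider is the upstream ONNX Runtime name, which may differ in your build.

#include <onnxruntime_cxx_api.h>

#include <iostream>

int main() {
  // List the execution providers registered in this ONNX Runtime build;
  // the Vitis AI EP (upstream name: "VitisAIExecutionProvider") should
  // appear when the VOE package is installed.
  for (const auto& provider : Ort::GetAvailableProviders())
    std::cout << provider << "\n";
  return 0;
}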
Vitis AI 3.0 provides more than ten deployment examples based on the ONNX Runtime. You can find them in the samples_onnx folder. To deploy an ONNX model using Vitis AI, follow these steps:
- Git clone the corresponding Vitis AI Library from https://github.com/Xilinx/Vitis-AI.
- Install the cross-compilation system on the host side. Refer to Installation for instructions.
- Prepare the quantized model in ONNX format. Use the Vitis AI Quantizer to quantize the model and output the quantized model in the ONNX format.
- Download the ONNX Runtime package vitis_ai_2022.2-r3.0.0.tar.gz and install it on the target board.
tar -xzvf vitis_ai_2022.2-r3.0.0.tar.gz -C /
- Use the ONNX Runtime API to create the application program. The C++ APIs of ONNX Runtime are supported. To learn more about the ONNX Runtime API, see the ONNX Runtime API docs. The programming flow is shown below; the helper functions it references (CheckStatus, preprocess_resnet50, and postprocess_resnet50) are sketched after this procedure.
//Create a session
//Select a set of execution providers (EPs); here "VITISAI_EP" is selected
Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "resnet50_pt");
auto session_options = Ort::SessionOptions();
session_options.EnableProfiling("profile_resnet50_pt");
CheckStatus(OrtSessionOptionsAppendExecutionProvider_VITISAI(
    session_options, json_config.c_str()));
auto session = Ort::Experimental::Session(env, model_name, session_options);

//Do the pre-process and set the input
preprocess_resnet50(g_image_files, input_tensor_values, input_shape);
std::vector<Ort::Value> input_tensors;
input_tensors.push_back(Ort::Experimental::Value::CreateTensor<float>(
    input_tensor_values.data(), input_tensor_values.size(), input_shape));

//Run the session
auto output_tensors =
    session.Run(session.GetInputNames(), input_tensors, session.GetOutputNames());

//Get the output and do the post-process
auto output_shape = output_tensors[0].GetTensorTypeAndShapeInfo().GetShape();
postprocess_resnet50(g_image_files, output_tensors[0]);
- Create a build.sh file as shown below, or copy one from the Vitis AI Library ONNX examples and modify it.
# Use opencv4 if pkg-config knows it; otherwise fall back to opencv.
result=0 && pkg-config --list-all | grep opencv4 && result=1
if [ $result -eq 1 ]; then
  OPENCV_FLAGS=$(pkg-config --cflags --libs-only-L opencv4)
else
  OPENCV_FLAGS=$(pkg-config --cflags --libs-only-L opencv)
fi
# Vitis AI, ONNX Runtime, and OpenCV flags; -I= and -L= are sysroot-relative.
lib_x=" -lglog -lunilog -lvitis_ai_library-xnnpp -lvitis_ai_library-model_config -lprotobuf -lxrt_core -lvart-xrt-device-handle -lvaip-core -lxcompiler-core -labsl_city -labsl_low_level_hash -lvart-dpu-controller -lxir -lvart-util -ltarget-factory -ljson-c"
lib_onnx=" -lonnxruntime"
lib_opencv=" -lopencv_videoio -lopencv_imgcodecs -lopencv_highgui -lopencv_imgproc -lopencv_core "
inc_x=" -I=/usr/include/onnxruntime -I=/install/Release/include/onnxruntime -I=/install/Release/include -I=/usr/include/xrt "
link_x=" -L=/install/Release/lib"
# The binary is named after the current directory: <dir>_onnx.
name=$(basename $PWD)
CXX=${CXX:-g++}
$CXX -O2 -fno-inline -I. \
  ${inc_x} \
  ${link_x} \
  -o ${name}_onnx -std=c++17 \
  $PWD/${name}_onnx.cpp \
  ${OPENCV_FLAGS} \
  ${lib_opencv} \
  ${lib_x} \
  ${lib_onnx}
- Cross-compile the program.
sh -x build.sh
- Copy the executable program and the quantized ONNX model to the target board using the scp command.
- Execute the program on the target board. Before running the program, ensure that the Vitis AI Library is installed on the target board, and prepare the images you want to test.
./resnet50_onnx <Onnx model> <image>
Note: For the ONNX model deployment, the input model is the quantized ONNX model. The model is compiled online when you run the program.
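The programming flow in the steps above relies on a status-check helper and on model-specific pre- and post-processing routines that each sample defines for itself. The following is a minimal sketch of what these helpers might look like, assuming OpenCV for image I/O and a 224x224, NCHW, float32 ResNet-50 input; the normalization constants and the top-1 reporting are illustrative, not the exact code from samples_onnx.

#include <onnxruntime_cxx_api.h>
#include <opencv2/opencv.hpp>

#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Abort with a readable message if an ONNX Runtime C-API call fails.
void CheckStatus(OrtStatus* status) {
  if (status != nullptr) {
    const OrtApi& api = Ort::GetApi();
    std::cerr << "ONNX Runtime error: " << api.GetErrorMessage(status) << "\n";
    api.ReleaseStatus(status);
    std::exit(EXIT_FAILURE);
  }
}

// Read each image, resize it to the model input size, and append normalized
// CHW float values to a flat buffer (mean/scale values are illustrative).
void preprocess_resnet50(const std::vector<std::string>& image_files,
                         std::vector<float>& input_tensor_values,
                         std::vector<int64_t>& input_shape) {
  const int height = 224, width = 224;
  input_shape = {static_cast<int64_t>(image_files.size()), 3, height, width};
  input_tensor_values.clear();
  const float mean[3] = {103.53f, 116.28f, 123.675f};  // BGR, illustrative
  const float scale[3] = {0.017429f, 0.017507f, 0.017125f};
  for (const auto& file : image_files) {
    cv::Mat image = cv::imread(file);
    cv::resize(image, image, cv::Size(width, height));
    for (int c = 0; c < 3; ++c)  // HWC (BGR) -> CHW
      for (int h = 0; h < height; ++h)
        for (int w = 0; w < width; ++w)
          input_tensor_values.push_back(
              (image.at<cv::Vec3b>(h, w)[c] - mean[c]) * scale[c]);
  }
}

// Report the class with the highest score for each image in the batch.
void postprocess_resnet50(const std::vector<std::string>& image_files,
                          Ort::Value& output_tensor) {
  auto shape = output_tensor.GetTensorTypeAndShapeInfo().GetShape();
  const int64_t batch = shape[0], classes = shape[1];
  const float* scores = output_tensor.GetTensorMutableData<float>();
  for (int64_t i = 0; i < batch; ++i) {
    const float* row = scores + i * classes;
    const int64_t top1 = std::max_element(row, row + classes) - row;
    std::cout << image_files[i] << ": class " << top1 << "\n";
  }
}

In the actual samples, post-processing typically also maps the class index to a label string. The correct input layout and normalization come from the model's own preprocessing recipe, so adapt these helpers to your model.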