After the model is deployed on the edge DPU, the running results may not be as desired, showing lower accuracy than expected. In this situation, users should first check the model's accuracy after quantization by the Vitis AI quantizer. If that is fine, two suspects remain to be debugged further. One is the deployment source code, which should be checked very carefully. The other is DPU execution itself. This section illustrates how to debug the DPU running result. Normally, it involves the following five steps.
- Run Vitis AI quantizer to generate the golden baseline from the quantized model.
- Build the model as a debug mode DPU kernel with the Vitis AI compiler's debug mode option.
- Before launching the DPU application, run the command dexplorer -mdebug to switch the runtime N2Cube into debug mode, or call dpuEnableTaskDebug() to enable debug mode for the dedicated DPU task only (other tasks are not affected).
- Run the DPU application and get the raw dump data for each node of the DPU task.
- Compare DPU raw dump data with the golden baseline from quantizer.
One sample named debugging is delivered within the Vitis AI package to demonstrate how to debug the DPU. The TensorFlow Inception-v1 model is deployed in this sample, and there are two sub-folders, decent_golden and dpu_deployment. The folder decent_golden holds all the files required to generate the golden baseline, together with the evaluation version model (the deploy version model cannot be used) generated by the quantizer. Run the script decent_dump_golden.sh to dump the golden baseline for the image decent_golden/dataset/images/cropped_224x224.jpg and save it into the folder decent_golden/dump_golden/ using the following command:
DECENT_DEBUG=5 vai_q_caffe test -model quantize_model/quantize_train_test.prototxt \
-weights quantize_model/quantize_train_test.caffemodel \
-test_iter 1 \
2>&1 | tee ./log/dump.log
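The golden baseline produced this way contains text-format Int8 dump files. The Python sketch below is a hypothetical helper, not part of the Vitis AI package; it assumes each file simply lists one decimal Int8 value per line, which may not match the actual dump layout. It parses such a file and packs the values into raw Int8 bytes suitable for feeding a DPU input buffer:

```python
import struct

def load_int8_text_dump(path):
    """Parse a text-format Int8 dump (assumed layout: one decimal
    integer per line) into a list of values in [-128, 127]."""
    values = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            v = int(line)
            if not -128 <= v <= 127:
                raise ValueError(f"value {v} out of Int8 range")
            values.append(v)
    return values

def to_int8_bytes(values):
    """Pack parsed values into raw signed 8-bit bytes, the layout a
    DPU Int8 input buffer expects."""
    return struct.pack(f"{len(values)}b", *values)
```

This keeps the feeding step free of any float pre-processing, matching the sample's goal of isolating the DPU from deployment code.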
With the option --dump fused_graph_info specified to the Vitis AI compiler while compiling the Inception-v1 model, one file named fused_graph_kernel_0.txt is produced together with the DPU kernel dpu_tf_inception_v1_0. The folder
dpu_deployment holds the deployment source code for this sample, in which dpuEnableTaskDump() is used to enable DPU raw data dumping. Going through the code in source file main.cc, notice that pre-processing and post-processing for the Inception-v1 model are not included, which helps isolate the effects of deployment code while debugging the DPU. The file
fused_graph_kernel_0.txt describes the mapping between each DPU node (or super-layer), which may contain several fused layers or operators, and the quantized model's layers or operators, which are divided into two types, in and out. For a Caffe model, the layers' names are identical to those of the original floating-point model. For a TensorFlow model, the operators' names differ slightly from the original floating-point model because the Vitis AI quantizer performs some operator fusion. With the name of the quantized model's layer or operator, users can locate its corresponding dump files in the quantizer golden baseline.
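As an illustration of how such a node-to-operator mapping could be indexed programmatically, the sketch below assumes a simplified record format of node name, type (in or out), and operator name per line; the actual layout of fused_graph_kernel_0.txt may differ, so treat this only as a lookup pattern:

```python
def parse_kernel_mapping(text):
    """Index a node-to-operator mapping, assuming each record line is
    '<node_name> <in|out> <operator_name>' (an illustrative format,
    not the verified fused_graph_kernel_0.txt syntax)."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and headers
        node, direction, operator = parts
        mapping.setdefault(node, {"in": [], "out": []})
        if direction in ("in", "out"):
            mapping[node][direction].append(operator)
    return mapping
```

With such an index, the out operator of a DPU node gives the name used to locate the corresponding golden dump file from the quantizer.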
For kernel dpu_tf_inception_v1_0.elf of the TensorFlow Inception-v1 model, the mapping information for its input node input and output node InceptionV1_Logits_Conv2d_0c_1x1_Conv2D is as follows. For input node input, its out operator is input. For output node InceptionV1_Logits_Conv2d_0c_1x1_Conv2D, its out operator is InceptionV1_Logits_Conv2d_0c_1x1_Conv2D.
For the out type operator input, its corresponding text format dump file from the Vitis AI quantizer is input_aquant_int8.txt (_aquant_int8 is the added suffix), which can be found under decent_golden/dump_golden/dump_results_0/. Feed the Int8 type input data input_aquant_int8.txt into the DPU input node input. After compiling and running this DPU application, raw data for each DPU node will
be dumped into a folder like dump_2134 (number 2134 is the process ID). For the last DPU node, locate the DPU Int8 type running result within the file tf_inception_v1_0_InceptionV1_Logits_Conv2d_0c_1x1_Conv2D_out0.bin (the prefix tf_inception_v1_0_ is the kernel name, and the suffix out0 indicates that it is the first output tensor for this DPU node). For the last DPU
node InceptionV1_Logits_Conv2d_0c_1x1_Conv2D, use its out type operator InceptionV1_Logits_Conv2d_0c_1x1_Conv2D to find the golden output file from the quantizer. The quantizer may fuse operators while performing quantization for a TensorFlow model. For the Inception-v1 model, you can find the similarly named dump file InceptionV1_Logits_Conv2d_0c_1x1_BiasAdd_aquant_int8.bin (Conv2d and BiasAdd are two adjacent operators within the model; _aquant_int8 is the added suffix). Lastly, check whether the DPU output in tf_inception_v1_0_InceptionV1_Logits_Conv2d_0c_1x1_Conv2D_out0.bin and the quantizer output in InceptionV1_Logits_Conv2d_0c_1x1_BiasAdd_aquant_int8.bin are equal or not.
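This final comparison can be automated with a short sketch. The helper below is a hypothetical utility, not part of the Vitis AI package; it reads two raw Int8 dump files, checks them element by element, and summarizes any mismatch:

```python
import struct

def compare_int8_dumps(dpu_path, golden_path):
    """Compare two raw Int8 dump files and summarize mismatches
    (count and maximum absolute difference)."""
    with open(dpu_path, "rb") as f:
        dpu = f.read()
    with open(golden_path, "rb") as f:
        golden = f.read()
    if len(dpu) != len(golden):
        raise ValueError(f"size mismatch: {len(dpu)} vs {len(golden)}")
    # Interpret each byte as a signed 8-bit value before diffing.
    a = struct.unpack(f"{len(dpu)}b", dpu)
    b = struct.unpack(f"{len(golden)}b", golden)
    diffs = [abs(x - y) for x, y in zip(a, b)]
    mismatches = sum(1 for d in diffs if d)
    return {"equal": mismatches == 0,
            "mismatches": mismatches,
            "max_abs_diff": max(diffs) if diffs else 0}
```

A zero mismatch count corresponds to the "equal" outcome described above; a nonzero count, together with the index of the first differing element, narrows the investigation to a specific DPU node output.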
If they are the same, it can be confirmed that Inception-v1 runs over the DPU as expected. Otherwise, a potential issue exists with DPU execution; contact Xilinx and report a bug.