After deploying a quantized model, you may need to compare the simulation results on the CPU or GPU with the output values on the DPU.

You can use the dump_model API of vai_q_onnx to dump the simulation results of the quantized model:

import vai_q_onnx
# This function dumps the simulation results of the quantized model,
# including weights and activation results.
Arguments:
Quantized model: File path of the quantized model.
dump_data_reader: A data reader used for the dump. It generates inputs for the original model.
output_dir: String. The directory in which to save the dump results. Dump results are generated in output_dir after the function executes successfully.
Float dump option: Boolean. Determines whether to dump the float values of weights and activation results.
Note: The batch_size of the dump_data_reader should be set to 1 for DPU debugging.
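The batch size requirement in the note above can be met with a reader that hands out exactly one sample per call. The following is a minimal sketch under stated assumptions, not the library's reference usage. The class name SingleSampleReader and the helper run_dump are hypothetical. The get_next() method follows the convention of ONNX Runtime's CalibrationDataReader, which a real reader would likely subclass. Passing the model path as the first positional argument is an assumption; dump_data_reader and output_dir are the names used in the text.

```python
class SingleSampleReader:
    """Feeds one sample per call, so every batch has size 1.

    Hypothetical helper; a real reader would likely subclass
    onnxruntime.quantization.CalibrationDataReader.
    """

    def __init__(self, samples, input_name):
        # Each sample is expected to already carry a leading batch
        # dimension of 1, for example shape (1, 3, 224, 224).
        self._samples = iter(samples)
        self._input_name = input_name

    def get_next(self):
        # Return {input_name: sample}, or None when the data is exhausted.
        sample = next(self._samples, None)
        if sample is None:
            return None
        return {self._input_name: sample}


def run_dump(quantized_model_path, samples, input_name, output_dir):
    # Imported lazily so the reader above can be used without vai_q_onnx.
    import vai_q_onnx

    reader = SingleSampleReader(samples, input_name)
    # Assumption: the model path is the first positional argument.
    vai_q_onnx.dump_model(
        quantized_model_path,
        dump_data_reader=reader,
        output_dir=output_dir,
    )
```

Feeding a single sample at a time keeps each dumped activation tied to one input, which makes a later node-by-node comparison with DPU output easier to interpret.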

After successfully executing the command, the dump results are generated in the output_dir. Each quantized node's weights and activation results are saved separately in *.bin and *.txt formats.

In cases where the node output is not quantized, such as the softmax node, the float activation results are saved in *_float.bin and *_float.txt formats if the option "save_float" is set to True.
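The naming convention described above can be used to sort the dump files into quantized and float results. This is a minimal sketch that relies only on the file suffixes named in the text; the function name classify_dump_files is hypothetical, and the directory is searched recursively because the exact layout under output_dir is not assumed.

```python
from pathlib import Path


def classify_dump_files(output_dir):
    """Split dump files into quantized and float results by file name."""
    root = Path(output_dir)
    quantized, floats = [], []
    for path in sorted(root.rglob("*")):
        # Only *.bin and *.txt files are dump results.
        if not path.is_file() or path.suffix not in (".bin", ".txt"):
            continue
        name = path.relative_to(root).as_posix()
        # Float results carry a "_float" suffix before the extension.
        if path.stem.endswith("_float"):
            floats.append(name)
        else:
            quantized.append(name)
    return quantized, floats
```

A call such as classify_dump_files("./dump_results") returns two lists of relative file names, which gives a quick view of which nodes were dumped without quantization.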

The following table shows examples of the dump results.

Table 1. Example of Dumping Results
Batch Number | Quantized | Node Name                  | Saved files
1            | Yes       | resnet_v1_50_conv1         | *.bin, *.txt
2            | Yes       | resnet_v1_50_conv1_weights | *.bin, *.txt
2            | No        | resnet_v1_50_softmax       | *_float.bin, *_float.txt
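Once the dump files exist, the comparison with DPU output mentioned at the start of this section can be sketched as an element-by-element check of two *.bin files. This is an illustration under a loud assumption: both files are taken to hold raw signed 8-bit integers, one value per byte, which the text does not confirm. The function names load_int8 and compare_dumps are hypothetical, and how the DPU output file is obtained is outside the scope of this sketch.

```python
from array import array
from pathlib import Path


def load_int8(path):
    # Assumption: the file holds raw signed 8-bit values, one per byte.
    values = array("b")
    values.frombytes(Path(path).read_bytes())
    return values


def compare_dumps(sim_path, dpu_path):
    """Return (mismatch_count, max_abs_diff, first_mismatch_index)."""
    sim = load_int8(sim_path)
    dpu = load_int8(dpu_path)
    if len(sim) != len(dpu):
        raise ValueError(
            f"size mismatch: {len(sim)} values vs {len(dpu)} values"
        )
    mismatch_count = 0
    max_abs_diff = 0
    first_mismatch = None
    for index, (a, b) in enumerate(zip(sim, dpu)):
        if a != b:
            mismatch_count += 1
            max_abs_diff = max(max_abs_diff, abs(a - b))
            if first_mismatch is None:
                first_mismatch = index
    return mismatch_count, max_abs_diff, first_mismatch
```

A result of (0, 0, None) means the two files agree on every value. A nonzero mismatch count together with the first mismatching index points to where the simulation and the DPU start to diverge for that node.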