(Optional) Dumping the Simulation Results

Vitis AI User Guide (UG1414), Release 2.5 (English)
After deploying the quantized model, it is sometimes necessary to compare the simulation results on the CPU/GPU with the output values on the DPU. You can use the VitisQuantizer.dump_model API of vai_q_tensorflow2 to dump the simulation results with the quantized model.
from tensorflow_model_optimization.quantization.keras import vitis_quantize
quantized_model = keras.models.load_model('./quantized_model.h5')
vitis_quantize.VitisQuantizer.dump_model(model=quantized_model,
                                         dataset=dump_dataset,
                                         output_dir=dump_output_dir)
Note: The batch_size of the dump_dataset should be set to the same value as the batch size used on the target device for DPU debugging. It is recommended to use CPU simulation results for DPU debugging, because GPU float-value computation can be non-deterministic and produce slightly different results.
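As a minimal sketch of preparing a dump input that respects the batch-size note above (the array shape is illustrative, and passing a NumPy array as the dump dataset is an assumption, not taken from the guide):

```python
import numpy as np

# Hypothetical preprocessed input images; the shape (N, 224, 224, 3)
# is illustrative and not from the user guide.
images = np.random.rand(8, 224, 224, 3).astype(np.float32)

# Keep a single sample so the dump batch size matches a target DPU
# that runs with batch size 1; adjust to your device's batch size.
dump_dataset = images[:1]
print(dump_dataset.shape)  # (1, 224, 224, 3)
```

The same slicing applies whatever the device batch size is: slice the calibration-style input so each dumped batch lines up one-to-one with a DPU batch.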

Dump results are generated in ${dump_output_dir} after the command executes successfully. Results for the weights and activations of each layer are saved separately in that folder. For each quantized layer, results are saved in *.bin and *.txt formats. If a layer's output is not quantized (for example, the softmax layer), its float activation results are saved in *_float.bin and *_float.txt files. In the saved file names, the / symbol in layer names is replaced by _ for simplicity. Examples of dumped results are shown in the following table.
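A hedged sketch of comparing a dumped *.bin file against a DPU capture follows; the function name, file paths, and the int8 dtype are assumptions for illustration (quantized dumps are raw binary, while *_float.bin files would be read with dtype=np.float32 instead):

```python
import numpy as np

def compare_dumps(cpu_path, dpu_path, dtype=np.int8):
    # Load the raw CPU-simulation dump and the corresponding DPU
    # output capture as flat arrays of the same element type.
    cpu = np.fromfile(cpu_path, dtype=dtype)
    dpu = np.fromfile(dpu_path, dtype=dtype)
    assert cpu.size == dpu.size, "dump sizes differ"
    # Count element-wise mismatches between CPU simulation and DPU.
    return int(np.count_nonzero(cpu != dpu))
```

Called with the paired files for one layer (for example, a quant_*_conv1.bin dump and the matching DPU output), a return value of 0 means the CPU simulation and the DPU agree bit-exactly for that layer.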

Table 1. Example of Dumping Results

Batch No. | Quantized | Layer Name           | Weights                            | Biases                           | Activation
1         | Yes       | resnet_v1_50/conv1   | resnet_v1_50_conv1_kernel.bin/.txt | resnet_v1_50_conv1_bias.bin/.txt | resnet_v1_50_conv1.bin/.txt
2         | No        | resnet_v1_50/softmax | N/A                                | N/A                              | resnet_v1_50_softmax_float.bin/.txt

(The saved file names above are illustrative, reconstructed from the naming convention described earlier; the exact weight and bias suffixes depend on the layer's variable names.)
Note: The rounding mode in the DPU implementation is "HALF_UP" for all inputs and activations. Using a different rounding mode in your own implementation may lead to slight bit-level mismatches with the dump results.
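To illustrate why the rounding mode matters, a hedged sketch follows. It assumes HALF_UP means floor(x + 0.5) (ties round toward positive infinity); the DPU's actual fixed-point implementation is not shown here. NumPy's default np.round uses round-half-to-even, so the two disagree exactly on ties:

```python
import numpy as np

def round_half_up(x):
    # One common reading of HALF_UP: ties at .5 round toward
    # +infinity, i.e. floor(x + 0.5). Illustrative only, not the
    # DPU's actual fixed-point implementation.
    return np.floor(x + 0.5)

# np.round uses round-half-to-even ("banker's rounding"), so ties
# land on the nearest even value instead of always rounding up.
x = np.array([0.5, 1.5, 2.5, -0.5])
half_up = round_half_up(x)
half_even = np.round(x)
```

For the value 2.5, HALF_UP yields 3 while round-half-to-even yields 2: a one-bit difference that, propagated through many layers, accounts for the slight mismatches the note warns about.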