Take the PyTorch version as an example.
- Run the following command with --quant_mode calib to quantize the model:

  python quant_lstm.py --quant_mode calib --subset_len 1000
  When calibrating, the forward pass borrows the float evaluation flow to minimize code changes from the float script. If loss and accuracy messages are displayed at the end, ignore them.

  Note: Check the colorful log messages with the special keyword "NNDCT." If this quantization command runs successfully, two important files are generated in the ./quantize_result output directory:
  - Lstm_StandardLstmCell_layer_0_forward.py: the converted format model
  - quant_info.json: quantization steps for tensors (keep this file to evaluate the quantized model)
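The entry point of such a script can be sketched as follows. This is a minimal illustration, assuming the script uses the `torch_quantizer` API from `pytorch_nndct` with its `lstm` flag; the helper names (`build_float_lstm`, `evaluate`) and the input shape are hypothetical, and the actual `quant_lstm.py` shipped with the tool may differ.

```python
import argparse


def parse_args(argv=None):
    # CLI flags matching the commands shown in this section
    parser = argparse.ArgumentParser(description="LSTM quantization example")
    parser.add_argument("--quant_mode", choices=["calib", "test"], default="calib",
                        help="'calib' collects quantization steps; 'test' evaluates the quantized model")
    parser.add_argument("--subset_len", type=int, default=1000,
                        help="number of samples used for calibration/evaluation")
    return parser.parse_args(argv)


def run(args):
    # Hedged sketch of the quantization flow; requires the Vitis AI
    # pytorch_nndct package, so the import is kept local to this function.
    import torch
    from pytorch_nndct.apis import torch_quantizer

    model = build_float_lstm()        # hypothetical helper: load the float model
    sample = torch.randn(1, 100, 25)  # hypothetical input shape
    quantizer = torch_quantizer(args.quant_mode, model, (sample,), lstm=True)
    quant_model = quantizer.quant_model

    evaluate(quant_model, args.subset_len)  # hypothetical helper: reused float eval loop

    if args.quant_mode == "calib":
        # writes quant_info.json into ./quantize_result
        quantizer.export_quant_config()
    else:
        # writes the Xmodel into ./quantize_result/xmodel
        quantizer.export_xmodel()
```

The same script serves both steps: only the value of `--quant_mode` decides whether calibration data is collected or the quantized model is evaluated and exported.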
- To evaluate the quantized model, run the following command:

  python quant_lstm.py --quant_mode test --subset_len 1000
- The accuracy displayed after the command executes successfully is the correct accuracy for the quantized model. The Xmodel file for the compiler is generated in the output directory ./quantize_result/xmodel:
  - Lstm_StandardLstmCell_layer_0_forward_int.xmodel: the deployed model
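After both steps, a quick sanity check is to confirm the expected artifacts exist. The helper below is not part of the tool; it is a small convenience sketch that assumes the default ./quantize_result output directory and the file names listed above.

```python
from pathlib import Path

# Output files named in this section; the xmodel lives in the xmodel/ subdirectory.
EXPECTED = [
    "Lstm_StandardLstmCell_layer_0_forward.py",   # converted format model (after calib)
    "quant_info.json",                            # quantization steps for tensors (after calib)
    "xmodel/Lstm_StandardLstmCell_layer_0_forward_int.xmodel",  # deployed model (after test)
]


def missing_outputs(output_dir="quantize_result"):
    """Return the expected artifacts that are not present under output_dir."""
    root = Path(output_dir)
    return [name for name in EXPECTED if not (root / name).exists()]
```

If `missing_outputs()` returns a non-empty list after the test step, re-check the NNDCT log messages from the corresponding command.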