vai_q_pytorch Fast Finetuning

Vitis AI User Guide (UG1414)

Document ID: UG1414

Release Date: 2021-12-13

Version: 1.4.1 English

Quantization generally causes only a small accuracy loss, but for some networks, such as MobileNets, the loss can be large. In that case, try fast finetuning first. If fast finetuning still does not yield satisfactory results, use quantize finetuning to further improve the accuracy of the quantized model.

The AdaQuant algorithm¹ uses a small set of unlabeled data to calibrate the activations and also finetune the weights. The Vitis AI quantizer implements this algorithm and calls it "fast finetuning" or "advanced calibration." Although slightly slower, fast finetuning can achieve better accuracy than quantize calibration. As with quantize finetuning, each run of fast finetuning produces a different result.
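To make the idea of finetuning weights against unlabeled data more concrete, the following is a toy sketch in plain Python, not the Vitis AI implementation. It assumes a single linear unit, a fixed 4-bit step size, and a greedy coordinate search that nudges each integer weight code by one step whenever that lowers the output error on the calibration inputs. The greedy search is a stand-in chosen for brevity; it is not the optimizer described in the AdaQuant paper, and activation calibration is left out entirely. All function names here are hypothetical.

```python
import random


def layer_output(weights, x):
    """Dot product of one weight vector with one input vector."""
    return sum(w * xi for w, xi in zip(weights, x))


def reconstruction_mse(float_w, quant_w, calib_inputs):
    """Mean squared gap between float and quantized outputs (no labels needed)."""
    total = 0.0
    for x in calib_inputs:
        diff = layer_output(float_w, x) - layer_output(quant_w, x)
        total += diff * diff
    return total / len(calib_inputs)


def round_to_codes(weights, step, bits):
    """Plain round-to-nearest quantization to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    return [max(qmin, min(qmax, round(w / step))) for w in weights]


def finetune_codes(float_w, codes, step, bits, calib_inputs, sweeps=3):
    """Greedy search: move one code by +/-1 if the output error drops."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    codes = list(codes)
    best = reconstruction_mse(float_w, [c * step for c in codes], calib_inputs)
    for _ in range(sweeps):
        for i in range(len(codes)):
            for delta in (-1, 1):
                trial = codes[i] + delta
                if not qmin <= trial <= qmax:
                    continue
                candidate = codes[:i] + [trial] + codes[i + 1:]
                err = reconstruction_mse(
                    float_w, [c * step for c in candidate], calib_inputs)
                if err < best:
                    codes, best = candidate, err
    return codes, best


# Synthetic weights and a small unlabeled calibration set (illustrative only).
random.seed(0)
float_w = [random.uniform(-1.0, 1.0) for _ in range(8)]
calib_inputs = [[random.uniform(0.0, 1.0) for _ in range(8)] for _ in range(32)]

bits = 4
step = max(abs(w) for w in float_w) / (2 ** (bits - 1) - 1)

rounded = round_to_codes(float_w, step, bits)
rounded_err = reconstruction_mse(float_w, [c * step for c in rounded], calib_inputs)
tuned, tuned_err = finetune_codes(float_w, rounded, step, bits, calib_inputs)

print(f"round-to-nearest MSE: {rounded_err:.6f}")
print(f"after finetuning MSE: {tuned_err:.6f}")
```

Because a change is kept only when it lowers the error, the tuned error can never exceed the round-to-nearest error on the calibration set. How well that carries over to unseen data depends on how representative the calibration inputs are.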

Fast finetuning does not train the model and needs only a limited number of iterations. For classification models on the ImageNet dataset, 1000 images are enough. It requires only small modifications to the model evaluation script, and there is no need to set up an optimizer for training. You must provide a function that runs the model's forward iterations; the quantizer calls this function during fast finetuning. Re-calibrating with the original inference code is recommended.
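As a rough illustration of the forward-iteration function, here is a minimal PyTorch sketch of an evaluation loop with the same argument order as the `evaluate` callable passed to `quantizer.fast_finetune` in this section's example. The body, the return values, and the tiny stand-in model and data are assumptions made for illustration; they are not copied from the Vitis AI example script. Whether to wrap the loop in `torch.no_grad()` when it is handed to the quantizer is not stated here, so the sketch leaves that out.

```python
import torch
from torch import nn


def evaluate(model, data_loader, loss_fn):
    """Run forward passes only; return (average loss, top-1 accuracy in percent)."""
    model.eval()  # inference behavior for layers such as dropout and batch norm
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in data_loader:
        outputs = model(images)  # forward pass; no optimizer, no backward pass
        total_loss += loss_fn(outputs, labels).item() * labels.size(0)
        correct += (outputs.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, 100.0 * correct / seen


# Stand-in model and data so the sketch runs on its own (hypothetical shapes).
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
batches = [(torch.randn(4, 3, 8, 8), torch.randint(0, 10, (4,))) for _ in range(3)]

avg_loss, top1 = evaluate(model, batches, nn.CrossEntropyLoss())
print(f"loss={avg_loss:.4f} top1={top1:.2f}%")
```

In a real script the stand-in model and batches would be replaced by the quantized model and a data loader over the small finetuning subset, and the function and its argument tuple would be handed to the quantizer unchanged.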

A complete example is available in the open-source example code.

# Fast finetune the model, or load finetuned parameters before testing
if fast_finetune:
    # Small data subset used for fast finetuning
    ft_loader, _ = load_data(
        subset_len=1024,
        train=False,
        batch_size=batch_size,
        sample_method=None,
        data_dir=args.data_dir,
        model_name=model_name)
    if quant_mode == 'calib':
        # The quantizer calls evaluate(quant_model, ft_loader, loss_fn) internally
        quantizer.fast_finetune(evaluate, (quant_model, ft_loader, loss_fn))
    elif quant_mode == 'test':
        # Reuse the parameters saved by an earlier fast finetuning run
        quantizer.load_ft_param()

For parameter finetuning and re-calibration of this ResNet18 example, run the following command:

python resnet18_quant.py --quant_mode calib --fast_finetune

To test the accuracy of the finetuned quantized model, run the following command:

python resnet18_quant.py --quant_mode test --fast_finetune

Note:

¹ Itay Hubara et al., "Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming," arXiv:2006.10518, 2020.