vai_q_pytorch Fast Finetuning - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

Generally, there is a small accuracy loss after quantization, but for some networks, such as MobileNets, the accuracy loss can be large. In this situation, try fast finetuning first. If fast finetuning still does not yield satisfactory results, quantize finetuning can be used to further improve the accuracy of the quantized model.

With a small set of unlabeled data, the AdaQuant algorithm [1] not only calibrates the activations but also finetunes the weights. It is similar to quantize calibration in that it needs only a small unlabeled dataset, but it goes further and adjusts the model. The Vitis AI quantizer implements this algorithm and calls it "fast finetuning" or "advanced calibration." Though slightly slower, fast finetuning can achieve better performance than quantize calibration. As with quantize finetuning, each run of fast finetuning produces a different result.

Fast finetuning is not real training of the model, and it needs only a limited number of iterations. For classification models on the ImageNet dataset, 1,000 images are enough. Fast finetuning requires only some modification of the model evaluation script; there is no need to set up an optimizer for training. To use fast finetuning, a function for model forwarding iteration is needed; it is called during fast finetuning. Re-calibration with the original inference code is highly recommended.
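
Below is a minimal sketch of such a forward-iteration function. Its arguments match the (model, data loader, loss function) tuple passed to fast_finetune in the snippet that follows; the accuracy bookkeeping is a simplified assumption, not the exact code of the example script.

def evaluate(model, val_loader, loss_fn):
    # Forward-iteration function handed to quantizer.fast_finetune().
    # It only runs inference over the loader; no optimizer is set up.
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    for images, labels in val_loader:
        outputs = model(images)
        total_loss += loss_fn(outputs, labels).item()
        correct += (outputs.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return total_loss / max(total, 1), correct / max(total, 1)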

You can find a complete example in the open source resnet18_quant.py script:

# Fast finetune the model, or load finetuned parameters before testing
if fast_finetune:
    ft_loader, _ = load_data(
        subset_len=1024,
        train=False,
        batch_size=batch_size,
        sample_method=None,
        data_dir=args.data_dir,
        model_name=model_name)
    if quant_mode == 'calib':
        # Run fast finetuning with the forward-iteration function
        quantizer.fast_finetune(evaluate, (quant_model, ft_loader, loss_fn))
    elif quant_mode == 'test':
        # Load the parameters produced by a previous fast finetuning run
        quantizer.load_ft_param()
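
In the snippet above, quantizer and quant_model come from the standard vai_q_pytorch setup earlier in the script. The following is a minimal sketch of that setup for ResNet18; the checkpoint path, input shape, and loss function are illustrative assumptions.

import torch
from torchvision.models import resnet18
from pytorch_nndct.apis import torch_quantizer

model = resnet18()
model.load_state_dict(torch.load('resnet18.pth'))  # assumed checkpoint path
dummy_input = torch.randn(1, 3, 224, 224)          # assumed input shape
quant_mode = 'calib'                               # or 'test' (see --quant_mode)
quantizer = torch_quantizer(quant_mode, model, (dummy_input,))
quant_model = quantizer.quant_model                # quantized model to forward
loss_fn = torch.nn.CrossEntropyLoss()              # assumed loss function
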
For parameter finetuning and re-calibration of this ResNet18 example, run the following command:
python resnet18_quant.py --quant_mode calib --fast_finetune
To test the accuracy of the fast-finetuned quantized model, run the following command:
python resnet18_quant.py --quant_mode test --fast_finetune
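
The --quant_mode and --fast_finetune flags map onto the quant_mode and fast_finetune variables used in the snippet above. A sketch of how such argument parsing might look (an assumption, not the exact parsing code of resnet18_quant.py):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--quant_mode', default='calib', choices=['calib', 'test'],
                    help='calib: calibrate/fast finetune, test: evaluate quantized model')
parser.add_argument('--fast_finetune', action='store_true',
                    help='apply fast finetuning during calibration')
parser.add_argument('--data_dir', default='/path/to/imagenet')  # assumed default
args = parser.parse_args()

fast_finetune = args.fast_finetune
quant_mode = args.quant_mode
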
Note:
  1. Itay Hubara et al., Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming, arXiv:2006.10518, 2020.