vai_q_pytorch Quantize Finetuning Requirements - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

Generally, there is a small accuracy loss after quantization, but for some networks, such as MobileNets, the loss can be large. In this situation, try fast finetune first. If fast finetune still does not yield satisfactory results, quantize finetuning can be used to further improve the accuracy of the quantized model.

The quantize finetuning APIs have some requirements for the model to be trained.

  1. All operations to be quantized must be instances of torch.nn.Module, rather than torch functions or Python operators. For example, it is common to use ‘+’ to add two tensors in PyTorch; however, this is not supported in quantize finetuning. Replace ‘+’ with pytorch_nndct.nn.modules.functional.Add. Operations that need replacement are listed in the following table.
    Table 1. Operation-Replacement Mapping
    Operation   Replacement
    ---------   -----------
    +           pytorch_nndct.nn.modules.functional.Add
    -           pytorch_nndct.nn.modules.functional.Sub
    torch.add   pytorch_nndct.nn.modules.functional.Add
    torch.sub   pytorch_nndct.nn.modules.functional.Sub
    Important: A module to be quantized can only be called once in the forward pass. If the same operation is needed at multiple call sites, create a separate module instance for each one.
  2. Use pytorch_nndct.nn.QuantStub and pytorch_nndct.nn.DeQuantStub at the beginning and end of the network to be quantized. The network can be the complete model or a sub-network.
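The two requirements above can be sketched in one small network: every elementwise operation is wrapped in a module, and the quantized region is bracketed by quant/dequant stubs. Because pytorch_nndct may not be installed in every environment, this sketch uses plain torch.nn.Module stand-ins for QuantStub, DeQuantStub, and Add; in a real Vitis AI flow, import the real classes from pytorch_nndct.nn and pytorch_nndct.nn.modules.functional instead.

```python
import torch
import torch.nn as nn

# Stand-ins for the Vitis AI classes, so this sketch runs without
# pytorch_nndct installed. In a real flow use:
#   from pytorch_nndct.nn import QuantStub, DeQuantStub
#   from pytorch_nndct.nn.modules import functional  # functional.Add
class QuantStub(nn.Module):
    def forward(self, x):
        return x  # the real stub marks the entry of the quantized region

class DeQuantStub(nn.Module):
    def forward(self, x):
        return x  # the real stub marks the exit of the quantized region

class Add(nn.Module):
    def forward(self, a, b):
        return a + b  # module form of '+', so the op can be quantized

class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # beginning of the quantized network
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(3, 8, 3, padding=1)
        self.skip_add = Add()         # replaces 'out1 + out2'
        self.dequant = DeQuantStub()  # end of the quantized network

    def forward(self, x):
        x = self.quant(x)
        # Each module instance (conv1, conv2, skip_add) is called
        # exactly once, as the Important note above requires.
        out = self.skip_add(self.conv1(x), self.conv2(x))
        return self.dequant(out)

net = SmallNet().eval()
y = net(torch.randn(1, 3, 16, 16))
print(y.shape)  # torch.Size([1, 8, 16, 16])
```

Note that conv1 and conv2 are separate instances even though they have the same shape; reusing a single instance twice in forward would violate the single-call requirement.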