Guidelines for Better Training Results - 3.5 English

Vitis AI User Guide (UG1414)

The following are some tips for getting better training results:

  • If possible, load the pre-trained floating-point weights as initial values for quantization aware training. Training from scratch with random initial values is possible, but it makes training harder and slower.
  • If pre-trained floating-point weights are loaded, use different initial learning rates and learning rate decay strategies for the network parameters and the quantizer parameters. In general, set a small learning rate for the network parameters and a larger one for the quantizer parameters, for example:
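    import torch

    # qat_processor is assumed to be a QAT processor object created beforehand.
    model = qat_processor.trainable_model()
    # Separate parameter groups so that each can use its own learning rate.
    param_groups = [{
        'params': model.quantizer_parameters(),
        'lr': 1e-2,
        'name': 'quantizer'
    }, {
        'params': model.non_quantizer_parameters(),
        'lr': 1e-5,
        'name': 'weight'
    }]
    optimizer = torch.optim.Adam(param_groups)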
  • When choosing an optimizer, avoid torch.optim.SGD, which can prevent training from converging. AMD recommends torch.optim.Adam, torch.optim.RMSprop, or their variants.
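
The tips above call for different learning rate decay strategies for the two parameter groups but do not name any. The following is a minimal sketch of one possible arrangement, written in plain Python so that it runs without a model. The schedules chosen here (step decay for the quantizer parameters, cosine decay for the network weights), the 20-epoch training length, and the names BASE_LRS, TOTAL_EPOCHS, quantizer_factor, weight_factor, and lr_at are all illustrative assumptions, not AMD recommendations.

```python
import math

# Initial learning rates taken from the param_groups example in the tips.
BASE_LRS = {"quantizer": 1e-2, "weight": 1e-5}
TOTAL_EPOCHS = 20  # assumed training length, for illustration only


def quantizer_factor(epoch: int) -> float:
    """Step decay: divide the quantizer learning rate by 10 every 5 epochs."""
    return 0.1 ** (epoch // 5)


def weight_factor(epoch: int) -> float:
    """Cosine decay: 1.0 at epoch 0, falling to 0.0 at TOTAL_EPOCHS."""
    return 0.5 * (1.0 + math.cos(math.pi * epoch / TOTAL_EPOCHS))


def lr_at(epoch: int) -> dict:
    """Return the effective learning rate of each group at a given epoch."""
    return {
        "quantizer": BASE_LRS["quantizer"] * quantizer_factor(epoch),
        "weight": BASE_LRS["weight"] * weight_factor(epoch),
    }


if __name__ == "__main__":
    # Show how the two groups decay independently of each other.
    for epoch in (0, 5, 10, 20):
        print(epoch, lr_at(epoch))
```

In a PyTorch training loop, multiplier functions like these can be passed to torch.optim.lr_scheduler.LambdaLR as a list, one function per parameter group in the same order as param_groups, so that each group follows its own schedule.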