vitis_quantize.VitisQuantizer.get_qat_model - 3.5 English

Vitis AI User Guide (UG1414)

Document ID
UG1414
Release Date
2023-09-28
Version
3.5 English

This function gets the quantization-aware training (QAT) model from the float model:


vitis_quantize.VitisQuantizer.get_qat_model(
    init_quant=False,
    calib_dataset=None,
    calib_batch_size=None,
    calib_steps=None,
    train_with_bn=False,
    freeze_bn_delay=-1)
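A typical end-to-end flow might look like the sketch below. The model and dataset names (float_model, calib_dataset, train_dataset) and the compile/fit settings are placeholders, and the import path follows the usual Vitis AI TensorFlow 2 packaging; adapt them to your environment.

```python
# Sketch of a typical QAT flow (hypothetical model/dataset names).
from tensorflow_model_optimization.quantization.keras import vitis_quantize

quantizer = vitis_quantize.VitisQuantizer(float_model)

# Run an initial PTQ pass so the quantizer parameters start from a
# reasonable state, then fine-tune with QAT.
qat_model = quantizer.get_qat_model(
    init_quant=True,
    calib_dataset=calib_dataset,
    calib_batch_size=32,
    calib_steps=100)

qat_model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
qat_model.fit(train_dataset, epochs=5)

# After training, fuse the trained bn layers and convert for deployment.
deploy_model = quantizer.get_deploy_model(qat_model)
```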

Arguments

init_quant
A bool object indicating whether to run an initial post-training quantization (PTQ) before QAT. Running an initial PTQ yields a better starting state for the quantizer parameters, especially for the 8bit_tqt strategy; without it, the training might not converge.
calib_dataset
A tf.data.Dataset, keras.utils.Sequence, or numpy.ndarray object containing the representative dataset for calibration. It must be set when init_quant is True. You can use all or part of eval_dataset, train_dataset, or other datasets as calib_dataset.
calib_steps
An int object. The total number of steps for the initial PTQ. The default value is None. If calib_dataset is a tf.data.Dataset, generator, or keras.utils.Sequence instance and calib_steps is None, calibration runs until the dataset is exhausted. This argument is not supported with array inputs.
calib_batch_size
An int object. The number of samples per batch for the initial PTQ. If calib_dataset is a tf.data.Dataset, generator, or keras.utils.Sequence instance, the batch size is controlled by the dataset itself. If calib_dataset is a numpy.ndarray object, the default batch size is 32.
train_with_bn
A bool object indicating whether to keep bn layers during QAT. If set to True, the bn parameters are updated during quantize-aware training, which helps the model converge. The trained bn layers are then fused into the preceding convolution-like layers by the get_deploy_model() function. This option has no effect if the float model has no bn layers. The default value is False.
freeze_bn_delay
An int object. The number of training steps before the bn parameters are frozen. After the delayed steps, the model switches to the inference bn parameters to avoid instability in training. It takes effect only when train_with_bn is True. The default value is -1, which means the bn parameters are never frozen.
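The interaction between calib_steps, calib_batch_size, and the dataset type described above can be sketched with a plain-Python stand-in. This is a hypothetical helper for illustration, not the actual Vitis AI implementation:

```python
def run_calibration(calib_dataset, calib_steps=None, calib_batch_size=None):
    """Hypothetical stand-in illustrating the documented semantics:
    iterable datasets control their own batching and, when calib_steps
    is None, are consumed until exhausted; array inputs use a default
    batch size of 32 and do not support calib_steps."""
    if hasattr(calib_dataset, 'shape'):
        # Array-like input: batch it ourselves; calib_steps is unsupported.
        if calib_steps is not None:
            raise ValueError('calib_steps is not supported for array inputs')
        batch_size = calib_batch_size or 32
        n = calib_dataset.shape[0]
        return [calib_dataset[i:i + batch_size] for i in range(0, n, batch_size)]
    # Dataset / generator / Sequence: batching comes from the dataset itself.
    batches = []
    for step, batch in enumerate(calib_dataset):
        if calib_steps is not None and step >= calib_steps:
            break  # stop after calib_steps batches
        batches.append(batch)
    return batches
```

With calib_steps set, calibration stops after that many batches; with calib_steps=None, the loop simply runs until the dataset is exhausted.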