Configure the Quantize Strategy - 2.5 English

Vitis AI User Guide (UG1414)


We provide some default quantize strategies, but users sometimes need to modify the quantize configuration for different targets or to get better performance. For example, some target devices need biases quantized to 32-bit, and some need only part of the model quantized. This section shows how to configure the quantizer to meet such needs.

Quantize Strategy

The three main configurable parts of the quantize tool are its work pipeline, which parts of the model are quantized, and how they are quantized. We define all of these things in the quantize_strategy. Internally, each quantize_strategy is a JSON file containing the following configurations:
  • Pipeline configurations: these control the work pipeline of the quantize tool, including optimizations performed during quantization, e.g. whether to fold Conv2D + BatchNorm layers, whether to perform the Cross-Layer-Equalization algorithm, and so on. They are further divided into optimize_pipeline_config, quantize_pipeline_config, refine_pipeline_config, and finalize_pipeline_config.
  • Quantize operation configurations: these control which layer types are quantizable, where to insert the quantize ops, and what kind of quantize op is inserted. They include layer-specific configurations and user-defined global configurations.
Below is an example configuration for the Conv2D layers:
  "layer_type": "tensorflow.keras.layers.Conv2D",
  "quantizable_weights": ["kernel"],
  "weight_quantizers": [
    "quantizer_type": "Pof2SQuantizer",
    "quantizer_params": {"bit_width": 8,"method":0, "round_mode": 1, "symmetry": true, "per_channel": true, "channel_axis": -1, "narrow_range": False}
  "quantizable_biases": ["bias"],
  "bias_quantizers": [
    "quantizer_type": "Pof2SQuantizer",
    "quantizer_params": {"bit_width": 8,"method":0, "round_mode": 1, "symmetry": true, "per_channel": false, "channel_axis": -1, "narrow_range": False}
  "quantizable_activations": ["activation"],
  "activation_quantizers": [
    "quantizer_type": "FSQuantizer",
    "quantizer_params": {"bit_width": 8, "method":2, "method_percentile":99.9999, "round_mode": 1, "symmetry": true, "per_channel": false, "channel_axis": -1}
As you can see, with this quantize configuration we quantize the weights, bias, and activations of the Conv2D layer. The weights and bias use `Pof2SQuantizer` (power-of-2 scale quantizer) and the activations use `FSQuantizer` (float scale quantizer). Different quantizers can be applied to different objects within one layer.
Note: The quantizer here in the configurations means the quantize operation applied to each object. It consumes a float tensor and outputs a quantized tensor. Please note that the quantization is 'fake', which means that the input is quantized to an integer and then de-quantized back to float.
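The fake-quantize behavior described in the note can be sketched in a few lines. This is a minimal illustration of power-of-2 scale fake quantization, not the Vitis AI implementation; the function name and the fixed scale exponent are assumptions for the example.

```python
import numpy as np

def fake_quantize_pof2s(x, bit_width=8, scale_pow=4):
    """Illustrative power-of-2 scale fake quantization (not the Vitis AI
    implementation): quantize to the integer grid, then de-quantize to float."""
    scale = 2.0 ** scale_pow                      # power-of-2 scale factor
    qmin = -(2 ** (bit_width - 1))                # e.g. -128 for 8-bit
    qmax = 2 ** (bit_width - 1) - 1               # e.g. +127 for 8-bit
    q = np.clip(np.round(x * scale), qmin, qmax)  # 'real' quantization to int
    return q / scale                              # de-quantization back to float

# Outputs snap to multiples of 1/16; values beyond the 8-bit range saturate.
out = fake_quantize_pof2s(np.array([0.1234, -0.5, 3.9]))
```

The output stays a float tensor, which is why fake-quantized models can still be trained and evaluated with ordinary float operations.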

Using Built-in Quantize Strategy

Users can use dump_quantize_strategy to get the JSON file of the current quantize strategy. To keep things simple, we provide four types of built-in quantize strategies for common use cases, which users can extend or override as needed:
  • pof2s: power-of-2 scale quantization, mainly used for DPU targets now. Default quantize strategy of the quantizer.
  • pof2s_tqt: power-of-2 scale quantization with trained thresholds, mainly used for QAT in DPU now.
  • fs: float scale quantization, mainly used for devices supporting floating-point calculation, such as CPU/GPUs.
  • fsx: float scale quantization that quantizes more operation types than fs (e.g. biases and activations), mainly used for devices that support raw integer arithmetic.
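The core difference between the float scale (fs) and power-of-2 scale (pof2s) families above can be sketched as a scale-selection choice. This is an illustrative example under simple symmetric-quantization assumptions, not the Vitis AI scale-selection code; the function name is hypothetical.

```python
import numpy as np

def choose_scales(absmax, bit_width=8):
    """Illustrative sketch (not the Vitis AI implementation) of the scale
    choice behind the 'fs' and 'pof2s' strategies for symmetric quantization."""
    qmax = 2 ** (bit_width - 1) - 1                    # 127 for 8-bit
    fs_scale = absmax / qmax                           # float scale: any positive real
    pof2s_scale = 2.0 ** np.ceil(np.log2(fs_scale))    # constrain to a power of 2
    return fs_scale, pof2s_scale

fs_scale, pof2s_scale = choose_scales(absmax=6.0)
# fs_scale uses the full 8-bit range exactly; pof2s_scale rounds up to the
# nearest power of 2, which hardware can apply with a simple bit shift.
```

Power-of-2 scales trade a little precision for cheap shift-based rescaling on integer hardware such as the DPU, while float scales fit devices that already do floating-point arithmetic.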

Users can switch between the built-in quantize strategies by setting the quantize_strategy argument in the constructor of VitisQuantizer. Moreover, we provide two handy ways to configure the quantize strategy.

Configure by kwargs in VitisQuantizer.quantize_model()

This is an easy way for users who need to override the default pipeline configurations or make global modifications to the quantize operations. The kwargs here is a dict object whose keys match the quantize configurations in the JSON file. See vitis_quantize.VitisQuantizer.quantize_model for more information about the available keys.

The example code below shows how to use it.

from tensorflow_model_optimization.quantization.keras import vitis_quantize
model = tf.keras.models.load_model('float_model.h5')
quantizer = vitis_quantize.VitisQuantizer(model)
# calib_dataset is a representative dataset used for calibration
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset, input_layers=['conv2'],
                                           bias_bit=32, activation_bit=32, weight_per_channel=True)

In this example, we configure the quantizer to quantize only part of the model: layers before conv2 are neither optimized nor quantized. Moreover, we quantize all activations and biases to 32-bit instead of 8-bit, and use per-channel quantization for all weights.

Configure by VitisQuantizer.set_quantize_strategy()

For advanced users who want full control of the quantize tool, we provide this API to set a new quantize strategy JSON file. Users can first dump the current configurations to a JSON file and make modifications to it. This allows users to override the default configurations, make more fine-grained quantizer configurations, or extend the quantize config to make more layer types quantizable. The user can then set the new JSON file on the quantizer to apply these modifications.

The example code below shows how to do it.

quantizer = vitis_quantize.VitisQuantizer(model)
# Dump the current quantize strategy
quantizer.dump_quantize_strategy(dump_file='my_quantize_strategy.json', verbose=0)

# Make modifications to the dumped file 'my_quantize_strategy.json'

# Then, set the modified JSON on the quantizer and do the quantization
quantizer.set_quantize_strategy(new_quantize_strategy='my_quantize_strategy.json')
quantized_model = quantizer.quantize_model(calib_dataset=calib_dataset)
Note: verbose is an int type argument that controls the verbosity of the dumped JSON file. A greater verbose value dumps a more detailed quantize strategy; setting verbose to a value greater than or equal to 2 dumps the full quantize strategy.
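The "make modifications" step in the workflow above can be done with any JSON tool. Below is a minimal sketch using Python's standard json module; the strategy keys and file name shown are illustrative assumptions, not the exact schema produced by dump_quantize_strategy.

```python
import json

# Stand-in for a dumped strategy file; the keys below are illustrative
# assumptions, not the exact schema dumped by dump_quantize_strategy.
strategy = {"quantize_registry_config": {"user_quantize_config": {"bias_bit": 8}}}
with open('my_quantize_strategy.json', 'w') as f:
    json.dump(strategy, f)

# Load the dumped strategy, widen the bias bit width, and write it back,
# ready to be passed to set_quantize_strategy.
with open('my_quantize_strategy.json') as f:
    loaded = json.load(f)
loaded["quantize_registry_config"]["user_quantize_config"]["bias_bit"] = 32
with open('my_quantize_strategy.json', 'w') as f:
    json.dump(loaded, f, indent=2)
```

Editing the dumped file programmatically rather than by hand keeps the JSON well-formed, which matters because a malformed strategy file will be rejected when it is set back on the quantizer.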