TensorFlow 1.x - 3.5 English

Vitis AI User Guide (UG1414)

Document ID
UG1414
Release Date
2023-06-29
Version
3.5 English

Quantization API

def quantize(
    input_frozen_graph="",
    input_nodes="",
    input_shapes="",
    output_nodes="",
    input_fn="",
    method=1,
    calib_iter=100,
    output_dir="./quantize_results",
    **kargs)
This function invokes the vai_q_tensorflow command-line tool in WeGO TensorFlow r1.15 and converts the input floating-point model to a fixed-point model for accelerated DPU deployment. To remain fully compatible with the native vai_q_tensorflow quantizer, all parameters received by this API are forwarded directly to the vai_q_tensorflow command-line tool. The function returns a quantized GraphDef object, or None on failure.
Note: Currently, only post-training quantization (PTQ) is supported for on-the-fly quantization in WeGO. For more information on fast fine-tuning and quantization-aware training (QAT), see vai_q_tensorflow Quantization Aware Training.
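For illustration, the sketch below shows how the parameters described in the following section fit together in a single call. The model path, node names, and calibration function name are hypothetical placeholders, and the example assumes quantize has been imported from the WeGO TensorFlow module.

```python
# Sketch of an on-the-fly PTQ call (all model-specific values below are
# hypothetical placeholders; replace them with your own model's details).
quantize_args = dict(
    input_frozen_graph="resnet50_frozen.pb",          # hypothetical .pb path
    input_nodes="input_tensor",                       # start of the subgraph
    input_shapes="?,224,224,3",                       # ? = unknown batch size
    output_nodes="resnet_v1_50/predictions/Softmax",  # end of the subgraph
    input_fn="my_input_fn.calib_input",               # module_name.fn_name
    method=1,                                         # min-diffs quantization
    calib_iter=100,
    output_dir="./quantize_results",
)
# Every argument is forwarded to vai_q_tensorflow; the call returns a
# quantized GraphDef, or None on failure:
# quantized_graph_def = quantize(**quantize_args)
```

Because all parameters are forwarded verbatim to vai_q_tensorflow, the string formats (comma-separated node lists, colon-separated shape lists) follow the native quantizer's conventions.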

Parameters

input_frozen_graph
string: the path to the input frozen graph (.pb file) (default: )
input_nodes
string: the comma-separated name list of input nodes of the subgraph to be quantized. Used together with output_nodes. When generating the model for deployment, only the subgraph between input_nodes and output_nodes is included. Set it to the beginning of the main body of the model to quantize, such as the nodes after data pre-processing and augmentation. (default: )
input_shapes
string: the comma-separated shape list of input_nodes. Each node's shape must be 4-dimensional, with dimensions separated by commas, for example, 1,224,224,3. An unknown batch size is supported, for example, ?,224,224,3. In the case of multiple input_nodes, assign the shape list of each node, separated by a colon (:), for example, ?,224,224,3:?,300,300,1. (default: )
output_nodes
string: the comma-separated name list of output nodes of the subgraph to be quantized. Used together with input_nodes. When generating the model for deployment, only the subgraph between input_nodes and output_nodes is included. Set it to the end of the main body of the model to quantize, such as the nodes before post-processing. (default: )
input_fn
string: the Python-importable function that provides the input data. The format is module_name.input_fn_name, for example, my_input_fn.input_fn. The input_fn should take an int object as input, indicating the calibration step, and should return a dict (placeholder_node_name : numpy.Array) object on each call, which is fed into the model's placeholder nodes. (default: )
method
int32: {0,1,2}, default: 1. The method for quantization. The options are:
  • 0: non-overflow method. Ensures no values saturate during quantization. It may produce poor results when outliers are present.
  • 1: min-diffs method. Allows saturation of large values during quantization to achieve smaller quantization errors. This method is slower than method 0 but more tolerant of outliers.
  • 2: min-diffs method with a strategy for depthwise convolutions. Allows saturation of large values during quantization to achieve smaller quantization errors. Applies a special strategy to depthwise weights and method 1 to normal weights and activations. This method is slower than method 0 but more tolerant of outliers.
calib_iter
int32: the number of calibration iterations. The total number of images used for calibration = calib_iter * batch_size. (default: 100)
output_dir
string: the directory to save the quantization results (default: ./quantize_results).
Note: For more information on the other parameters accepted through **kargs, see vai_q_tensorflow Usage.
Note: For more information on the on-the-fly quantization examples for WeGO TensorFlow 1.x, see examples.
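As a concrete illustration of the input_fn contract described above, the sketch below feeds random data. The placeholder node name input_tensor and the batch size are hypothetical; a real calibration function would load and preprocess actual images from a calibration dataset.

```python
import numpy as np

BATCH_SIZE = 8  # hypothetical calibration batch size


def calib_input(iter_num):
    # Called once per calibration step; `iter_num` is the step index.
    # Must return a dict mapping placeholder node names to numpy arrays.
    # Random data is used here only to show the expected shapes and dtypes;
    # a real input_fn would load BATCH_SIZE preprocessed calibration images.
    images = np.random.rand(BATCH_SIZE, 224, 224, 3).astype(np.float32)
    return {"input_tensor": images}
```

Referenced as input_fn="my_input_fn.calib_input" (with this code saved as my_input_fn.py), and with calib_iter=100, calibration would consume 100 * 8 = 800 images in total.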