Preparing the Float Model and Related Input Files - 1.3 English

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2021-02-03
Version: 1.3 English

Before running vai_q_tensorflow, prepare the frozen TensorFlow inference model in floating-point format and a calibration set. The required files are listed in the following table.

Table 1. Input Files for vai_q_tensorflow
No. | Name | Description
1 | frozen_graph.pb | Floating-point frozen inference graph. Ensure that the graph is the inference graph rather than the training graph.
2 | calibration dataset | A subset of the training dataset containing 100 to 1000 images.
3 | input_fn | An input function that converts the calibration dataset to the input data of the frozen_graph during quantize calibration. Usually performs data pre-processing and augmentation.

Getting the Frozen Inference Graph

In most situations, training a model with TensorFlow 1.x creates a folder containing a GraphDef file (usually ending with a .pb or .pbtxt extension) and a set of checkpoint files. What you need for mobile or embedded deployment is a single GraphDef file that has been “frozen”, that is, had its variables converted into inline constants so that everything is in one file. To handle the conversion, TensorFlow provides freeze_graph.py, which is automatically installed with the vai_q_tensorflow quantizer.

An example of command-line usage is as follows:

$ freeze_graph \
    --input_graph  /tmp/inception_v1_inf_graph.pb \
    --input_checkpoint  /tmp/checkpoints/model.ckpt-1000 \
    --input_binary  true \
    --output_graph  /tmp/frozen_graph.pb \
    --output_node_names  InceptionV1/Predictions/Reshape_1

The --input_graph should be the inference graph rather than the training graph. Because data preprocessing operations and loss functions are not needed for inference and deployment, frozen_graph.pb should include only the main part of the model. In particular, perform the data preprocessing operations in the input_fn so that it generates correct input data for quantize calibration.

Note: Some operations, such as dropout and batchnorm, behave differently in the training and inference phases. Ensure that they are in the inference phase when freezing the graph. For example, set training=False when using tf.layers.dropout or tf.layers.batch_normalization (the equivalent tf.contrib.slim layers take is_training=False). For models using tf.keras, call tf.keras.backend.set_learning_phase(0) before building the graph.
Tip: Type freeze_graph --help for more options.
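
For scripted pipelines, the command-line call above can be assembled in Python. The following is a minimal sketch, not part of the Vitis AI tools: build_freeze_graph_cmd is a hypothetical helper name, and it uses only the flags shown in the example above.

```python
import shlex


def build_freeze_graph_cmd(input_graph, input_checkpoint, output_graph,
                           output_node_names, input_binary=True):
    """Return the freeze_graph argument list for the given files.

    Hypothetical helper: only the flags from the example above are used.
    """
    return [
        "freeze_graph",
        "--input_graph", input_graph,
        "--input_checkpoint", input_checkpoint,
        "--input_binary", "true" if input_binary else "false",
        "--output_graph", output_graph,
        # Multiple output nodes are passed as one comma-separated string.
        "--output_node_names", ",".join(output_node_names),
    ]


cmd = build_freeze_graph_cmd(
    input_graph="/tmp/inception_v1_inf_graph.pb",
    input_checkpoint="/tmp/checkpoints/model.ckpt-1000",
    output_graph="/tmp/frozen_graph.pb",
    output_node_names=["InceptionV1/Predictions/Reshape_1"],
)

# Print the command for review instead of running it here.
print(" ".join(shlex.quote(arg) for arg in cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.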

The input and output node names vary depending on the model, but you can inspect and estimate them with the vai_q_tensorflow quantizer. See the following code snippet for an example:

$ vai_q_tensorflow inspect --input_frozen_graph=/tmp/inception_v1_inf_graph.pb

The estimated input and output nodes cannot be used for quantization if the graph has in-graph pre- and post-processing. Some of those operations are not quantizable and can cause errors in the Vitis AI compiler when you deploy the quantized model to the DPU.

Another way to get the input and output node names of the graph is to visualize it. Both TensorBoard and Netron can do this. See the following example, which uses Netron:

$ pip install netron
$ netron /tmp/inception_v1_inf_graph.pb

Getting the Calibration Dataset and Input Function

The calibration set is usually a subset of the training or validation dataset, or a set of actual application images (at least 100 images for good quantization results). The input function is an importable Python function that loads the calibration dataset and performs data preprocessing. The vai_q_tensorflow quantizer accepts an input_fn to do the preprocessing that is not saved in the graph. If the preprocessing subgraph is saved in the frozen graph, the input_fn only needs to read the images from the dataset and return a feed_dict.
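
One way to obtain such a subset is to sample image files at random from an existing dataset folder. The following is a minimal sketch under stated assumptions: pick_calibration_files is a hypothetical helper, the file extensions are examples, and the default of 500 images is an arbitrary value within the 100 to 1000 range given in Table 1.

```python
import random
from pathlib import Path


def pick_calibration_files(image_dir, num_images=500, seed=0,
                           extensions=(".jpg", ".jpeg", ".png")):
    """Return a reproducible random subset of image paths under image_dir."""
    # Sort first so the same seed always selects the same files.
    candidates = sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    count = min(num_images, len(candidates))
    return random.Random(seed).sample(candidates, count)


# Example usage (the path is a placeholder for your own dataset):
# calib_files = pick_calibration_files("/path/to/validation_images")
```

A fixed seed keeps the calibration set identical between runs, which makes quantization results easier to compare.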

The input function is specified in the format module_name.input_fn_name (for example, my_input_fn.calib_input). The input_fn takes an int object as input, indicating the calibration step number, and on each call returns a dict object that maps each placeholder name to a numpy.array. The dict is fed into the placeholder nodes of the model when running inference. The shape of each numpy.array must be consistent with its placeholder. See the following pseudo code example:

# my_input_fn.py
def calib_input(iter):
    """A function that provides input data for the calibration.

    Args:
        iter: An `int` object, indicating the calibration step number.

    Returns:
        dict(placeholder_name, numpy.array): a `dict` object, which will be fed into the model.
    """
    # load_image and do_preprocess are placeholders for your own code.
    image = load_image(iter)
    preprocessed_image = do_preprocess(image)
    return {"placeholder_name": preprocessed_image}
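
The pseudo code above can be made concrete as follows. This is a sketch under stated assumptions, none of which come from the guide: the placeholder name "input", the 224x224x3 input shape, the batch size, the per-channel means, and the fake image loader are all illustrative. check_feed is a hypothetical helper that verifies the shape consistency requirement before the dict is handed to the quantizer.

```python
import numpy as np

# Assumptions for this sketch; replace with values for your own model.
PLACEHOLDER_NAME = "input"
INPUT_SHAPE = (224, 224, 3)  # height, width, channels
BATCH_SIZE = 8
CHANNEL_MEANS = np.array([123.68, 116.78, 103.94], dtype=np.float32)


def load_image(index):
    """Stand-in loader: returns a deterministic fake RGB image (uint8)."""
    rng = np.random.RandomState(index)
    return rng.randint(0, 256, size=INPUT_SHAPE).astype(np.uint8)


def do_preprocess(image):
    """Example preprocessing: cast to float32, subtract channel means."""
    return image.astype(np.float32) - CHANNEL_MEANS


def calib_input(iter):
    """Return the feed dict for calibration step `iter`."""
    start = iter * BATCH_SIZE
    batch = np.stack(
        [do_preprocess(load_image(start + i)) for i in range(BATCH_SIZE)]
    )
    return {PLACEHOLDER_NAME: batch}


def check_feed(feed, expected_shapes):
    """Raise ValueError if an array does not match its placeholder shape.

    A None entry matches any size (for example, the batch dimension).
    """
    for name, expected in expected_shapes.items():
        actual = feed[name].shape
        matches = len(actual) == len(expected) and all(
            e is None or e == a for e, a in zip(expected, actual)
        )
        if not matches:
            raise ValueError(
                "%s: got %s, expected %s" % (name, actual, expected)
            )


feed = calib_input(0)
check_feed(feed, {PLACEHOLDER_NAME: (None, 224, 224, 3)})
print(feed[PLACEHOLDER_NAME].shape, feed[PLACEHOLDER_NAME].dtype)
# prints (8, 224, 224, 3) float32
```

The preprocessing here must match what the model saw during training; mean subtraction is only one example of such a step.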