Getting the Frozen Inference Graph

Vitis AI User Guide (UG1414)
Document ID: UG1414 | Release Date: 2020-03-23 | Version: 1.1 English

In most situations, training a model with TensorFlow gives you a folder containing a GraphDef file (usually with a .pb or .pbtxt extension) and a set of checkpoint files. What you need for mobile or embedded deployment is a single frozen GraphDef file, that is, one whose variables have been converted into inline constants so that everything is in one file. To handle the conversion, TensorFlow provides freeze_graph.py, which is installed automatically with the vai_q_tensorflow quantizer.

An example of command-line usage is as follows:

$ freeze_graph \
    --input_graph  /tmp/inception_v1_inf_graph.pb \
    --input_checkpoint  /tmp/checkpoints/model.ckpt-1000 \
    --input_binary  true \
    --output_graph  /tmp/frozen_graph.pb \
    --output_node_names  InceptionV1/Predictions/Reshape_1
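
If you prefer to freeze the graph from Python rather than from the command line, the same conversion can be sketched with tf.compat.v1.graph_util.convert_variables_to_constants. The following is a minimal sketch, not the freeze_graph implementation; it reuses the paths and output node name from the example above, and it assumes that the checkpoint's .meta file describes an inference-phase graph:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

with tf.compat.v1.Session() as sess:
    # Rebuild the graph and its variable collections from the checkpoint's
    # .meta file, then restore the trained weights into the session.
    saver = tf.compat.v1.train.import_meta_graph(
        '/tmp/checkpoints/model.ckpt-1000.meta')
    saver.restore(sess, '/tmp/checkpoints/model.ckpt-1000')

    # Fold every variable into an inline constant so that the whole model
    # fits in a single GraphDef, then serialize it to one .pb file.
    frozen = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['InceptionV1/Predictions/Reshape_1'])
    with tf.io.gfile.GFile('/tmp/frozen_graph.pb', 'wb') as f:
        f.write(frozen.SerializeToString())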

The --input_graph should be an inference graph, not the training graph. Some operations, such as dropout and batch normalization, behave differently in training and in inference, so make sure they are in the inference phase when freezing the graph. For example, set training=False when calling tf.layers.dropout or tf.layers.batch_normalization. For models built with tf.keras, call tf.keras.backend.set_learning_phase(0) before building the graph.
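
The following minimal sketch illustrates both points; the layer stack and input shape are hypothetical, and the code targets TF1-style graph mode as assumed throughout this guide:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

# For tf.keras models, switch the backend to the inference phase before
# building the graph.
tf.keras.backend.set_learning_phase(0)

# Hypothetical model fragment: training=False makes batch normalization
# use its moving statistics and turns dropout into a pass-through, which
# is what the frozen inference graph should contain.
inputs = tf.compat.v1.placeholder(tf.float32, [None, 224, 224, 3], name='input')
net = tf.compat.v1.layers.conv2d(inputs, filters=64, kernel_size=3, padding='same')
net = tf.compat.v1.layers.batch_normalization(net, training=False)
net = tf.compat.v1.layers.dropout(net, rate=0.5, training=False)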

Because data preprocessing and loss operations are not needed for inference and deployment, frozen_graph.pb should include only the main part of the model. In particular, the data preprocessing operations should be performed in the input_fn instead, so that correct input data is generated for quantize calibration; a sketch of such an input_fn follows the note below.

Note: Type freeze_graph --help for more options.
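
As an illustration of moving preprocessing into the input_fn, the following minimal sketch assumes a hypothetical input node named input, a 224x224 input size, and a hypothetical list of calibration images; it follows the input_fn convention that the function takes the calibration step number and returns a dict mapping input node names to numpy arrays:

import numpy as np
from PIL import Image

calib_image_list = ['/tmp/calib/img_%d.jpg' % i for i in range(100)]
calib_batch_size = 10

def input_fn(iter):
    # Build one batch of preprocessed images for calibration step `iter`.
    images = []
    for i in range(calib_batch_size):
        path = calib_image_list[iter * calib_batch_size + i]
        img = Image.open(path).resize((224, 224))
        # Apply the same preprocessing here that was applied outside the
        # frozen graph during training (e.g., scaling to [-1, 1]).
        arr = np.asarray(img, dtype=np.float32) / 127.5 - 1.0
        images.append(arr)
    return {'input': np.stack(images)}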

The input and output node names vary depending on the model, but you can inspect the graph and estimate them with the vai_q_tensorflow quantizer. See the following code snippet for an example:

$ vai_q_tensorflow inspect --input_frozen_graph=/tmp/inception_v1_inf_graph.pb

If the graph contains in-graph pre- or post-processing, the estimated input and output nodes cannot be used for quantization, because some operations in those parts are not quantizable and might cause errors when the quantized model is compiled by the Vitis AI compiler for deployment to the DPU.
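
In that case, you can also list every node in the frozen graph directly from Python and pick boundaries that exclude the pre- and post-processing parts. A minimal sketch, reusing the path from the inspect example above:

import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('/tmp/inception_v1_inf_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Print every op and node name; input placeholders typically appear as
# Placeholder ops near the top, and output nodes have no consumers.
for node in graph_def.node:
    print(node.op, node.name)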

Another way to get the input and output node names of the graph is to visualize it. Both TensorBoard and Netron can do this. See the following example, which uses Netron:

$ pip install netron
$ netron /tmp/inception_v3_inf_graph.pb