Generating the Frozen Inference Graph


Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2023-09-28
Version: 3.5 English

Training a model with TensorFlow 1.x produces a folder that contains a GraphDef file (typically with a .pb or .pbtxt extension) and a set of checkpoint files. For mobile or embedded deployment, you need a single frozen GraphDef file in which the variables have been converted into inline constants, so that the graph structure and the weights are in one file. TensorFlow provides freeze_graph.py for this conversion; it is installed automatically with the vai_q_tensorflow quantizer.

The following is an example of command-line usage:

[docker] $ freeze_graph \
    --input_graph  /tmp/inception_v1_inf_graph.pb \
    --input_checkpoint  /tmp/checkpoints/model.ckpt-1000 \
    --input_binary  true \
    --output_graph  /tmp/frozen_graph.pb \
    --output_node_names  InceptionV1/Predictions/Reshape_1

The --input_graph must be an inference graph rather than the training graph. Data preprocessing and loss operations are not required for inference and deployment, so frozen_graph.pb should include only the essential components of the model. In particular, the data preprocessing operations belong in the input_fn, which must generate correctly preprocessed input data for post-training quantization.
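To illustrate moving preprocessing out of the graph and into the input_fn, the following is a minimal sketch. It assumes the quantizer calls the function with the calibration step index and expects a dict that maps each input placeholder name to a NumPy array. The node name "input", the batch size, the image shape, and the [-1, 1] scaling are all hypothetical; random pixels stand in for real calibration images, and the normalization must match whatever the model was trained with.

```python
import numpy as np

INPUT_NODE = "input"          # hypothetical placeholder name of the frozen graph
BATCH_SIZE = 4                # hypothetical calibration batch size
IMAGE_SHAPE = (224, 224, 3)   # hypothetical height, width, channels


def preprocess(images_uint8):
    """Scale uint8 pixels to [-1, 1]; an assumed normalization for illustration."""
    return images_uint8.astype(np.float32) / 127.5 - 1.0


def calib_input(step):
    """Return one calibration batch as {placeholder_name: ndarray}.

    Random data stands in for real images here; a real input_fn would
    load and decode the calibration files that belong to this step.
    """
    rng = np.random.default_rng(seed=step)
    raw = rng.integers(0, 256, size=(BATCH_SIZE,) + IMAGE_SHAPE, dtype=np.uint8)
    return {INPUT_NODE: preprocess(raw)}


batch = calib_input(0)
print(batch[INPUT_NODE].shape, batch[INPUT_NODE].dtype)
```

Keeping the scaling in preprocess rather than in the graph means the frozen graph starts directly at the model input, which is the layout this section recommends.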

Note: Some operations, such as dropout and batch normalization, behave differently in the training and inference phases. Ensure that they are in the inference phase when freezing the graph. For example, pass training=False when calling tf.layers.dropout or tf.layers.batch_normalization (TF-Slim layers use is_training=False instead). For models using tf.keras, call tf.keras.backend.set_learning_phase(0) before building the graph.

Tip: Type freeze_graph --help for more options.
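When freezing is part of a scripted workflow, the freeze_graph command line can be assembled programmatically. The following sketch builds the argument list from the same flags used in the example in this section; the helper name build_freeze_command is hypothetical, and the paths are the ones from that example. Multiple output nodes are joined with commas, and --input_binary is set to true for a binary .pb graph and false for a text .pbtxt graph.

```python
import shlex


def build_freeze_command(input_graph, input_checkpoint, output_graph,
                         output_node_names, input_binary=True):
    """Return the freeze_graph argument list (hypothetical helper).

    Only the flags shown in this section are used.
    """
    return [
        "freeze_graph",
        "--input_graph", input_graph,
        "--input_checkpoint", input_checkpoint,
        # true for a binary .pb file, false for a text .pbtxt file
        "--input_binary", "true" if input_binary else "false",
        "--output_graph", output_graph,
        # freeze_graph takes a comma-separated list of output nodes
        "--output_node_names", ",".join(output_node_names),
    ]


cmd = build_freeze_command(
    input_graph="/tmp/inception_v1_inf_graph.pb",
    input_checkpoint="/tmp/checkpoints/model.ckpt-1000",
    output_graph="/tmp/frozen_graph.pb",
    output_node_names=["InceptionV1/Predictions/Reshape_1"],
)
print(shlex.join(cmd))
# To run it inside the Vitis AI docker container:
#   import subprocess; subprocess.run(cmd, check=True)
```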

The input and output node names vary depending on the model, but you can inspect and estimate them with the vai_q_tensorflow quantizer. See the following example code snippet:

[docker] $ vai_q_tensorflow inspect --input_frozen_graph=/tmp/inception_v1_inf_graph.pb

The estimated input and output nodes cannot be used for quantization if the graph has in-graph pre- and post-processing, because some of those operations cannot be quantized and can cause errors when you compile the model with the Vitis AI compiler and deploy it to the DPU. In that case, choose the nodes that bound the model itself.
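To make the idea of estimating input and output nodes concrete, the following sketch applies a common heuristic: Placeholder nodes are treated as inputs, and nodes whose output is not consumed by any other node are treated as outputs. This is an assumption for illustration only, not a description of how vai_q_tensorflow inspect works internally. The nodes are given as plain (name, op, inputs) tuples that mirror the fields of a GraphDef node, and the example node names are hypothetical.

```python
def estimate_io_nodes(nodes):
    """Guess input and output node names from a list of (name, op, inputs) tuples."""
    consumed = set()
    for _name, _op, inputs in nodes:
        for ref in inputs:
            # "^name" marks a control dependency; "name:1" selects an output port.
            consumed.add(ref.lstrip("^").split(":")[0])

    input_names = [name for name, op, _ in nodes if op == "Placeholder"]
    output_names = [
        name for name, op, _ in nodes
        if name not in consumed and op not in ("Placeholder", "Const")
    ]
    return input_names, output_names


# Hypothetical miniature graph for illustration.
example_nodes = [
    ("input", "Placeholder", []),
    ("conv/weights", "Const", []),
    ("conv/Conv2D", "Conv2D", ["input", "conv/weights"]),
    ("Predictions/Softmax", "Softmax", ["conv/Conv2D"]),
    ("Predictions/shape", "Const", []),
    ("Predictions/Reshape_1", "Reshape", ["Predictions/Softmax", "Predictions/shape"]),
]

print(estimate_io_nodes(example_nodes))
```

Because the heuristic only looks at graph connectivity, it would also report in-graph pre- and post-processing nodes as inputs or outputs, which is why such estimates need a manual check.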

Another way to get the input and output node names is to visualize the graph. Both TensorBoard and Netron can do this. The following example uses Netron:

[docker] $ pip install netron
[docker] $ netron /tmp/inception_v1_inf_graph.pb