vai_q_caffe Usage

Vitis AI User Guide (UG1414)

Document ID: UG1414
Release Date: 2020-07-21
Version: 1.2 English

The vai_q_caffe quantizer takes a floating-point model as input and uses a calibration dataset to generate a quantized model. In the following command line, [options] stands for optional parameters.

vai_q_caffe quantize -model float.prototxt -weights float.caffemodel [options]

The options supported by vai_q_caffe are shown in the following table. The three most commonly used options are weights_bit, data_bit, and method.

Table 1. vai_q_caffe Options List

model (String, required, default: -)
    Floating-point prototxt file (such as float.prototxt).

weights (String, required, default: -)
    Pre-trained floating-point weights (such as float.caffemodel).

weights_bit (Int32, optional, default: 8)
    Bit width for quantized weights and biases.

data_bit (Int32, optional, default: 8)
    Bit width for quantized activations.

method (Int32, optional, default: 1)
    Quantization method: 0 for non-overflow, 1 for min-diffs.
    The non-overflow method ensures that no values saturate during quantization, which makes it sensitive to outliers.
    The min-diffs method allows saturation in order to achieve a lower overall quantization error. It is more robust to outliers and usually produces a narrower quantization range than the non-overflow method.

calib_iter (Int32, optional, default: 100)
    Maximum number of iterations for calibration.

auto_test (flag, optional, absent by default)
    If present, run a test after calibration, using the test dataset specified in the prototxt file.

test_iter (Int32, optional, default: 50)
    Maximum number of iterations for testing.

output_dir (String, optional, default: quantize_results)
    Output directory for the quantized results.

gpu (String, optional, default: 0)
    GPU device ID used for calibration and testing.

ignore_layers (String, optional, default: none)
    List of layers to ignore during quantization.

ignore_layers_file (String, optional, default: none)
    Protobuf file that defines the layers to ignore during quantization, starting with the keyword ignore_layers.

sigmoided_layers (String, optional, default: none)
    List of layers that feed into a sigmoid operation; these are quantized with an optimization for sigmoid accuracy.

input_blob (String, optional, default: data)
    Name of the input data blob.

keep_fixed_neuron (Bool, optional, default: FALSE)
    Retain FixedNeuron layers in the deployed model. Set this flag if your target hardware platform is DPUCAHX8H.
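To make the difference between the two -method settings concrete, the following minimal Python sketch contrasts the two scale-selection strategies. It assumes symmetric quantization with power-of-two scales; the function names (non_overflow_scale, min_diffs_scale) and the brute-force error search are illustrative simplifications, not vai_q_caffe's actual implementation.

    import numpy as np

    def quantize(x, scale, bit_width=8):
        """Symmetric uniform quantization: scale, round, saturate, rescale."""
        qmax = 2 ** (bit_width - 1) - 1
        q = np.clip(np.round(x * scale), -qmax - 1, qmax)
        return q / scale

    def non_overflow_scale(x, bit_width=8):
        """Largest power-of-two scale at which no value saturates.
        A single large outlier coarsens the step size for all values."""
        qmax = 2 ** (bit_width - 1) - 1
        return 2.0 ** np.floor(np.log2(qmax / np.max(np.abs(x))))

    def min_diffs_scale(x, bit_width=8, search_steps=8):
        """Try progressively finer power-of-two scales (saturating outliers)
        and keep the one with the smallest squared quantization error."""
        best_scale, best_err = None, np.inf
        base = non_overflow_scale(x, bit_width)
        for k in range(search_steps):
            s = base * 2.0 ** k            # finer scale => narrower range
            err = np.sum((x - quantize(x, s, bit_width)) ** 2)
            if err < best_err:
                best_scale, best_err = s, err
        return best_scale

    # With an outlier present, min-diffs usually picks a finer scale:
    acts = np.concatenate([np.random.randn(10000), [40.0]])
    print(non_overflow_scale(acts), min_diffs_scale(acts))

In this sketch, the outlier at 40.0 forces the non-overflow method onto a coarse quantization step, while min-diffs accepts some saturation of the outlier in exchange for a finer step for the bulk of the values.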
Examples:
  1. Quantize:                              vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0
  2. Quantize with auto test:               vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0 -auto_test -test_iter 50
  3. Quantize with the non-overflow method: vai_q_caffe quantize -model float.prototxt -weights float.caffemodel -gpu 0 -method 0
  4. Finetune the quantized model:          vai_q_caffe finetune -solver solver.prototxt -weights quantize_results/quantize_train_test.caffemodel -gpu 0
  5. Deploy the quantized model:            vai_q_caffe deploy -model quantize_results/quantize_train_test.prototxt -weights quantize_results/quantize_train_test.caffemodel -gpu 0
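As background for the weights_bit and data_bit defaults above: with a signed bit width b and a power-of-two scale 2**-f, the representable range and step size follow directly. The helper below and the example choice of f are illustrative assumptions, not values computed by vai_q_caffe.

    def fixed_point_range(bit_width, frac_bits):
        """Range and step of a signed fixed-point format with a
        power-of-two scale of 2**-frac_bits."""
        step = 2.0 ** -frac_bits
        lo = -(2 ** (bit_width - 1)) * step
        hi = (2 ** (bit_width - 1) - 1) * step
        return lo, hi, step

    # The default 8-bit width with, e.g., 6 fractional bits:
    # fixed_point_range(8, 6) -> (-2.0, 1.984375, 0.015625)

Each extra bit of width doubles the number of representable levels, so reducing weights_bit or data_bit below 8 shrinks precision quickly and typically calls for finetuning to recover accuracy.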