VGG-16 - 1.4 English

Vitis AI Optimizer User Guide (UG1333)

Baseline Model

VGG is a convolutional network for large-scale image recognition. This example uses its 16-layer variant, VGG-16, as the baseline model.

Create a Configuration File

Create a file named config.prototxt:

workspace: "examples/decent_p/"
gpu: "0,1,2,3"
test_iter: 100
acc_name: "top-1"
model: "examples/decent_p/vgg.prototxt"
weights: "examples/decent_p/vgg.caffemodel"
solver: "examples/decent_p/solver.prototxt"
rate: 0.1
pruner {
  method: REGULAR
}
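
If the configuration is regenerated for several pruning rates, it can help to render it from a small script. The following is a minimal sketch: the field names and values are copied from the configuration above, while render_config is a hypothetical helper and not part of Vitis AI.

```python
# Sketch: render config.prototxt text for a given pruning rate.
# render_config is a hypothetical helper; only the field names and
# values come from the configuration shown in this guide.

def render_config(workspace, gpus, rate, method="REGULAR"):
    """Return the text of a pruning configuration file."""
    gpu_list = ",".join(str(g) for g in gpus)
    lines = [
        f'workspace: "{workspace}"',
        f'gpu: "{gpu_list}"',
        "test_iter: 100",
        'acc_name: "top-1"',
        f'model: "{workspace}vgg.prototxt"',
        f'weights: "{workspace}vgg.caffemodel"',
        f'solver: "{workspace}solver.prototxt"',
        f"rate: {rate}",
        "pruner {",
        f"  method: {method}",
        "}",
    ]
    return "\n".join(lines) + "\n"


config_text = render_config("examples/decent_p/", [0, 1, 2, 3], 0.1)
print(config_text)
```

Writing config_text to config.prototxt reproduces the file used by the commands in this example.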

Perform Model Analysis

$ ./vai_p_caffe ana -config config.prototxt

Prune the Model

$ ./vai_p_caffe prune -config config.prototxt

Finetune the Pruned Model

You can use the following solver settings to perform finetuning:

net: "vgg16/train_val.prototxt"
test_iter: 1250
test_interval: 1000
test_initialization: true
display: 100
average_loss: 100
base_lr: 0.004
lr_policy: "poly"
power: 1
gamma: 0.1
max_iter: 500000
momentum: 0.9
weight_decay: 0.0001
snapshot: 1000
snapshot_prefix: "vgg16/snapshot/res"
solver_mode: GPU
iter_size: 1
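
With lr_policy set to "poly", Caffe computes the learning rate as base_lr * (1 - iter / max_iter) ^ power, so with power: 1 the rate decays linearly from 0.004 to zero over max_iter iterations. The gamma value is, to our understanding, not used by this policy. The sketch below evaluates that schedule for the settings above; poly_lr is an illustrative helper, not a Caffe API.

```python
# Sketch of Caffe's "poly" learning-rate policy using the solver values above.
# poly_lr is a hypothetical helper written for illustration only.

def poly_lr(iteration, base_lr=0.004, max_iter=500000, power=1.0):
    """Return base_lr * (1 - iteration / max_iter) ** power."""
    if not 0 <= iteration <= max_iter:
        raise ValueError("iteration must be within [0, max_iter]")
    return base_lr * (1.0 - iteration / max_iter) ** power


for it in (0, 100000, 250000, 500000):
    print(f"iter {it:>6}: lr = {poly_lr(it):.6f}")
```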

Use the following command to start finetuning:

$ ./vai_p_caffe finetune -config config.prototxt
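
The analysis, pruning, and finetuning commands share the same form, so they can be chained from a script. The following is a minimal sketch that uses only the subcommands and the -config flag shown in this example; build_commands and run_steps are hypothetical helper names, and any configuration changes needed between pruning iterations are not covered here.

```python
# Sketch: chain the ana, prune, and finetune steps from this example.
# build_commands and run_steps are hypothetical helpers; the tool path,
# subcommands, and -config flag are taken from the commands in this guide.
import subprocess


def build_commands(config_path, tool="./vai_p_caffe"):
    """Return the three command lines in the order used in this example."""
    return [[tool, step, "-config", config_path]
            for step in ("ana", "prune", "finetune")]


def run_steps(config_path):
    """Run each step in order; check=True raises on the first failure."""
    for cmd in build_commands(config_path):
        subprocess.run(cmd, check=True)


for cmd in build_commands("config.prototxt"):
    print(" ".join(cmd))
```

Calling run_steps("config.prototxt") from the directory that contains vai_p_caffe would execute the three steps in sequence.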

Estimated time required: about 70 hours for 30 epochs on the ImageNet (ILSVRC2012) training set (1.2 million images) with 4 x NVIDIA Tesla V100 GPUs.
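
The estimate can be broken down into per-epoch time and throughput, which makes it easier to rescale for other hardware. The sketch below uses only the figures quoted above. The batch size passed to iterations_for is an assumption for illustration, because this example does not state one.

```python
# Sketch: break down the quoted finetuning estimate.
# TOTAL_HOURS, EPOCHS, and TRAIN_IMAGES come from the estimate above;
# the batch size used with iterations_for is a hypothetical value.
import math

TOTAL_HOURS = 70
EPOCHS = 30
TRAIN_IMAGES = 1_200_000

hours_per_epoch = TOTAL_HOURS / EPOCHS
images_per_second = EPOCHS * TRAIN_IMAGES / (TOTAL_HOURS * 3600)


def iterations_for(epochs, batch_size, images=TRAIN_IMAGES):
    """Iterations needed for the given epochs at a total batch size."""
    return epochs * math.ceil(images / batch_size)


print(f"{hours_per_epoch:.2f} hours per epoch")          # 2.33
print(f"{images_per_second:.1f} images per second")      # 142.9
print(iterations_for(EPOCHS, batch_size=256), "iterations at batch size 256")
```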

Get Final Output

After a few pruning iterations, the pruned model requires only 33% of the operations of the baseline.

To finalize the model, run the following:

$ ./vai_p_caffe transform -model baseline.prototxt -weights finetuned_model.caffemodel -output

Pruning Results

Dataset         ImageNet (ILSVRC2012)
Input Size      224 x 224
GPU Platform    4 x NVIDIA Tesla V100

Table 1. Pruning Results of VGG-16

Round   FLOPs   Parameters   Top-1/Top-5 Accuracy
0       100%    100%         0.7096/0.8984
1       50%     57.3%        0.7020/0.8970
2       9.7%    35.8%        0.6912/0.8913
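
To compare rounds, it is convenient to express each row as a FLOPs reduction factor and an accuracy drop relative to round 0. The following sketch does that arithmetic on the values from the results table above. The variable names are illustrative.

```python
# Sketch: derive reduction factors and accuracy drops from the results table.
# Each tuple is (round, FLOPs %, parameters %, top-1, top-5) as reported above.
rounds = [
    (0, 100.0, 100.0, 0.7096, 0.8984),
    (1, 50.0, 57.3, 0.7020, 0.8970),
    (2, 9.7, 35.8, 0.6912, 0.8913),
]

base_top1, base_top5 = rounds[0][3], rounds[0][4]

summary = []
for rnd, flops, params, top1, top5 in rounds:
    summary.append({
        "round": rnd,
        "flops_reduction": 100.0 / flops,   # e.g. 50% FLOPs -> 2.0x fewer
        "top1_drop": base_top1 - top1,      # absolute drop vs. round 0
        "top5_drop": base_top5 - top5,
    })

for row in summary:
    print(f"round {row['round']}: {row['flops_reduction']:.1f}x fewer FLOPs, "
          f"top-1 drop {row['top1_drop']:.4f}, top-5 drop {row['top5_drop']:.4f}")
```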