Pruning Overview - 1.1 English

AI Optimizer User Guide (UG1333)

Document ID: UG1333
Release Date: 2020-07-07
Version: 1.1 English

Neural networks are typically over-parameterized: they carry significant redundancy beyond what is needed to reach a given accuracy. “Pruning” is the process of eliminating redundant weights while keeping the accuracy loss as low as possible.

Figure 1. Pruning Methods

The simplest form of pruning is called “fine-grained pruning”. It removes individual weights and results in sparse weight matrices. The VAI pruner employs the “coarse-grained pruning” method, which eliminates neurons that do not contribute significantly to the network’s accuracy. For convolutional layers, coarse-grained pruning removes an entire 3D kernel, so it is also called channel pruning.
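
The difference between the two methods can be sketched on a toy convolution layer. The following is a minimal illustration in plain Python, not the VAI pruner implementation. The function names, the magnitude threshold, and the use of the L1 norm to rank channels are all assumptions made for the example; this section does not state how the VAI pruner measures a neuron's contribution.

```python
# Illustrative sketch only; NOT the VAI pruner's algorithm or API.
# Toy conv weights indexed as [out_channel][in_channel][kh][kw].
weights = [
    [[[0.9, -0.8], [0.7, 0.05]]],    # output channel 0
    [[[0.02, -0.01], [0.03, 0.6]]],  # output channel 1
    [[[-0.5, 0.4], [0.01, -0.3]]],   # output channel 2
]

def fine_grained_prune(weights, threshold):
    """Zero each individual weight whose magnitude is below `threshold`.

    The tensor keeps its shape; it just becomes sparse.
    """
    return [[[[w if abs(w) >= threshold else 0.0 for w in row]
              for row in kernel2d]
             for kernel2d in kernel3d]
            for kernel3d in weights]

def l1_norm(kernel3d):
    """Sum of absolute weights in one 3D kernel (assumed importance score)."""
    return sum(abs(w) for kernel2d in kernel3d for row in kernel2d for w in row)

def channel_prune(weights, keep):
    """Keep the `keep` output channels with the largest L1 norm.

    Whole 3D kernels are dropped, so the layer gets smaller but stays dense.
    Returns the pruned weights and the indices of the surviving channels.
    """
    ranked = sorted(range(len(weights)),
                    key=lambda i: l1_norm(weights[i]), reverse=True)
    kept = sorted(ranked[:keep])
    return [weights[i] for i in kept], kept

sparse = fine_grained_prune(weights, threshold=0.1)  # still 3 channels, some zeros
pruned, kept = channel_prune(weights, keep=2)        # 2 channels, kept == [0, 2]
```

Fine-grained pruning leaves the layer shape unchanged, so any speedup depends on support for sparse computation. Channel pruning produces a smaller dense layer, although in a real network the input channels of the following layer would have to be reduced to match.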

Pruning generally reduces the accuracy of the original model. Retraining then adjusts the remaining weights to recover accuracy.
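
The prune-then-retrain cycle can be illustrated with a toy linear model. This is a hedged sketch, not the VAI pruner workflow: the data, the learning rate, the mask, and the helper names are hypothetical, and mean squared error stands in for accuracy. One weight is pruned, the error rises, and retraining the remaining weight with the pruned weight held at zero recovers most of the lost quality, though not all of it.

```python
# Illustrative sketch only; a two-weight linear model stands in for a network.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, xs, ys):
    """Mean squared error, used here as a stand-in for (in)accuracy."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def retrain(w, mask, xs, ys, lr=0.01, steps=200):
    """Plain gradient descent on MSE; weights with mask 0 stay pruned."""
    w = list(w)
    n = len(xs)
    for _ in range(steps):
        errs = [predict(w, x) - y for x, y in zip(xs, ys)]
        for j in range(len(w)):
            if mask[j]:
                grad = 2.0 / n * sum(e * x[j] for e, x in zip(errs, xs))
                w[j] -= lr * grad
    return w

# Hypothetical data: the second feature is correlated with the first,
# so the remaining weight can partly compensate for the pruned one.
xs = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0), (4.0, 2.0)]
ys = [x1 + 0.2 * x2 for x1, x2 in xs]

w_trained = [1.0, 0.2]  # fits the data essentially exactly
mask = [1, 0]           # prune the smaller weight
w_pruned = [wi * m for wi, m in zip(w_trained, mask)]

loss_before = mse(w_trained, xs, ys)        # essentially zero
loss_pruned = mse(w_pruned, xs, ys)         # error rises after pruning
w_retrained = retrain(w_pruned, mask, xs, ys)
loss_retrained = mse(w_retrained, xs, ys)   # lower again, but above loss_before
```

In this toy case the retrained model does not return to the original error, which mirrors the trade-off described above: pruning costs some accuracy, and retraining recovers part of it.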