Runner for structured pruning of the model in an iterative way. This API has the following methods:
Creates a new pruning runner.
- model: The baseline model to prune. It should be an instance of keras.Model.
- input_spec: A single tf.TensorSpec or a list of them. tf.TensorSpec is used to represent model input specifications.
ana(eval_fn, excludes=None, forced=False)
Performs model analysis. The analysis result is saved in the '.vai' directory, and this cached result is used directly in subsequent calls unless forced is set to True.
- eval_fn: A callable object that takes a keras.Model object as its first argument and returns the evaluation score.
- excludes: A list of layer names or layer instances to be excluded from pruning.
- forced: When set to True, model analysis is rerun instead of using the cached analysis result.
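The evaluation callback can be any function matching the description above. A minimal sketch, using a factory so the validation dataset is captured in a closure rather than a global; it assumes model.evaluate returns a [loss, accuracy] pair, which holds for a keras.Model compiled with a single accuracy metric:

```python
def make_eval_fn(dataset):
    """Build an eval_fn closure over a fixed validation dataset.

    The returned callable matches ana()'s expected signature: it takes
    a keras.Model as its first argument and returns a single score.
    """
    def eval_fn(model):
        # Assumes evaluate() returns [loss, accuracy]; accuracy is
        # reported as the score (higher is better).
        loss, accuracy = model.evaluate(dataset, verbose=0)
        return accuracy
    return eval_fn
```

It would then be passed to analysis as, for example, `runner.ana(make_eval_fn(val_dataset))`.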
prune(ratio=None, threshold=None, spec_path=None, excludes=None, mode='sparse')
Prunes the baseline model and returns a sparse model. The degree of pruning can be specified in three ways: ratio, threshold, or pruning specification. The first method is preferred; the latter two are more suitable for experiments with manual tuning.
- ratio: The expected percentage of FLOPs reduction relative to the baseline model. This is a guidance value; the actual FLOPs reduction may not be strictly equal to it.
- threshold: The relative proportion of model performance loss allowed between the baseline and pruned models.
- spec_path: Path to the pruning specification used to prune the model.
- excludes: A list of layer names or layer instances to be excluded from pruning.
- mode: The mode in which the baseline model is pruned to return a sparse model.
get_slim_model(spec_path=None)
Gets a slim model from a sparse model. By default, the latest pruning specification is used for this transformation. A specification path can be provided explicitly if the sparse model was not generated from the latest specification.
- spec_path: Path to the pruning specification used to transform the sparse model into a slim one.
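Putting the methods together, an iterative flow alternates analysis, pruning, and fine-tuning, then extracts the slim model at the end. A sketch under the assumption that `runner` exposes exactly the ana/prune/get_slim_model methods documented above; the fine-tuning step between iterations is elided:

```python
def iterative_prune(runner, eval_fn, target_ratio=0.5, steps=5):
    """Iteratively prune toward target_ratio in equal increments.

    runner: an object with ana(), prune(), and get_slim_model() as
    documented above. eval_fn: callable(model) -> score.
    """
    for step in range(1, steps + 1):
        runner.ana(eval_fn)  # reanalyze before each pruning step
        sparse = runner.prune(ratio=target_ratio * step / steps)
        # ...fine-tune `sparse` here to recover accuracy before the
        # next iteration...
    return runner.get_slim_model()  # uses the latest pruning spec
```

Each call to prune() raises the target ratio a little further, so the model is thinned gradually rather than in one aggressive step.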