Pruning a Model - 2.0 English

Vitis AI Optimizer User Guide (UG1333)

To prune a model, follow these steps:

  1. Define a function to evaluate model performance. The function must satisfy two requirements:
    • Its first argument must be the keras.Model instance to be evaluated.
    • It must return a Python number that indicates the performance of the model.
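    def evaluate(model):
      # x_test and y_test are the evaluation data, defined elsewhere.
      model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
      score = model.evaluate(x_test, y_test, verbose=0)
      return score[1]  # score[0] is the loss, score[1] is the accuracy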
  2. Use this evaluation function to run model analysis:
  3. Determine a pruning ratio. The ratio is the fraction by which the model's floating-point computation (MACs) in the forward pass is reduced:
    [MACs of pruned model] = (1 – ratio) * [MACs of original model]

    The value of ratio should be in (0, 1):

    sparse_model = runner.prune(ratio=0.2)
    Note: ratio is only an approximate target value and the actual pruning ratio may not be exactly equal to this value.
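
As a worked example of the MACs formula in step 3, the following minimal sketch computes the approximate MAC count that remains for a given ratio. The helper name pruned_macs and the 1 GMAC model are hypothetical and are not part of the Vitis AI API. Because ratio is only an approximate target, the result is an estimate, not a guarantee.

```python
def pruned_macs(original_macs, ratio):
    """Approximate MAC count after pruning: (1 - ratio) * original MACs."""
    # The guide requires the ratio to lie in the open interval (0, 1).
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be in the open interval (0, 1)")
    return (1.0 - ratio) * original_macs


# Hypothetical model needing 1 GMAC per forward pass, pruned with ratio=0.2.
# About 80% of the original computation remains.
print(pruned_macs(1_000_000_000, 0.2))
```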

The model returned by prune() is sparse: the pruned channels are set to zero, so the model size remains unchanged. The sparse model is used throughout the iterative pruning process and is converted to a pruned dense model only after pruning is complete.
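
The difference between a sparse model and a pruned dense model can be illustrated with a toy layer. This is only a conceptual sketch in plain Python and does not show how Vitis AI implements pruning. The functions sparsify and to_dense and the weight values are hypothetical.

```python
def sparsify(weights, pruned_channels):
    """Zero the pruned output channels; the layer keeps its original shape."""
    return [
        [0.0] * len(row) if idx in pruned_channels else list(row)
        for idx, row in enumerate(weights)
    ]


def to_dense(sparse_weights, pruned_channels):
    """Remove the pruned channels entirely, giving a smaller dense layer."""
    return [
        row for idx, row in enumerate(sparse_weights)
        if idx not in pruned_channels
    ]


# Toy layer: 4 output channels with 3 weights each (made-up values).
weights = [
    [0.5, -1.2, 0.3],
    [0.1, 0.0, -0.4],
    [2.0, 1.5, -0.7],
    [0.9, 0.2, 0.6],
]
pruned = {1, 3}

sparse = sparsify(weights, pruned)  # still 4 channels, two of them all zero
dense = to_dense(sparse, pruned)    # only the 2 surviving channels

print(len(sparse), len(dense))
```

The sparse form keeps every channel in place, which is why the model size does not change until the final conversion removes the zeroed channels.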

Besides returning a sparse model, the pruning runner generates a specification file in the .vai directory that describes how each layer will be pruned.
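
To see what the runner wrote, the .vai directory can be listed after prune() returns. The sketch below assumes that .vai is created under the current working directory, which this section does not state. The file name and format of the specification are also not given here, so the helper (a hypothetical name, list_generated_files) simply lists every file it finds.

```python
from pathlib import Path


def list_generated_files(workdir="."):
    """Return all files under <workdir>/.vai, sorted by path."""
    vai_dir = Path(workdir) / ".vai"
    # Assumption: the directory may not exist before the runner has been used.
    if not vai_dir.is_dir():
        return []
    return sorted(p for p in vai_dir.rglob("*") if p.is_file())


for path in list_generated_files():
    print(path)
```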