The idea behind Neural Architecture Search (NAS) is that for any given inference task and dataset, the design space contains several network architectures that are both computationally efficient and highly accurate. A developer often starts with a familiar standard backbone, such as ResNet50, and trains that network for the best accuracy. In many cases, however, a network topology with a much lower computational cost offers similar or better accuracy. For the developer, training multiple networks on the same dataset (sometimes even treating the choice of topology as a training hyperparameter) is not an efficient way to select the best topology.
NAS can be applied flexibly at each layer: the number of channels and the amount of sparsity are learned by minimizing the loss of the pruned network. NAS balances speed and accuracy well but requires extended training times. This method requires a four-step process:
- Fine-tune (optional)
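The per-layer search described above can be sketched in a few lines. This is a hypothetical, simplified stand-in, not a specific library's API: it uses a magnitude-based proxy for the loss of the pruned network and adds a small cost penalty, then picks the channel count that minimizes the combined objective. The function names `pruned_loss` and `search_layer` are illustrative.

```python
# Toy sketch (hypothetical): choosing a per-layer channel count by
# minimizing a proxy for the pruned network's loss plus a cost penalty.

def pruned_loss(weights, keep):
    """Loss proxy: squared energy lost by dropping the smallest-magnitude channels."""
    ranked = sorted((abs(w) for w in weights), reverse=True)
    return sum(w * w for w in ranked[keep:])

def search_layer(weights, candidate_widths, cost_weight=0.01):
    """Pick the channel count that best trades accuracy loss against compute cost."""
    return min(
        candidate_widths,
        key=lambda k: pruned_loss(weights, k) + cost_weight * k,
    )

# Three channels carry almost all the magnitude, so width 3 wins.
layer_weights = [0.9, 0.05, 1.2, 0.02, 0.7, 0.01]
chosen = search_layer(layer_weights, candidate_widths=[2, 3, 4, 6])
```

In a real system the loss would be measured by running the pruned network on validation data and the cost term would come from a latency or FLOP model, but the structure of the search is the same.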
Compared with coarse-grained pruning, one-shot NAS implementations assemble multiple candidate "subnetworks" into a single over-parameterized graph known as a supernet. The training algorithm optimizes all candidate networks simultaneously using supervised learning. After training, the candidate subnetworks are ranked by computational cost and accuracy, and the developer selects the candidate that best meets their requirements. One-shot NAS compresses both depthwise and conventional convolution models effectively, but it requires a long training time and a higher skill level.
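The ranking-and-selection step at the end of one-shot NAS can be sketched as follows. This is a minimal illustration with made-up accuracy and cost numbers, not a real tool's output: after supernet training, each candidate subnetwork has a measured accuracy and computational cost, and the developer picks the most accurate candidate that fits a cost budget. The names `rank_candidates`, `select`, and the candidate entries are all hypothetical.

```python
# Toy sketch (hypothetical data): ranking trained supernet candidates by
# accuracy and computational cost, then selecting one under a cost budget.

def rank_candidates(candidates):
    """Sort candidate subnetworks by accuracy (descending), breaking ties on cost."""
    return sorted(candidates, key=lambda c: (-c["accuracy"], c["cost"]))

def select(candidates, cost_budget):
    """Return the most accurate candidate whose cost fits the budget."""
    for c in rank_candidates(candidates):
        if c["cost"] <= cost_budget:
            return c
    return None

# Accuracy/cost figures are invented for illustration (cost, e.g., in GFLOPs).
candidates = [
    {"name": "sub_a", "cost": 4.0, "accuracy": 0.76},
    {"name": "sub_b", "cost": 1.5, "accuracy": 0.74},
    {"name": "sub_c", "cost": 0.9, "accuracy": 0.69},
]
best = select(candidates, cost_budget=2.0)  # sub_a is too costly; sub_b is chosen
```

The weight sharing in the supernet is what makes this cheap: every candidate is evaluated with weights inherited from the one trained graph, so ranking does not require training each subnetwork from scratch.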