SqueezeNext – Hardware-Aware Neural Network Design

Berkeley researchers have published ‘SqueezeNext’, the successor to SqueezeNet, in their latest attempt to distill the capabilities of very large neural networks into smaller models that can feasibly be deployed on devices with limited memory and compute, such as mobile phones. One of the main barriers to deploying neural networks on embedded systems has been the large memory footprint and power consumption of existing networks. While much AI research today focuses on achieving state-of-the-art results on specific datasets, SqueezeNext is part of a parallel track focused on making systems deployable.

Why SqueezeNext?

The transition to solutions based on deep neural networks started with AlexNet, which won the ImageNet challenge by a large margin. The ImageNet classification challenge started in 2010, with the first winning method achieving an error rate of 28.2%, followed by 26.2% in 2011. AlexNet then delivered a clear improvement, with an error rate of 16.4%, roughly 10 percentage points ahead of the runner-up. AlexNet consists of five convolutional layers and three fully connected layers, for a total of 61 million parameters. Because of this size, the original model had to be trained on two GPUs with a model-parallel approach in which the filters were distributed across the GPUs. Moreover, dropout was required to avoid overfitting with such a large model. Models with this many parameters are not well suited to real-time applications.
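To make the 61 million figure concrete, here is a rough parameter tally in Python. The layer shapes below are an assumption: they follow the commonly cited two-GPU AlexNet configuration (with grouped convolutions in layers 2, 4 and 5) and are not given in this article, so treat this as a sketch rather than an exact reproduction.

```python
# Assumed AlexNet layer shapes (original two-GPU variant); not taken from this article.
# Each conv entry: (input channels seen by each filter, output channels, kernel h, kernel w)
ALEXNET_CONV = [
    (3, 96, 11, 11),
    (48, 256, 5, 5),    # grouped: each filter sees half of the 96 input channels
    (256, 384, 3, 3),
    (192, 384, 3, 3),   # grouped
    (192, 256, 3, 3),   # grouped
]
# Each fully connected entry: (input features, output features)
ALEXNET_FC = [
    (9216, 4096),
    (4096, 4096),
    (4096, 1000),
]


def conv_layer_params(c_in, c_out, kh, kw):
    """Weights plus one bias per output channel."""
    return c_in * c_out * kh * kw + c_out


def fc_layer_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


conv_total = sum(conv_layer_params(*shape) for shape in ALEXNET_CONV)
fc_total = sum(fc_layer_params(*shape) for shape in ALEXNET_FC)
total = conv_total + fc_total

print(f"conv layers: {conv_total:,}")
print(f"fc layers:   {fc_total:,}")
print(f"total:       {total:,}")
print(f"fc share:    {fc_total / total:.1%}")
```

Under these assumed shapes the total comes to about 61 million, and the three fully connected layers account for well over 90% of it, which is why compact designs tend to target the fully connected layers first.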

SqueezeNet is a successful example of a compact network: it achieves AlexNet’s accuracy with 50× fewer parameters without compression, and is 500× smaller with deep compression. SqueezeNext goes further and owes its efficiency to a few design strategies: low-rank filters; a bottleneck filter to constrain the parameter count of the network; a single fully connected layer following a bottleneck; weight-stationary and output-stationary dataflows; and co-design of the network in tandem with a hardware simulator to maximize hardware utilization.
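The effect of low-rank filters and bottlenecks on parameter count can be sketched with some simple arithmetic. The channel widths in the hypothetical block below (reduce to C/2, then C/4, apply a 3×1 and a 1×3 convolution, then expand back to C) are illustrative assumptions, not figures taken from this article, and biases are ignored.

```python
def conv_weights(c_in, c_out, kh, kw):
    """Number of weights in a convolution layer (biases ignored)."""
    return c_in * c_out * kh * kw


def plain_3x3(c):
    """A single full-rank 3x3 convolution mapping c channels to c channels."""
    return conv_weights(c, c, 3, 3)


def low_rank_3x3(c):
    """The same 3x3 receptive field built from a 3x1 followed by a 1x3 convolution."""
    return conv_weights(c, c, 3, 1) + conv_weights(c, c, 1, 3)


def bottleneck_low_rank_block(c):
    """Hypothetical block: 1x1 bottlenecks around a 3x1 + 1x3 pair.

    The widths c/2 and c/4 are assumptions chosen for illustration.
    """
    half, quarter = c // 2, c // 4
    return (
        conv_weights(c, half, 1, 1)          # 1x1 reduce: c -> c/2
        + conv_weights(half, quarter, 1, 1)  # 1x1 reduce: c/2 -> c/4
        + conv_weights(quarter, half, 3, 1)  # 3x1 low-rank filter
        + conv_weights(half, half, 1, 3)     # 1x3 low-rank filter
        + conv_weights(half, c, 1, 1)        # 1x1 expand: c/2 -> c
    )


channels = 64
print("plain 3x3:            ", plain_3x3(channels))
print("3x1 + 1x3:            ", low_rank_3x3(channels))
print("bottleneck + low rank:", bottleneck_low_rank_block(channels))
```

With 64 channels, splitting the 3×3 filter into 3×1 and 1×3 cuts the weights by a third, and wrapping the pair in 1×1 bottlenecks brings the block to a quarter of the plain 3×3 layer under these assumed widths.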

The resulting SqueezeNext network has 112× fewer model parameters than AlexNet. The researchers also develop a version of the network whose performance approaches that of VGG-19 (which did well in ImageNet 2014). By carefully tuning the model design in parallel with a hardware simulator, they ultimately arrive at an even more efficient model that is significantly faster and more energy efficient than SqueezeNet, itself a widely used compressed network.

 

(via: SqueezeNext)