SuperConv: Strengthening the Convolution Kernel via Weight Sharing


Abstract

Efficient plug-and-play modules, such as the Asymmetric Convolution Block (ACB), are widely used to improve the accuracy of neural network models. However, the multi-branch convolution kernels introduced by ACB increase the number of trainable parameters, which adds an extra burden when training large models. In this work, we propose SuperConv, which reduces the trainable parameters while retaining the advantages of ACB. Following the approach of single-path NAS, SuperConv encodes the convolution kernels of different sizes in the multiple branches into a single super-kernel, so that the kernels share part of their weights. We further introduce SuperConv into MixConv and propose SuperMixConv (SP-MixConv). To verify its effectiveness, we insert ACB, MixConv, and SP-MixConv into the CIFAR-quick model; the model with SP-MixConv achieves the best accuracy on CIFAR-10 and CIFAR-100. Moreover, SuperConv and SP-MixConv add no extra cost at inference. SuperConv is also easy to implement with existing tools such as PyTorch and offers an interesting direction for the design of efficient plug-and-play convolution blocks.
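The sketch below illustrates the weight-sharing idea in PyTorch: ACB-style 3x3, 1x3, and 3x1 branches are all sliced from one shared "super-kernel", so the multi-branch structure adds no extra trainable parameters and can be folded into a single 3x3 convolution for inference. This is a minimal illustration, not the authors' reference code; the class name SuperConv2d and the exact slicing scheme are assumptions, and the paper's encoding may differ.

```python
# Minimal sketch of a super-kernel with weight sharing (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SuperConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # One shared 3x3 "super-kernel" holds the weights for every branch.
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, 3, 3))
        nn.init.kaiming_normal_(self.weight, mode="fan_out", nonlinearity="relu")
        self.stride = stride

    def forward(self, x):
        w_3x3 = self.weight                  # full 3x3 branch
        w_1x3 = self.weight[:, :, 1:2, :]    # horizontal branch, shares the center row
        w_3x1 = self.weight[:, :, :, 1:2]    # vertical branch, shares the center column
        y = F.conv2d(x, w_3x3, stride=self.stride, padding=1)
        y = y + F.conv2d(x, w_1x3, stride=self.stride, padding=(0, 1))
        y = y + F.conv2d(x, w_3x1, stride=self.stride, padding=(1, 0))
        return y

    @torch.no_grad()
    def fuse(self):
        # Because every branch is a slice of the same kernel, the three branches
        # collapse into one ordinary 3x3 convolution for deployment.
        fused = self.weight.clone()
        fused[:, :, 1:2, :] += self.weight[:, :, 1:2, :]
        fused[:, :, :, 1:2] += self.weight[:, :, :, 1:2]
        conv = nn.Conv2d(self.weight.shape[1], self.weight.shape[0], 3,
                         stride=self.stride, padding=1, bias=False)
        conv.weight.copy_(fused)
        return conv
```

At training time this behaves like a multi-branch block, yet all branches read from a single parameter tensor; calling fuse() yields a plain nn.Conv2d, which is consistent with the claim that no extra burden is added at inference.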

Chuan Liu
M.Sc. Student

My research interests include computer vision, NAS, and deep learning.

Jiancheng Lv
Dean and Professor of Computer Science at Sichuan University

My research interests include natural language processing, computer vision, industrial intelligence, smart medicine and smart cultural creation.
