ArcText: A Unified Text Approach to Describing Convolutional Neural Network Architectures.



The superiority of Convolutional Neural Networks (CNNs) largely relies on their architectures, which are usually manually crafted with extensive human expertise. Unfortunately, this kind of domain knowledge is not necessarily available to every interested user. Data mining on existing CNNs can discover useful patterns and fundamental components in their architectures, providing researchers with strong prior knowledge for designing effective CNN architectures even when they have no expertise in CNNs. Although various state-of-the-art data mining algorithms are at hand, there has been little work on mining CNN architectures. One of the main reasons is the gap between CNN architectures and data mining algorithms. Specifically, current CNN architecture descriptions cannot be exactly vectorized and fed to a data mining algorithm. In this paper, we propose a unified approach, named ArcText, to describing CNN architectures based on text. In particular, four different units and an ordering method have been carefully designed in ArcText so that the same CNN architecture is always described uniquely and with sufficient information. In addition, the resulting description can be exactly converted back to the corresponding CNN architecture. ArcText bridges the gap between CNN architectures and data mining researchers, and has the potential to be applied in wider scenarios.
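To make the idea of a unique, reversible text description concrete, here is a minimal Python sketch. It is not the ArcText format: the `Unit` class, the `describe` and `parse` functions, and the `position:kind[key=value;...]` line syntax are all hypothetical, and the sketch only handles a plain sequential chain of layers with integer parameters. It illustrates two properties the abstract names: a fixed ordering that makes the description unique, and an exact conversion back to the architecture.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Unit:
    """One layer of a sequential CNN (hypothetical; not an ArcText unit)."""
    kind: str                                   # e.g. "conv", "pool", "fc"
    params: Dict[str, int] = field(default_factory=dict)


def describe(units: List[Unit]) -> str:
    """Serialize a layer chain to text, one line per layer.

    Parameters are written in sorted key order, so the same architecture
    always yields the same text regardless of how its dicts were built.
    """
    lines = []
    for position, unit in enumerate(units):
        fields = ";".join(f"{k}={v}" for k, v in sorted(unit.params.items()))
        lines.append(f"{position}:{unit.kind}[{fields}]")
    return "\n".join(lines)


def parse(text: str) -> List[Unit]:
    """Rebuild the layer chain from the text produced by describe()."""
    units = []
    for line in text.splitlines():
        _, rest = line.split(":", 1)            # drop the position prefix
        kind, fields = rest.rstrip("]").split("[", 1)
        params = {}
        if fields:
            for item in fields.split(";"):
                key, value = item.split("=")
                params[key] = int(value)
        units.append(Unit(kind, params))
    return units


if __name__ == "__main__":
    net = [
        Unit("conv", {"out": 64, "kernel": 3, "stride": 1}),
        Unit("pool", {"kernel": 2, "stride": 2}),
        Unit("fc", {"out": 10}),
    ]
    text = describe(net)
    print(text)
    print(parse(text) == net)
```

Because each line is plain text with a canonical field order, descriptions of this kind can be tokenized or vectorized by standard text mining tools, which is the gap the abstract describes. Branching architectures would need additional structure that this sketch does not attempt.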

IEEE Transactions on Artificial Intelligence
Jiancheng Lv
Dean and professor of Computer Science of Sichuan University

My research interests include natural language processing, computer vision, industrial intelligence, smart medicine and smart cultural creation.