MobyDeep: A lightweight CNN architecture to configure models for text classification
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/4094
EDITED VERSION: https://linkinghub.elsevier.com/retrieve/pii/S0950705122010073
UNESCO SUBJECT: 5701.02 Automated Documentation ; 1203.04 Artificial Intelligence
DOCUMENT TYPE: article
Current trends in deep learning for text classification favour complex models built to handle very large datasets. Deeper models are usually based on cutting-edge neural network architectures and generally achieve good results, but they demand more capable hardware than shallow ones. In this work, a new Convolutional Neural Network (CNN) architecture for text classification tasks, MobyDeep, is proposed. Designed as a configurable tool, the resulting models (MobyNets) can manage large corpora at low computational cost. To achieve this, the architecture was conceived to produce lightweight models whose internal layers are based on a newly proposed convolutional block. That block was designed by adapting ideas from image processing to text processing, helping to shrink model sizes and to reduce computational costs. The architecture was also designed as a residual network, covering complex functions by extending models up to 28 layers. Moreover, the middle layers were optimized through residual connections, which made it possible to remove the fully connected layers on top, yielding a fully convolutional network. Corpora were chosen from the recent literature to define realistic scenarios for comparing configured MobyDeep models with other state-of-the-art works. Three models were configured, with 8, 16 and 28 layers respectively, offering competitive accuracy results.
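The abstract describes lightweight convolutional blocks adapted from image processing, combined with residual connections. A common way to obtain such cheap blocks for sequences is a depthwise separable 1-D convolution (a per-channel convolution followed by a 1x1 channel-mixing step) wrapped in a residual connection. The sketch below illustrates that general idea only; the function name, shapes, and activation are assumptions for illustration, not the block defined in the MobyDeep paper.

```python
import numpy as np

def depthwise_separable_conv1d_block(x, dw_kernels, pw_weights):
    """Hypothetical residual block in the spirit of a lightweight CNN for text.

    x          : (channels, length) embedded input sequence
    dw_kernels : (channels, k) one small 1-D kernel per channel (depthwise step)
    pw_weights : (channels, channels) 1x1 mixing matrix (pointwise step)
    Assumes an odd kernel size k so 'same' padding preserves the length.
    """
    channels, length = x.shape
    k = dw_kernels.shape[1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))

    # Depthwise step: each channel is convolved with its own kernel,
    # costing channels*k parameters instead of channels*channels*k.
    dw = np.empty_like(x)
    for ch in range(channels):
        dw[ch] = np.convolve(xp[ch], dw_kernels[ch], mode="valid")[:length]

    # Pointwise step: a 1x1 convolution mixes information across channels.
    pw = pw_weights @ dw

    # Residual connection plus ReLU keeps deep stacks trainable.
    return np.maximum(pw + x, 0.0)
```

A standard convolution over `channels` channels with kernel size `k` needs `channels * channels * k` weights per layer; the separable form needs only `channels * k + channels * channels`, which is one way such architectures keep model sizes small as depth grows.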