MobyDeep: A lightweight CNN architecture to configure models for text classification
DATE:
2022-12
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/4094
EDITED VERSION: https://linkinghub.elsevier.com/retrieve/pii/S0950705122010073
DOCUMENT TYPE: article
ABSTRACT
Nowadays, trends in deep learning for text classification favor increasingly complex models to deal with huge datasets. Deeper models are usually based on cutting-edge neural network architectures and achieve good results in general, but they demand more powerful hardware than shallow ones. In this work, a new Convolutional Neural Network (CNN) architecture (MobyDeep) for text classification tasks is proposed. Designed as a configurable tool, the architecture yields models (MobyNets) able to handle large corpora at low computational cost. To achieve this, the architecture produces lightweight models whose internal layers are based on a newly proposed convolutional block. This block adapts ideas from image processing to text processing, helping to squeeze model sizes and reduce computational costs. The architecture was also designed as a residual network, covering complex functions by extending models up to 28 layers. Moreover, the middle layers were optimized with residual connections, making it possible to remove the fully connected layers on top and resulting in a fully convolutional network. Corpora were chosen from the recent literature to define realistic scenarios for comparing configured MobyDeep models with other state-of-the-art works. Three models were configured with 8, 16, and 28 layers respectively, offering competitive accuracy results.
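
The abstract does not detail the block's internals, so the following PyTorch sketch is an illustration only: it assumes the "ideas adapted from image processing" are a MobileNet-style depthwise-separable 1D convolution, combined with the residual shortcuts and the fully convolutional head (global pooling instead of dense layers) that the abstract does describe. All class names, hyperparameters, and the depthwise-separable choice are assumptions, not the paper's confirmed design.

```python
import torch
import torch.nn as nn


class SeparableConvBlock(nn.Module):
    """Hypothetical residual block: depthwise-separable 1D convolution
    (assumed here; the paper's actual block is not specified in the abstract)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Depthwise conv: one filter per channel (groups=channels),
        # which is what shrinks parameter counts versus a standard conv.
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=padding, groups=channels)
        # Pointwise 1x1 conv mixes information across channels.
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)
        self.norm = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual (shortcut) connection, as described in the abstract.
        return self.act(self.norm(self.pointwise(self.depthwise(x))) + x)


class FullyConvClassifier(nn.Module):
    """Fully convolutional classifier: a 1x1 conv head plus global
    average pooling replaces fully connected layers on top."""

    def __init__(self, vocab_size: int, embed_dim: int = 128,
                 num_blocks: int = 8, num_classes: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.blocks = nn.Sequential(
            *[SeparableConvBlock(embed_dim) for _ in range(num_blocks)])
        self.head = nn.Conv1d(embed_dim, num_classes, kernel_size=1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens).transpose(1, 2)  # (batch, channels, seq)
        x = self.blocks(x)
        return self.head(x).mean(dim=2)         # global average pooling


# Example: an 8-block configuration, loosely mirroring the smallest model.
model = FullyConvClassifier(vocab_size=30000, num_blocks=8)
logits = model(torch.randint(0, 30000, (2, 256)))  # -> shape (2, 4)
```

Stacking more `SeparableConvBlock` layers (e.g., 16 or 28) is how depth would be configured in this sketch, while the depthwise factorization and the pooled convolutional head keep the model lightweight.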