An HMM-based synthetic view generator to improve the efficiency of ensemble systems
DATE: 2019-12-24
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/6120
PUBLISHER'S VERSION: https://academic.oup.com/jigpal/article/28/1/4/5686231
UNESCO SUBJECT: 1203.17 Informatics
DOCUMENT TYPE: article
ABSTRACT
One of the most active areas of research in semi-supervised learning is the study of methods for constructing good ensembles of classifiers. Ensemble systems create multiple models and then combine them, and they usually produce more accurate solutions than a single model would. In particular, multi-view ensemble systems improve the accuracy of text classification because they optimize the functions to exploit different views of the same input data. However, although multi-view approaches are more promising than single-view ones, document datasets often have no natural multiple views available. This study proposes an algorithm that generates a synthetic view from a standard text dataset: the new view is derived from the standard bag-of-words representation using an algorithm based on hidden Markov models (HMMs). To evaluate its effectiveness, the proposed HMM-based synthetic view generation method was integrated into a co-training ensemble system and tested on four text corpora: Reuters, 20 Newsgroups, TREC Genomics and OHSUMED. The results are promising, showing a significant increase in the efficiency of the ensemble system compared to a single-view approach.
Files in this item
- Name: 2020_borrajo_hmm_based_synthet ...
- Size: 1.259Mb
- Format: PDF
- Description: Accepted manuscript