RT Journal Article
T1 Classification of full text biomedical documents: sections importance assessment
A1 Gonçalves, Carlos Adriano de Oliveira
A1 Sousa Ferreira da Silva, Rui Carlos Camacho de
A1 Gonçalves, Célia Talma
A1 Seara Vieira, Adrián
A1 Borrajo Diz, Maria Lourdes
A1 Lorenzo Iglesias, Eva Maria
K1 1203.04 Artificial Intelligence
K1 5701.02 Automated Documentation
K1 5701.04 Computational Linguistics
AB The exponential growth of documents on the web makes it very hard for researchers to stay aware of the relevant work being done within the scientific community. The task of efficiently retrieving information has therefore become an important research topic. The objective of this study is to test how the efficiency of text classification changes when different weights are assigned beforehand to the sections that compose the documents. The proposal takes into account the place (section) where terms are located in the document, and each section has a weight that can be modified depending on the corpus. To carry out the study, an extended version of the OHSUMED corpus with full documents has been created. Using WEKA, we compared the use of abstracts only with that of full texts, and evaluated combinations of section weights to assess their significance in the scientific article classification process. Classification used SMO (Sequential Minimal Optimization), the WEKA implementation of the Support Vector Machine (SVM) algorithm. The experimental results show that the proposed combinations of preprocessing techniques and feature selection achieve promising results for the task of full text scientific document classification. We also have evidence to conclude that datasets enriched with text from certain sections achieve better results than those using only titles and abstracts.
PB Applied Sciences
SN 20763417
YR 2021
FD 2021-03-17
LK http://hdl.handle.net/11093/2058
UL http://hdl.handle.net/11093/2058
LA eng
NO Applied Sciences, 11(6): 2674 (2021)
NO Xunta de Galicia | Ref. ED431C2018 / 55
DS Investigo
RD 19-may-2025