RT Journal Article
T1 A novel gluten knowledge base of potential biomedical and health-related interactions extracted from the literature: using machine learning and graph analysis methodologies to reconstruct the bibliome
A1 Perez Perez, Martín
A1 Ferreira, Tânia
A1 Igrejas, Gilberto
A1 Fernández Riverola, Florentino
K1 3304.99 Other
K1 3206.10 Nutritional Diseases
K1 1203.17 Informatics
AB Background: Despite their nutritional properties and broad availability, cereal crops have been associated with various alimentary disorders and symptoms, with most of the responsibility attributed to gluten. Gluten-related literature therefore continues to be produced at ever-growing rates, driven in part by recent exploratory studies that link gluten to non-traditional diseases and by the popularity of gluten-free diets, making it increasingly difficult to access and analyse practical and structured information. The accelerated discovery of novel advances in diagnosis and treatment, together with exploratory studies, creates a favourable scenario for disinformation and misinformation. Objectives: Aligned with the European Union strategy “Delivering on EU Food Safety and Nutrition in 2050”, which emphasizes the inextricable links between imbalanced diets, the increased exposure to unreliable sources of information and misleading information, and the increased dependency on reliable sources of information, this paper presents GlutKNOIS, a public and interactive literature-based database that reconstructs and represents the experimental biomedical knowledge extracted from the gluten-related literature.
The platform integrates knowledge from external databases, bibliometric statistics and social media discussion to offer a novel and enhanced way to search, visualise and analyse potential biomedical and health-related interactions in the gluten domain. Methods: The study applies a semi-supervised curation workflow that combines natural language processing techniques, machine learning algorithms, ontology-based normalization and integration approaches, named entity recognition methods, and graph knowledge reconstruction methodologies to process, classify, represent and analyse the experimental findings contained in the literature, complemented by data from the social discussion. Results and conclusions: 5814 documents were manually annotated and 7424 were fully automatically processed to reconstruct the first online gluten-related knowledge database of evidenced health-related interactions that produce health or metabolic changes, based on the literature. In addition, the automatic processing of the literature, combined with the proposed knowledge representation methodologies, has the potential to assist in the review and analysis of years of gluten research. The reconstructed knowledge base is public and accessible at https://sing-group.org/glutknois/
PB Journal of Biomedical Informatics
SN 15320464
YR 2023
FD 2023-07
LK http://hdl.handle.net/11093/4898
UL http://hdl.handle.net/11093/4898
LA eng
NO Journal of Biomedical Informatics, 143, 104398 (2023)
NO Fundação para a Ciência e a Tecnologia | Ref. UIDB/50006/2020
DS Investigo
RD 14-oct-2024