
dc.contributor.author: Doval Mosquera, Yerai
dc.contributor.author: Vilares Ferro, Manuel
dc.contributor.author: Vilares Ferro, Jesús
dc.date.accessioned: 2023-12-11T08:34:24Z
dc.date.available: 2023-12-11T08:34:24Z
dc.date.issued: 2018-12
dc.identifier.citation: Expert Systems with Applications, 113, 213-222 (2018) [spa]
dc.identifier.issn: 09574174
dc.identifier.uri: http://hdl.handle.net/11093/5483
dc.description.abstract: User–generated content published on microblogging social networks constitutes a priceless source of information. However, microtexts usually deviate from the standard lexical and grammatical rules of the language, thus making their processing by traditional intelligent systems very difficult. As an answer, microtext normalization consists in transforming those non–standard microtexts into standard well–written texts as a preprocessing step, allowing traditional approaches to continue with their usual processing. Given the importance of phonetic phenomena in non–standard text formation, an essential element of the knowledge base of a normalizer would be the phonetic rules that encode these phenomena, which can be found in the so–called phonetic algorithms. In this work we experiment with a wide range of phonetic algorithms for the English language. The aim of this study is to determine the best phonetic algorithms within the context of candidate generation for microtext normalization. In other words, we intend to find those algorithms that, taking as input non–standard terms to be normalized, allow us to obtain as output the smallest possible sets of normalization candidates which still contain the corresponding target standard words. As will be shown, the choice of the phonetic algorithm will depend heavily on the capabilities of the candidate selection mechanism which we usually find at the end of a microtext normalization pipeline. The faster it can make the right choices among big enough sets of candidates, the more we can sacrifice on the precision of the phonetic algorithms in favour of coverage in order to increase the overall performance of the normalization system. [en]
dc.description.sponsorship: Agencia Estatal de Investigación | Ref. TIN2017-85160-C2-1-R [spa]
dc.description.sponsorship: Agencia Estatal de Investigación | Ref. TIN2017-85160-C2-2-R [spa]
dc.description.sponsorship: Ministerio de Economía y Competitividad | Ref. FFI2014-51978-C2-1-R [spa]
dc.description.sponsorship: Ministerio de Economía y Competitividad | Ref. FFI2014-51978-C2-2-R [spa]
dc.description.sponsorship: Xunta de Galicia | Ref. ED431D-2017/12 [spa]
dc.description.sponsorship: Xunta de Galicia | Ref. ED431B2017/01 [spa]
dc.description.sponsorship: Xunta de Galicia | Ref. ED431D R2016/046 [spa]
dc.description.sponsorship: Ministerio de Economía y Competitividad | Ref. BES-2015-073768 [spa]
dc.language.iso: eng [spa]
dc.publisher: Expert Systems with Applications [spa]
dc.relation: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-85160-C2-1-R/ES/AVANCES EN NUEVOS SISTEMAS DE EXTRACCION DE RESPUESTAS CON ANALISIS SEMANTICO Y APRENDIZAJE PROFUNDO
dc.relation: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-85160-C2-2-R/ES/AVANCES EN NUEVOS SISTEMAS DE EXTRACCION DE RESPUESTAS CON ANALISIS SEMANTICO Y APRENDIZAJE PROFUNDO
dc.relation: info:eu-repo/grantAgreement/MINECO//FFI2014-51978-C2-1-R/ES/TECNOLOGIAS DE LA LENGUA PARA ANALISIS DE OPINIONES EN REDES SOCIALES
dc.relation: info:eu-repo/grantAgreement/MINECO//FFI2014-51978-C2-2-R/ES/TECNOLOGIAS DE LA LENGUA PARA ANALISIS DE OPINIONES EN REDES SOCIALES: DEL TEXTO AL MICROTEXTO
dc.rights: Attribution-NonCommercial-NoDerivs 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: On the performance of phonetic algorithms in microtext normalization [en]
dc.type: article [spa]
dc.rights.accessRights: openAccess [spa]
dc.identifier.doi: 10.1016/j.eswa.2018.07.016
dc.identifier.editor: https://linkinghub.elsevier.com/retrieve/pii/S0957417418304305 [spa]
dc.publisher.departamento: Informática [spa]
dc.publisher.grupoinvestigacion: COmputational LEarning [spa]
dc.subject.unesco: 3304.99 Otras [spa]
dc.date.updated: 2023-12-05T11:03:42Z
dc.computerCitation: pub_title=Expert Systems with Applications|volume=113|journal_number=|start_pag=213|end_pag=222 [spa]

