Interpretable classification of Wiki-review streams
DATE:
2023-12-13
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/6643
PUBLISHED VERSION: https://ieeexplore.ieee.org/document/10356073/
UNESCO SUBJECT: 3325 Telecommunications Technology ; 6308 Social Communications ; 1203.17 Informatics
DOCUMENT TYPE: article
ABSTRACT
Wiki articles are created and maintained by a crowd of editors, producing a continuous stream
of reviews. Reviews can take the form of additions, reverts, or both. This crowdsourcing model is exposed
to manipulation since neither reviews nor editors are automatically screened and purged. To protect articles
against vandalism or damage, the stream of reviews can be mined to classify reviews and profile editors in
real-time. The goal of this work is to anticipate and explain which reviews to revert. This way, editors are
informed why their edits will be reverted. The proposed method employs stream-based processing, updating
the profiling and classification models on each incoming event. The profiling uses side and content-based
features employing Natural Language Processing, and editor profiles are incrementally updated based on
their reviews. Since the proposed method relies on self-explainable classification algorithms, it is possible
to understand why a review has been classified as a revert or a non-revert. In addition, this work contributes
an algorithm for generating synthetic data for class balancing, making the final classification fairer. The
proposed online method was tested with a real data set from Wikivoyage, which was balanced through the
aforementioned synthetic data generation. The results attained values near 90% for all evaluation metrics
(accuracy, precision, recall, and F-measure).
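
As an illustration of the incremental editor profiling described above, the following is a minimal sketch and not the authors' implementation. The feature set (review count, revert rate, mean review length) and the names EditorProfile and process_stream are assumptions made for this example; the abstract does not list the actual features. Each event is assumed to be an (editor, text, reverted) tuple.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EditorProfile:
    """Running statistics for one editor (hypothetical feature set)."""
    n_reviews: int = 0
    n_reverted: int = 0
    mean_chars: float = 0.0  # running mean of review length in characters

    def update(self, text: str, reverted: bool) -> None:
        # Incremental mean: m_n = m_(n-1) + (x_n - m_(n-1)) / n,
        # so no past reviews need to be stored.
        self.n_reviews += 1
        self.n_reverted += int(reverted)
        self.mean_chars += (len(text) - self.mean_chars) / self.n_reviews

    @property
    def revert_rate(self) -> float:
        return self.n_reverted / self.n_reviews if self.n_reviews else 0.0


def process_stream(events):
    """Process (editor, text, reverted) events one at a time.

    Returns the per-event feature snapshots and the final profiles.
    Each snapshot is taken BEFORE the profile is updated with the
    current event, so it only reflects the editor's past reviews.
    """
    profiles = defaultdict(EditorProfile)
    snapshots = []
    for editor, text, reverted in events:
        profile = profiles[editor]
        snapshots.append({
            "n_reviews": profile.n_reviews,
            "revert_rate": profile.revert_rate,
            "mean_chars": profile.mean_chars,
        })
        profile.update(text, reverted)
    return snapshots, profiles


if __name__ == "__main__":
    # Toy events, invented for the example.
    events = [
        ("ana", "abcd", False),
        ("ana", "ab", True),
        ("bob", "xyz", True),
    ]
    snapshots, profiles = process_stream(events)
    for snap in snapshots:
        print(snap)
    print(profiles["ana"])
```

Taking the snapshot before the update mirrors a test-then-train ordering: an online classifier fed with these features would only see information that was available before the current review's label became known.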