A data-driven score model to assess online news articles in event-based surveillance system

Syed Mehtab Alam, Arsevska Elena, Roche Mathieu, Teisseire Maguelonne. 2022. A data-driven score model to assess online news articles in event-based surveillance system. In: Information management and big data. Communications in Computer and Information Science book series (CCIS, volume 1577), Springer. Lossio-Ventura Juan Antonio (ed.), Valverde-Rebaza Jorge (ed.), Díaz Eduardo (ed.), Muñante Denisse (ed.), Gavidia-Calderon Carlos (ed.), Demétrius Baria Valejo Alan (ed.), Alatrista-Salas Hugo (ed.). Cham: Springer, 264-280. (Communications in Computer and Information Science (CCIS), 1577) ISBN 978-3-031-04446-5 Annual International Conference on Information Management and Big Data (SIMBig 2021). 8, 1 December 2021/3 December 2021.

Conference paper with proceedings
Published version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
ID600948.pdf (5MB)

Online first version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
Mehtab_Alam_Syed_et_al_SIMBIG2021.pdf (1MB)

Abstract: Online news sources are popular resources for learning about current health situations and developing event-based surveillance (EBS) systems. However, having access to diverse information originating from multiple sources can misinform stakeholders, eventually leading to false health risks. The existing literature contains several techniques for performing data quality evaluation to minimize the effects of misleading information. However, these methods only rely on the extraction of spatiotemporal information for representing health events. To address this research gap, a score-based technique is proposed to quantify the data quality of online news articles through three assessment measures: 1) news article metadata, 2) content analysis, and 3) epidemiological entity extraction with NLP to weight the contextual information. The results are calculated using classification metrics with two evaluation approaches: 1) a strict approach and 2) a flexible approach. The obtained results show significant enhancement in the data quality by filtering irrelevant news, which can potentially reduce false alert generation in EBS systems.

Free keywords: Text Mining, Natural language processing, Data quality, Event-based surveillance, Epidemic intelligence, Avian-influenza

European funding agencies: European Commission

Funded projects: (EU) MOnitoring Outbreak events for Disease surveillance in a data science context

Source: Cirad-Agritrop (https://agritrop.cirad.fr/600948/)
