Valentin Sarah, Lancelot Renaud, Roche Mathieu. 2021. Identifying associations between epidemiological entities in news data for animal disease surveillance. Artificial Intelligence in Agriculture, 5 : 163-174.
Published version
- English
Abstract: Event-based surveillance systems are at the crossroads of human and animal (and plant and ecosystem) health, epidemiology, statistics, and informatics. Thus, their deployment faces many challenges specific to each domain and their intersections, such as relations among automation, artificial intelligence, and expertise. In this context, our work pertains to the extraction of epidemiological events from textual data (i.e. news) by unsupervised methods. We define the event extraction task as detecting pairs of epidemiological entities (e.g. a disease name and a location). The quality of the ranked lists of pairs was evaluated using specific ranking evaluation metrics. We used a publicly available annotated corpus of 438 documents (i.e. news articles) related to animal disease events. The statistical approach was able to detect event-related pairs of epidemiological features with a good trade-off between precision and recall. Our results showed that using a window of words outperformed document-based and sentence-based approaches, while reducing the probability of detecting false pairs. Our results indicated that Mutual Information was less suited than the Dice coefficient for ranking pairs of features in the event extraction framework. We believe that Mutual Information would be more relevant for rare pair detection (i.e. weak signals), but requires more manual curation to avoid extracting false positive pairs. Moreover, generalising the spatial features to the country level enabled better discrimination (i.e. ranking) of relevant disease-location pairs for event extraction.
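The window-based pair ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenised toy documents, the entity lexicons, the window size of 5 tokens, the document-level counting scheme, and the function names are all assumptions made for illustration, and pointwise mutual information is used here as a simple stand-in for the Mutual Information measure the abstract mentions.

```python
import math
from collections import Counter


def count_pairs(docs, diseases, locations, window=5):
    """Count documents containing each entity and each nearby disease-location pair.

    A pair is counted once per document if the two entities occur
    within `window` tokens of each other at least once (assumed scheme).
    """
    entity_df = Counter()  # number of documents containing each entity
    pair_df = Counter()    # number of documents where the pair co-occurs in a window
    for tokens in docs:
        d_pos = [(i, t) for i, t in enumerate(tokens) if t in diseases]
        l_pos = [(i, t) for i, t in enumerate(tokens) if t in locations]
        entity_df.update({t for _, t in d_pos} | {t for _, t in l_pos})
        pair_df.update({(d, l) for i, d in d_pos for j, l in l_pos
                        if abs(i - j) <= window})
    return len(docs), entity_df, pair_df


def dice(n_xy, n_x, n_y):
    """Dice coefficient: 2 * |X and Y| / (|X| + |Y|)."""
    return 2 * n_xy / (n_x + n_y)


def pmi(n_xy, n_x, n_y, n_docs):
    """Pointwise mutual information: log2(P(x, y) / (P(x) * P(y)))."""
    return math.log2(n_xy * n_docs / (n_x * n_y))


def rank_pairs(docs, diseases, locations, window=5):
    """Return (disease, location, dice, pmi) tuples sorted by decreasing Dice."""
    n_docs, entity_df, pair_df = count_pairs(docs, diseases, locations, window)
    rows = [(d, l,
             dice(n_xy, entity_df[d], entity_df[l]),
             pmi(n_xy, entity_df[d], entity_df[l], n_docs))
            for (d, l), n_xy in pair_df.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Hypothetical toy corpus and lexicons, invented for this example only.
docs = [
    "avian_flu outbreak reported in france this week".split(),
    "avian_flu cases confirmed in france by officials".split(),
    "spain officials met today to discuss trade rules while avian_flu spreads elsewhere".split(),
    "asf detected on farm in spain".split(),
]
diseases = {"avian_flu", "asf"}
locations = {"france", "spain"}

ranked = rank_pairs(docs, diseases, locations, window=5)
for disease, location, d_score, p_score in ranked:
    print(f"{disease}-{location}: dice={d_score:.3f} pmi={p_score:.3f}")
```

On this toy input, the pair avian_flu-spain is never produced because the two entities are more than 5 tokens apart in the only document where both appear, which is the kind of false pair a word window is meant to filter out. The Dice coefficient places the more frequent pair first, whereas the PMI score is higher for the pair involving the rarer entity; this behaviour of the toy example says nothing about the results reported in the paper.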
Agrovoc keywords: epidemiology, disease surveillance, text mining, animal diseases, animal health, data analysis, spatial data
Additional keywords: One Health, textual data
Free keywords: Animal disease surveillance, Text Mining, Event extraction, Epidemic intelligence, One Health
Agris classification: L73 - Animal diseases
U10 - Computer science, mathematics and statistics
C30 - Documentation and information
Cirad strategic field: CTS 4 (2019-) - Plant, animal and ecosystem health
Authors and affiliations
- Valentin Sarah, CIRAD-BIOS-UMR ASTRE (FRA) ORCID: 0000-0002-9028-681X
- Lancelot Renaud, CIRAD-BIOS-UMR ASTRE (FRA) ORCID: 0000-0002-5826-5242
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568 - corresponding author
Source: Cirad-Agritrop (https://agritrop.cirad.fr/598968/)