
New approach to discover meaningful terms to specify cause of death from narratives verbal autopsy using TF-IDF and the LDA topic model

Diouf Mansour, Thiam Mohammadou, Roche Mathieu. 2023. New approach to discover meaningful terms to specify cause of death from narratives verbal autopsy using TF-IDF and the LDA topic model. In: IEEE EUROCON 2023 - 20th International Conference on Smart Technologies. IEEE. New York: IEEE. ISBN 978-1-6654-6397-3. International Conference on Smart Technologies (IEEE EUROCON 2023). 20, Torino, Italy, 6 July 2023 to 8 July 2023.

Conference paper with proceedings
Published version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
Diouf_et_al_ EUROCON2023.pdf


Abstract: Due to a lack of coroners in some remote areas of the world, epidemiological researchers have created a database for collecting causes of death, called a verbal autopsy. The unstructured verbal autopsy (VA) narratives that are collected in this database are full of hidden knowledge about mortality. However, they are under-exploited due to inadequate processing mechanisms, or some of the computational techniques used are inappropriate for the data format. In this paper, we propose an unsupervised approach that is essentially based on a new algorithm for preprocessing such data. This is not only to address the challenge of topic extraction with the Latent Dirichlet Allocation (LDA) topic model in the context of data scarcity, but also to improve the exploitation of topics (causes of death). Experiments with the Population Health Metrics Research Consortium (PHMRC) data have demonstrated the validity of the approach and have led to the identification of reliable causes of death as well as the discovery of new ones.

Free keywords: Text Mining, Natural Language Processing, Topic Modeling

Authors and affiliations

  • Diouf Mansour, Université de Thiès (SEN)
  • Thiam Mohammadou, Université de Thiès (SEN)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568

Source: Cirad-Agritrop (https://agritrop.cirad.fr/610870/)

