Agritrop

SNEToolkit: Spatial named entities disambiguation toolkit

Kafando Rodrique, Decoupes Rémy, Roche Mathieu, Teisseire Maguelonne. 2023. SNEToolkit: Spatial named entities disambiguation toolkit. SoftwareX, 23:101480, 11 p.

Journal article ; Other type of article ; Journal article with impact factor ; Fully open-access journal
Published version - English
Licensed under a Creative Commons license.
Kafando_et_al_SoftwareX.pdf (2 MB)

Abstract: “Can you tell me where San Jose is located?” “Uh! Do you know that there are more than 1700 locations named San Jose in the world?” The official name of a location is often not the name with which we are familiar. Spatial named entity (SNE) disambiguation is the process of identifying and assigning precise coordinates to a place name that can be identified in a text. This task is not always straightforward, especially when the place name in question is ambiguous for various reasons. In this context, we are interested in the disambiguation of spatial named entities that can be identified in a textual document on a country level. The solution that we propose is based on a set of techniques that allow us to disambiguate the spatial entity considering the context in which it is mentioned from a certain number of characteristics that are specific to it. The solution uses as input a textual document and extricates the named entities identified therein while associating them with the correct coordinates. SNE disambiguation is designed to support the process of fast exploration of spatiotemporal data analysis, most often for event tracking. The proposed approach was tested on 1360 SNEs extracted from the GeoVirus dataset. The results show that SNEToolkit outperformed the baseline, the standard Geonames geocoder, with a recall value of 0.911 against a recall value of 0.871 for the baseline. A flexible Python package is provided for end users.
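
To make the idea of country-level, context-based disambiguation concrete, here is a minimal Python sketch. It is not SNEToolkit's algorithm or API: the `GAZETTEER` dictionary, the `disambiguate` function and the voting rule are hypothetical, and the coordinates are approximate values chosen only for illustration. The sketch assumes that place names with a single candidate in the same text hint at the country of the ambiguous ones.

```python
from collections import Counter

# Hypothetical mini-gazetteer: place name -> list of (country code, lat, lon).
# Coordinates are approximate and serve only as an illustration.
GAZETTEER = {
    "San Jose": [("US", 37.34, -121.89), ("CR", 9.93, -84.08)],
    "Cartago": [("CR", 9.86, -83.92)],
    "Alajuela": [("CR", 10.02, -84.21)],
    "Sacramento": [("US", 38.58, -121.49)],
}


def disambiguate(mentions):
    """Resolve each known place name to one candidate using country context."""
    # Step 1: names with a single candidate vote for their country.
    votes = Counter(
        GAZETTEER[m][0][0]
        for m in mentions
        if m in GAZETTEER and len(GAZETTEER[m]) == 1
    )
    resolved = {}
    for m in mentions:
        candidates = GAZETTEER.get(m, [])
        if not candidates:
            continue  # unknown name: leave it unresolved
        # Step 2: keep the candidate whose country received the most votes.
        # On a tie, max() returns the first candidate in gazetteer order.
        resolved[m] = max(candidates, key=lambda c: votes[c[0]])
    return resolved


if __name__ == "__main__":
    print(disambiguate(["San Jose", "Cartago", "Alajuela"]))
```

With the mentions "San Jose", "Cartago" and "Alajuela", the two unambiguous names vote for Costa Rica, so the Costa Rican candidate is kept for "San Jose". With "San Jose" and "Sacramento", the US candidate is kept instead.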

Agrovoc keywords: text mining, data analysis, spatial data

Free keywords: Text Mining, Spatial named entity, Disambiguation, Geocoding, Software

Agris classification: U10 - Computer science, mathematics and statistics
C30 - Documentation and information
B10 - Geography

Cirad strategic field: CTS 7 (2019-) - Outside strategic fields

European funding agencies: European Commission

Non-EU funding agencies: Agence Nationale de la Recherche

European funding programme: H2020

Funded projects: (EU) MOnitoring Outbreak events for Disease surveillance in a data science context, (FRA) Building epidemiological surveillance and prophylaxis with observations both near and distant

Authors and affiliations

  • Kafando Rodrique, INRAE (FRA)
  • Decoupes Rémy, INRAE (FRA)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
  • Teisseire Maguelonne, INRAE (FRA) - corresponding author

Source: Cirad-Agritrop (https://agritrop.cirad.fr/605738/)

