Syed Mehtab Alam, Arsevska Elena, Roche Mathieu, Teisseire Maguelonne. 2023. GeospatRE: Extraction and geocoding of spatial relation entities in textual documents. Cartography and Geographic Information Science, 17 p.
Published version
- English
Access restricted to Cirad staff. Use subject to authorization by the author or Cirad. GeospatRE extraction and geocoding of spatial relation entities in textual documents.pdf (13MB)
URL - dataset - Other repository: https://figshare.com/articles/journal_contribution/GeospatRE_extraction_and_geocoding_of_spatial_relation_entities_in_textual_documents/24680054
HCERES journal list (humanities and social sciences): yes
HCERES journal theme(s) (humanities and social sciences): Geography, Planning, Urbanism, Architecture
Abstract: Extracting spatial information from textual documents and geo-referencing it accurately are important steps in epidemiology, with applications such as outbreak detection and disease surveillance and control. However, inaccurate extraction of such geospatial information results in inaccurate location identification, which in turn may produce erroneous information for outbreak investigation and disease surveillance. One of the problems is the extraction of geospatial relations associated with spatial entities in text documents. To identify such geospatial relations, we categorized them into three major types: 1) Level-1, e.g. center, north, south; 2) Level-2, e.g. nearby, border; 3) Level-3, i.e. distance from spatial entities, e.g. 30 km, 20 miles, 100 m. This work introduces a novel approach for extracting and georeferencing spatial information from textual documents, so that geospatial relations associated with spatial entities are accurately identified and outbreak monitoring and disease surveillance are enhanced. We propose a two-step methodology: (i) extraction of geospatial relations associated with spatial entities, using a clause-based approach, and (ii) geo-referencing of these geospatial relations in order to identify the polygon regions, using a custom algorithm to slice or derive the geospatial relation regions from the place names and their geospatial relations. The first step is evaluated on a disease news article dataset containing event information, obtaining a precision of 0.9, a recall of 0.88 and an F-score of 0.88. The second step is assessed through a qualitative evaluation of the shapes by end-users, with promising results.
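As an illustration of the three relation levels named in the abstract, the following is a minimal sketch that assigns a phrase to Level-1, Level-2 or Level-3. It is not the authors' clause-based extraction method: the keyword sets contain only the examples quoted in the abstract, and the names `LEVEL_1`, `LEVEL_2`, `DISTANCE` and `classify_relation` are hypothetical, introduced here for illustration. Giving distance expressions priority over direction words is likewise an assumption of this sketch.

```python
import re
from typing import Optional

# Example terms taken from the abstract; the paper's full lexicon is not
# reproduced here, so these sets are illustrative only.
LEVEL_1 = {"center", "north", "south"}
LEVEL_2 = {"nearby", "border"}

# Level-3: a number followed by a distance unit (km, mile/miles, m).
DISTANCE = re.compile(r"\b\d+(?:\.\d+)?\s*(?:km|miles?|m)\b", re.IGNORECASE)


def classify_relation(phrase: str) -> Optional[str]:
    """Return 'Level-1', 'Level-2', 'Level-3', or None if no relation is found."""
    # Assumption of this sketch: a distance expression takes priority
    # over any direction or proximity keyword in the same phrase.
    if DISTANCE.search(phrase):
        return "Level-3"
    tokens = set(re.findall(r"[a-z]+", phrase.lower()))
    if tokens & LEVEL_1:
        return "Level-1"
    if tokens & LEVEL_2:
        return "Level-2"
    return None


if __name__ == "__main__":
    for text in ["north of Lyon", "nearby Montpellier", "30 km from Paris", "in Paris"]:
        print(text, "->", classify_relation(text))
```

A keyword lookup of this kind only labels the relation type; deriving the corresponding polygon region is the separate geo-referencing step described in the abstract and is not attempted here.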
Agrovoc keywords: epidemiological surveillance, epidemiology, geographic information system, methodology, tick-borne encephalitis
Free keywords: Natural Language Processing, Geospatial information, Geospatial relations, Geo-tagger, Geo-parser, Geo-referencing
European funding agencies: European Commission
Funded projects: (EU) MOnitoring Outbreak events for Disease surveillance in a data science context
Authors and affiliations
- Syed Mehtab Alam, CIRAD-ES-UMR TETIS (FRA)
- Arsevska Elena, CIRAD-BIOS-UMR ASTRE (FRA) ORCID: 0000-0002-6693-2316
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568 - corresponding author
- Teisseire Maguelonne, INRAE (FRA)
Source: Cirad-Agritrop (https://agritrop.cirad.fr/610767/)