Agritrop

GeospatRE: Extraction and geocoding of spatial relation entities in textual documents

Syed Mehtab Alam, Arsevska Elena, Roche Mathieu, Teisseire Maguelonne. 2023. GeospatRE: Extraction and geocoding of spatial relation entities in textual documents. Cartography and Geographic Information Science, 17 p.

Journal article; Research article; Journal article with impact factor
Published version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
GeospatRE extraction and geocoding of spatial relation entities in textual documents.pdf


URL - dataset - other repository: https://figshare.com/articles/journal_contribution/GeospatRE_extraction_and_geocoding_of_spatial_relation_entities_in_textual_documents/24680054

HCERES journal list (humanities and social sciences): yes

HCERES journal theme(s) (humanities and social sciences): Geography-Planning-Urbanism-Architecture

Abstract: Extracting spatial information from textual documents and geo-referencing it accurately are important steps in epidemiology, with applications such as outbreak detection and disease surveillance and control. However, inaccurate extraction of such geospatial information results in inaccurate location identification, which may in turn produce erroneous information for outbreak investigation and disease surveillance. One of the problems is the extraction of geospatial relations associated with spatial entities in text documents. To identify such geospatial relations, we categorized them into three major types: 1) Level-1, e.g. center, north, south; 2) Level-2, e.g. nearby, border; 3) Level-3, i.e. distance from spatial entities, e.g. 30 km, 20 miles, 100 m. This work introduces a novel approach for extracting and geo-referencing spatial information from textual documents, so that geospatial relations associated with spatial entities are identified accurately, to enhance outbreak monitoring and disease surveillance. We propose a two-step methodology: (i) extraction of geospatial relations associated with spatial entities, using a clause-based approach, and (ii) geo-referencing of these geospatial relations to identify the polygon regions, using a custom algorithm to slice or derive the geospatial relation regions from the place names and their geospatial relations. The first step is evaluated on a disease news article dataset containing event information, obtaining a precision of 0.9, a recall of 0.88 and an F-score of 0.88. The second step is evaluated qualitatively, through an assessment of the shapes by end-users. Promising results are obtained for the experiments in the second step.
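
As an illustration of the three relation levels named in the abstract, the sketch below tags relation terms in a sentence using a plain keyword and regular-expression lookup. This is not the clause-based extractor described in the paper: the function name `classify_relations`, the keyword sets (limited to the abstract's own examples) and the distance pattern are all assumptions made for illustration.

```python
import re

# Hypothetical keyword sets, restricted to the examples given in the abstract.
LEVEL_1 = {"center", "north", "south"}   # directional relations
LEVEL_2 = {"nearby", "border"}           # proximity relations
# Level-3: a number followed by a distance unit (km, mile(s), m).
DISTANCE = re.compile(r"\b\d+(?:\.\d+)?\s*(?:km|miles?|m)\b", re.IGNORECASE)


def classify_relations(text):
    """Return (level, matched_text) pairs; distances first, then keywords."""
    found = [(3, match.group(0)) for match in DISTANCE.finditer(text)]
    for token in re.findall(r"[A-Za-z]+", text.lower()):
        if token in LEVEL_1:
            found.append((1, token))
        elif token in LEVEL_2:
            found.append((2, token))
    return found


if __name__ == "__main__":
    print(classify_relations("30 km north of Paris"))
```

A keyword lookup like this ignores sentence structure, so it cannot tell which place name a relation attaches to; that attachment is the part the paper addresses with its clause-based step.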

Agrovoc keywords: epidemiological surveillance, epidemiology, geographic information system, methodology, tick-borne encephalitis

Free keywords: Natural Language Processing, Geospatial information, Geospatial relations, Geo-tagger, Geo-parser, Geo-referencing

European funding agencies: European Commission

Funded projects: (EU) MOnitoring Outbreak events for Disease surveillance in a data science context

Authors and affiliations

Source: Cirad-Agritrop (https://agritrop.cirad.fr/610767/)

