
Spatial information extraction from short messages

Zenasni Sarah, Kergosien Eric, Roche Mathieu, Teisseire Maguelonne. 2018. Spatial information extraction from short messages. Expert Systems with Applications, 95 : 351-367.

Journal article; Research article; Journal article with impact factor
Published version - English
Access restricted to Cirad staff
Use subject to authorization by the author or Cirad.
ESWA_ZENASNI_2018.pdf


Dataset URL - Dataverse Cirad: https://doi.org/10.18167/DVN1/0ZGJRC / Dataset URL - Dataverse Cirad: https://doi.org/10.18167/DVN1/LPY080

Quartile: Q1, Subject: OPERATIONS RESEARCH & MANAGEMENT SCIENCE / Quartile: Q1, Subject: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE / Quartile: Q1, Subject: ENGINEERING, ELECTRICAL & ELECTRONIC

HCERES journal list (humanities and social sciences): yes

HCERES journal theme(s) (humanities and social sciences): Economics and management

Abstract: Texts, in addition to maps and satellite images, have become an important spatial data resource in recent years. Electronic written texts used in mediated interactions, especially short messages, have triggered the emergence of new ways of writing. Extracting information from such short messages, which represent a rich source of information, is highly important in order to discover domain-relevant information in the text and facilitate information retrieval. However, short messages are hard to analyse because of their brief, unstructured and informal nature. This paper focuses on the kinds of special or unique spatial entities and relations that are contained in short messages. A new entity extraction method specifically dedicated to French short messages (SMS and tweets) is outlined to address this issue. The method is then tested on more traditional sources, such as newspaper texts. This work is crucial in order to take advantage of the vast amount of geographical knowledge expressed in heterogeneous unstructured data. Firstly, we propose a process in which new spatial entities are extracted (e.g. motpellier, montpelier, Montpel are associated with Montpellier). Secondly, we identify new spatial relations that precede spatial entities (e.g. sur, par). Finally, we propose general patterns for the extraction of spatial relations. The task is very challenging and complex due to the specificity of short message language, which is based on weakly standardized modes of writing. The experiments were carried out on three French corpora (i.e. 88milSMS, tweets, and Midi Libre) and highlight the efficiency of our proposal for identifying new kinds of spatial entities and relations.
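
The two ideas in the abstract (mapping spelling variants to a known place name, and spotting a relation word that precedes a spatial entity) can be illustrated with a minimal Python sketch. This is not the authors' method: the similarity measure is the standard library's `difflib.SequenceMatcher` ratio, chosen only for convenience, and the gazetteer, the relation list (beyond "sur" and "par", which the abstract cites), the 0.7 threshold and the function names are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical gazetteer; a real system would load a full list of place names.
GAZETTEER = ["Montpellier", "Marseille", "Paris"]

# "sur" and "par" are cited in the abstract; "vers" is an assumed extra entry.
RELATIONS = {"sur", "par", "vers"}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (difflib ratio, an assumed stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_place(token: str, threshold: float = 0.7):
    """Return the closest gazetteer entry, or None if it is below the threshold."""
    best = max(GAZETTEER, key=lambda place: similarity(token, place))
    return best if similarity(token, best) >= threshold else None


def find_relations(message: str):
    """Return (relation, place) pairs where a relation word directly
    precedes a token that matches a gazetteer entry."""
    tokens = re.findall(r"\w+", message)
    pairs = []
    for previous, current in zip(tokens, tokens[1:]):
        if previous.lower() in RELATIONS:
            place = match_place(current)
            if place is not None:
                pairs.append((previous.lower(), place))
    return pairs


if __name__ == "__main__":
    for variant in ["motpellier", "montpelier", "Montpel"]:
        print(variant, "->", match_place(variant))
    print(find_relations("je suis sur montpel demain"))
    # [('sur', 'Montpellier')]
```

A fixed threshold is the simplest possible decision rule; it trades recall on heavily abbreviated forms against false matches on ordinary words, so any real use would need tuning on annotated messages.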

Agrovoc keywords: cartography, spatial data

Free keywords: Text mining, Spatial entities, Spatial relations, Similarity measures, Short messages

Agris classification: C30 - Documentation and information
U30 - Research methods
000 - Other themes

Cirad strategic field: Outside strategic axes (2014-2018)

Authors and affiliations

  • Zenasni Sarah, CIRAD-ES-UMR TETIS (FRA) - corresponding author
  • Kergosien Eric, Université de Lille (FRA)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
  • Teisseire Maguelonne, IRSTEA (FRA)

Source: Cirad-Agritrop (https://agritrop.cirad.fr/586144/)
