Lopez Cédric, Zenasni Sarah, Kergosien Eric, Partalas Ioannis, Roche Mathieu, Teisseire Maguelonne, Panckhurst Rachel.
2018. Extracting absolute spatial entities from SMS: Comparing a supervised and an unsupervised approach.
In: Language and the new (instant) media. Cougnon Louise-Amélie (ed.), De Cock Barbara (ed.), Fairon Cédrick (ed.)
Published version
- English
Access restricted to Cirad staff. Use subject to authorisation from the author or Cirad. ID588683.pdf (524 kB)
Dataset URL (Dataverse Cirad): https://doi.org/10.18167/DVN1/0ZGJRC
Abstract: More than one hundred thousand SMS messages are sent worldwide every second, and each SMS message is likely to contain lexical creativity. Recently, SMS content has been recognised to be of notable interest in many domains, such as e-commerce or psychiatry and more generally Health Informatics. But the automatic analysis of such data is difficult, particularly when dealing with information extraction. In this study, we will focus on “spatial entity recognition”, which consists of recognising countries, cities, places, bars, restaurants, cinemas, beaches, and so forth. For instance, Montpel, mtpl, mtp, and motpeliè all stand for the city of Montpellier. We will compare two different ways of tackling new forms of spatial entity recognition in SMS.
Free keywords: Text mining, Spatial entities, Machine learning methods, Natural language processing
Agris classification: C30 - Documentation and information
U10 - Computer science, mathematics and statistics
C10 - Education
Cirad strategic field: Outside strategic lines (2014-2018)
Authors and affiliations
- Lopez Cédric, VISEO (FRA)
- Zenasni Sarah, CIRAD-ES-UMR TETIS (FRA)
- Kergosien Eric, Université de Lille (FRA)
- Partalas Ioannis, VISEO (FRA)
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
- Teisseire Maguelonne, LIRMM (FRA)
- Panckhurst Rachel, CNRS (FRA)
Other links for this publication
Source: Cirad-Agritrop (https://agritrop.cirad.fr/588683/)