Khiari Wejdene, Bouhafs Hafsia Asma, Roche Mathieu.
2016. Integration of lexical and semantic knowledge for sentiment analysis in SMS.
In: LREC 2016 Proceedings. Calzolari Nicoletta (ed.), Choukri Khalid (ed.), Declerck Thierry (ed.), Goggi Sara (ed.), Grobelnik Marko (ed.), Maegaard Bente (ed.), Mariani Joseph (ed.), Mazo Hélène (ed.), Moreno Asuncion (ed.), Odijk Jan (ed.), Piperidis Stelios (ed.). ELRA, ELDA, ILC
Published version
- English
Use subject to authorization from the author or Cirad. LREC16_1_free.pdf (171kB)
URL - dataset - other repository: http://88milsms.huma-num.fr
Abstract: With the explosive growth of online social media (forums, blogs, and social networks), exploitation of these new information sources has become essential. Our work is based on the sud4science project. The goal of this project is to perform multidisciplinary work on a corpus of authentic SMS, in French, collected in 2011 and anonymised (88milSMS corpus: http://88milsms.huma-num.fr). This paper highlights a new method to integrate opinion detection knowledge from an SMS corpus by combining lexical and semantic information. More precisely, our approach gives more weight to words with a sentiment (i.e. presence of words in a dedicated dictionary) for a classification task based on three classes: positive, negative, and neutral. The experiments were conducted on two corpora: an elongated SMS corpus (i.e. repetitions of characters in messages) and a non-elongated SMS corpus. We noted that non-elongated SMS were much better classified than elongated SMS. Overall, this study highlighted that the integration of semantic knowledge always improves classification.
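The two ideas named in the abstract (splitting messages into elongated and non-elongated ones, and giving more weight to words found in a sentiment dictionary) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon, the threshold of three repeated letters, the `boost` factor, and the function names `is_elongated` and `weighted_features` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; the dedicated dictionary used in the paper is not reproduced here.
SENTIMENT_LEXICON = {"super", "content", "triste", "nul"}

# Assumption: a message counts as "elongated" if the same letter occurs
# three or more times in a row (French words may legitimately double a letter).
ELONGATION = re.compile(r"([^\W\d_])\1{2,}")


def is_elongated(message: str) -> bool:
    """Return True if the message contains a letter repeated 3+ times in a row."""
    return ELONGATION.search(message) is not None


def weighted_features(message: str, lexicon=SENTIMENT_LEXICON, boost: float = 2.0) -> dict:
    """Bag-of-words counts where tokens found in the lexicon are multiplied by `boost`."""
    tokens = re.findall(r"\w+", message.lower())
    counts = Counter(tokens)
    return {tok: n * (boost if tok in lexicon else 1.0) for tok, n in counts.items()}
```

The resulting feature dictionaries could then be passed to any three-class classifier (positive, negative, neutral); the choice of classifier is left open here because the abstract does not specify one.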
Free keywords: Text mining, Sentiment analysis, Natural language processing, SMS
Agris classification: C30 - Documentation and information
U10 - Computer science, mathematics and statistics
U30 - Research methods
Authors and affiliations
- Khiari Wejdene, ESC (TUN)
- Bouhafs Hafsia Asma, Université de Carthage (TUN)
- Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568
Source: Cirad-Agritrop (https://agritrop.cirad.fr/580889/)