Integration of lexical and semantic knowledge for sentiment analysis in SMS

Khiari Wejdene, Bouhafs Hafsia Asma, Roche Mathieu. 2016. Integration of lexical and semantic knowledge for sentiment analysis in SMS. In: LREC 2016 Proceedings. Calzolari Nicoletta (ed.), Choukri Khalid (ed.), Declerck Thierry (ed.), Goggi Sara (ed.), Grobelnik Marko (ed.), Maegaard Bente (ed.), Mariani Joseph (ed.), Mazo Hélène (ed.), Moreno Asuncion (ed.), Odijk Jan (ed.), Piperidis Stelios (ed.). ELRA, ELDA, ILC. Portoroz: ELRA, pp. 1185-1189. ISBN 978-2-9517408-9-1. International Conference on Language Resources and Evaluation (LREC 2016). 10, Portoroz, Slovenia, 23 May 2016/28 May 2016.

Paper with proceedings
Published version - English
Use under authorization by the author or CIRAD.

URL - dataset:

Abstract: With the explosive growth of online social media (forums, blogs, and social networks), exploitation of these new information sources has become essential. Our work is based on the sud4science project. The goal of this project is to perform multidisciplinary work on a corpus of authentic SMS, in French, collected in 2011 and anonymised (88milSMS corpus). This paper highlights a new method to integrate opinion detection knowledge from an SMS corpus by combining lexical and semantic information. More precisely, our approach gives more weight to words with a sentiment (i.e. presence of words in a dedicated dictionary) for a classification task based on three classes: positive, negative, and neutral. The experiments were conducted on two corpora: an elongated SMS corpus (i.e. repetitions of characters in messages) and a non-elongated SMS corpus. We noted that non-elongated SMS were much better classified than elongated SMS. Overall, this study highlighted that the integration of semantic knowledge always improves classification. (Author's abstract)
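The two ideas named in the abstract, detecting elongated messages and giving extra weight to words found in a sentiment dictionary, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon, the boost factor of 2.0, the three-character elongation threshold, and all function names are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; the dictionary used in the paper is not given here.
SENTIMENT_LEXICON = {"super", "génial", "nul", "triste"}

# Assumption: a run of three or more identical characters marks an elongation,
# e.g. "suuuper".
ELONGATION = re.compile(r"(.)\1{2,}")


def is_elongated(message: str) -> bool:
    """Return True if the message contains a run of 3+ identical characters."""
    return ELONGATION.search(message) is not None


def normalize(token: str) -> str:
    """Collapse each run of 3+ identical characters to a single character.

    This is a simplification: it would damage a word whose correct spelling
    has a double letter at the elongated position.
    """
    return ELONGATION.sub(r"\1", token)


def weighted_bag_of_words(message: str, boost: float = 2.0) -> dict:
    """Count normalized tokens and multiply the count of lexicon words by `boost`."""
    tokens = re.findall(r"\w+", message.lower())
    counts = Counter(normalize(t) for t in tokens)
    return {
        word: count * (boost if word in SENTIMENT_LEXICON else 1.0)
        for word, count in counts.items()
    }


features = weighted_bag_of_words("C'est suuuper super nul")
```

The resulting dictionary could serve as the feature vector for any three-class (positive, negative, neutral) classifier; which classifier and weighting scheme the paper actually uses is not stated in this record.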

Free keywords: Text mining, Sentiment analysis, Natural language processing, SMS

Agris classification: C30 - Documentation and information
U10 - Computer science, mathematics and statistics
U30 - Research methods

Authors and affiliations

  • Khiari Wejdene, ESC (TUN)
  • Bouhafs Hafsia Asma, Université de Carthage (TUN)
  • Roche Mathieu, CIRAD-ES-UMR TETIS (FRA) ORCID: 0000-0003-3272-8568

Source: Cirad-Agritrop
