LIP6 CNRS Sorbonne Université Tremplin Carnot Interfaces

RAFRAFI Abdelhalim

PhD graduated
Team : MLIA
Departure date : 12/31/2013
Supervision : Patrick GALLINARI
Co-supervision : GUIGUE Vincent

Sentiment classification on the Web 2.0

The Internet has become an essential medium in everyday life: we use it to check the news, do our shopping, form our opinions, and share our feelings and feedback on our experiences. This activity generates a large amount of data about our personalities and lifestyles. Faced with this volume of information, we are quickly overwhelmed. "It seems that the overload of information gives a sense of emptiness." (translated from a French quotation by Jean-Pierre April). Automated filtering and analysis tools are therefore required to make the information accessible to everybody. In this general context, our work focuses on sentiment analysis, and on sentiment classification in particular.
Classical algorithms for text categorization such as SVM, NB, PLSA or LDA show several limitations for sentiment analysis. These limitations stem from the particular nature of the task: sentiment classification requires taking into account the structure of the text (negations, for instance), and modeling the lexical field alone is not sufficient to understand user messages. However, accounting for text structure requires complex representations and/or algorithms that scale poorly. The second challenge is to optimize classifiers in a large functional space (to describe sentiments effectively) while preserving generality. Indeed, we would like to handle documents on various topics gathered from different media (Twitter, blogs, reviews...).
We investigated many solutions to tackle these conflicting objectives simultaneously. First, we focused on regularized formulations adapted to sentiment classification in order to perform efficient feature selection in N-gram space. Then we explored an orthogonal research direction: given a basic classifier, we simply increased the size of the training set, using the Web 2.0 as an infinite source of labeled data. Finally, we tried to combine the advantages of both solutions using an original neural network architecture.
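As a small illustration of why lexical information alone can be insufficient for sentiment classification, the sketch below compares unigram and bigram counts for two invented sentences with opposite polarity. The sentences and the helper `ngrams` are hypothetical and chosen for this example only; they are not taken from the thesis and do not reproduce its models.

```python
from collections import Counter


def ngrams(text, n):
    """Count the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Two invented reviews with opposite sentiment but the same vocabulary.
positive = "this phone is good not bad"
negative = "this phone is bad not good"

# A bag of words (unigrams) cannot tell the two reviews apart.
print(ngrams(positive, 1) == ngrams(negative, 1))  # True

# Bigrams keep the local word order around the negation and do differ.
print(ngrams(positive, 2) == ngrams(negative, 2))  # False
```

The price of the richer representation is a much larger feature space, which is one reason feature selection in N-gram space becomes relevant.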
Defence : 12/20/2013 - 10h - Site Jussieu 25-26/105
Jury members :
Tellier Isabelle - Université Paris 3 [Rapportrice]
Paroubek Patrick - Université Paris Sud [Rapporteur]
Gallinari Patrick - Université Paris 6
Guigue Vincent - Université Paris 6
Gouttas Catherine - Thales Communications&Security
Bennani Younes - Université Paris 13
Marsala Christophe - Université Paris 6

