Project OSCAR

Opinion Stream Classification with Ensembles and Active leaRners

How to learn sentiment in a changing environment with limited user feedback

Funding: DFG 2017-2019

Motivation

“What other people think” has always been an important piece of information for our decision-making. The Internet and the Web now allow us to find answers to this question beyond the circle of our personal acquaintances. Traditional sentiment mining techniques focus on static data. However, as opinions accumulate in social streams, changes may occur: in the general sentiment towards a subject or towards specific facets of that subject, and in the words used to express sentiment. The subjects themselves also change over time. In OSCAR, we develop opinion stream mining methods that deal with change and adapt the learned models continuously.

Challenges & Highlights

The first part of OSCAR leverages stream mining methods to deal with changes in the vocabulary, i.e. in the feature space. A change in the feature space means that a model built upon the old words must be updated. We will accumulate information on the usage and sentiment of each word to highlight the long-term interplay between word polarity and document polarity. Second, we will work on reducing the need for labeled documents. To this end, we will develop active learning methods that learn and adapt polarity models on an evolving feature space. Third, we will work on dealing with different types of change simultaneously. For this purpose, we will use ensembles: we will dedicate some ensemble members to the identification of topic trends, others to changes in the vocabulary, and others to temporal changes, including periodic ones.
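The first two ideas above (per-word accumulation over a growing vocabulary, and asking for labels only when needed) can be illustrated with a minimal sketch. It is not OSCAR's method: a multinomial Naive Bayes model stands in for the polarity learner, uncertainty sampling stands in for the active learning strategy, and all names (StreamingPolarityModel, process_stream, margin, oracle) are hypothetical.

```python
import math
from collections import defaultdict


class StreamingPolarityModel:
    """Illustrative multinomial Naive Bayes over an open-ended vocabulary."""

    LABELS = ("pos", "neg")

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.word_counts = {label: defaultdict(int) for label in self.LABELS}
        self.total_words = {label: 0 for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}
        self.vocabulary = set()

    def update(self, words, label):
        """Accumulate per-word usage; unseen words simply extend the vocabulary."""
        self.doc_counts[label] += 1
        for word in words:
            self.word_counts[label][word] += 1
            self.total_words[label] += 1
            self.vocabulary.add(word)

    def prob_positive(self, words):
        """Estimated probability that a document is positive."""
        n_docs = sum(self.doc_counts.values())
        if n_docs == 0:
            return 0.5  # nothing learned yet: maximally uncertain
        vocab_size = len(self.vocabulary) + 1  # one extra slot for unseen words
        log_scores = {}
        for label in self.LABELS:
            prior = (self.doc_counts[label] + self.smoothing) / (
                n_docs + len(self.LABELS) * self.smoothing
            )
            score = math.log(prior)
            denom = self.total_words[label] + self.smoothing * vocab_size
            for word in words:
                count = self.word_counts[label].get(word, 0)
                score += math.log((count + self.smoothing) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long documents.
        top = max(log_scores.values())
        pos = math.exp(log_scores["pos"] - top)
        neg = math.exp(log_scores["neg"] - top)
        return pos / (pos + neg)

    def word_polarity(self, word):
        """Accumulated polarity of a word in [-1, 1]; 0.0 if never seen."""
        pos = self.word_counts["pos"].get(word, 0)
        neg = self.word_counts["neg"].get(word, 0)
        if pos + neg == 0:
            return 0.0
        return (pos - neg) / (pos + neg)


def process_stream(model, stream, oracle, margin=0.2):
    """Query the oracle only for documents the model is unsure about.

    Returns the number of labels requested.
    """
    queries = 0
    for words in stream:
        if abs(model.prob_positive(words) - 0.5) < margin:
            model.update(words, oracle(words))
            queries += 1
    return queries
```

In this sketch, `oracle` plays the role of the human expert, and `margin` controls how often the expert is consulted: a smaller margin means fewer label requests. Handling several kinds of change at once, as the ensemble part of the project intends, is outside its scope.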

Potential applications & future issues

The output of OSCAR will be a complete framework, encompassing active ensemble learning methods that deal with different forms of change and learn with limited expert involvement. Such a framework can be used in other stream classification tasks, beyond sentiment analysis.

(core) Team

Leibniz University Hannover & L3S Research Center

  • Prof. Dr. Eirini Ntoutsi
  • Damianos Melidis

Otto von Guericke University Magdeburg

  • Prof. Dr. Myra Spiliopoulou
  • Vishnu Unnikrishnan

Related publications

  • D. Melidis, M. Spiliopoulou, E. Ntoutsi, "Learning under Feature Drifts in Textual Streams", CIKM, 2018 (accepted).
  • C. Blake, E. Ntoutsi, "Reinforcement Learning Based Decision Tree Induction over Data Streams with Concept Drifts", IEEE ICBK, 2018 (accepted).
  • V. Unnikrishnan, C. Beyer, U. Niemann, P. Matuszyk, R. Pryss, W. Schlee, E. Ntoutsi, M. Spiliopoulou, "Entity-Level Stream Classification: Exploiting Entity Similarity to Label the Next Observation Referring to an Entity", IEEE DSAA, 2018.
  • D. Melidis, A. Veizaga Campero, V. Iosifidis, E. Ntoutsi, M. Spiliopoulou, "Enriching Lexicons with Ephemeral Words for Sentiment Analysis in Social Streams", WIMS, Novi Sad, Serbia, 2018 (accepted).
  • C. Beyer, U. Niemann, V. Unnikrishnan, E. Ntoutsi, M. Spiliopoulou, "Predicting Document Polarities on a Stream without Reading their Contents", SAC Data Streams track, Pau, France, 2018 (accepted).
  • V. Iosifidis, E. Ntoutsi, "Large scale sentiment annotation with limited labels", KDD, Halifax, Canada, 2017.
  • V. Iosifidis, E. Ntoutsi, "Sentiment Classification over Opinionated Data Streams through Informed Model Adaptation", TPDL, Thessaloniki, Greece, 2017.