Project OSCAR

Opinion Stream Classification with Ensembles and Active leaRners

How to learn sentiment in a changing environment with limited user feedback

Funding: DFG 2017-2019

Motivation

“What other people think” has always been an important input to our decision-making. The Web now lets us find answers to this question far beyond the circle of our personal acquaintances. Traditional sentiment mining techniques focus on static data. However, as opinions accumulate in social streams, changes can occur: the general sentiment towards a subject or towards specific facets of it may shift, and so may the words used to express sentiment. The subjects themselves also change over time. In OSCAR, we develop opinion stream mining methods that deal with such change and adapt the learned models continuously.

Challenges & Highlights

The first part of OSCAR leverages stream mining methods to deal with vocabulary/feature changes. A change in the feature space means that a model built upon the old words must be updated. We will accumulate information on the usage and sentiment of each word to highlight the long-term interplay between word polarity and document polarity. Second, we will work on reducing the need for labeled documents. To this end, we will develop active learning methods that learn and adapt polarity models on an evolving feature space. Third, we will work on dealing with different types of change simultaneously. For this purpose, we will use ensembles: some ensemble members will be dedicated to identifying topic trends, others to changes in the vocabulary, and others to temporal changes, including periodic ones.
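
To make the second point concrete, the following is a minimal illustrative sketch, not the methods developed in OSCAR. It assumes a multinomial Naive Bayes polarity model whose vocabulary grows as new words arrive, combined with simple margin-based uncertainty sampling under a label budget. All names (StreamingNaiveBayes, run_stream, oracle, margin, budget) are hypothetical and chosen for illustration only; the ensemble component is not covered.

```python
import math
from collections import defaultdict


class StreamingNaiveBayes:
    """Multinomial Naive Bayes whose vocabulary grows with the stream (illustrative)."""

    def __init__(self, classes=("neg", "pos"), alpha=1.0):
        self.classes = classes
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.doc_counts = {c: 0 for c in classes}
        self.word_counts = {c: defaultdict(int) for c in classes}
        self.total_words = {c: 0 for c in classes}
        self.vocab = set()

    def update(self, tokens, label):
        """Incorporate one labeled document; unseen words extend the feature space."""
        self.doc_counts[label] += 1
        for t in tokens:
            self.vocab.add(t)
            self.word_counts[label][t] += 1
            self.total_words[label] += 1

    def predict_proba(self, tokens):
        """Return a dict mapping each class to its posterior probability."""
        n_docs = sum(self.doc_counts.values())
        v = max(len(self.vocab), 1)
        log_scores = {}
        for c in self.classes:
            # Smoothed class prior
            score = math.log(
                (self.doc_counts[c] + self.alpha)
                / (n_docs + self.alpha * len(self.classes))
            )
            denom = self.total_words[c] + self.alpha * v
            for t in tokens:
                if t in self.vocab:  # words never seen so far carry no evidence yet
                    score += math.log(
                        (self.word_counts[c].get(t, 0) + self.alpha) / denom
                    )
            log_scores[c] = score
        # Normalize in a numerically stable way
        m = max(log_scores.values())
        exp = {c: math.exp(s - m) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}


def run_stream(stream, oracle, margin=0.2, budget=None):
    """Predict each document, then ask the oracle for a label only when uncertain."""
    model = StreamingNaiveBayes()
    predictions, queried = [], 0
    for tokens in stream:
        proba = model.predict_proba(tokens)
        predictions.append(max(proba, key=proba.get))
        uncertain = abs(proba["pos"] - proba["neg"]) < margin
        if uncertain and (budget is None or queried < budget):
            model.update(tokens, oracle(tokens))  # the label request costs budget
            queried += 1
    return model, predictions, queried
```

Here `oracle` stands in for the human expert who supplies a polarity label on request. A larger `margin` requests labels more often, while `budget` caps the total number of requests.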

Potential applications & future issues

The output of OSCAR will be a complete framework, encompassing active ensemble learning methods that deal with different forms of change and learn with limited expert involvement. Such a framework can be used in other stream classification tasks, beyond sentiment analysis.

(core) Team

Leibniz University Hannover & L3S Research Center

  • Prof. Dr. Eirini Ntoutsi
  • M.Sc. Damianos Melidis
  • M.Sc. Amir Abolfazli
  • M.Sc. Vasileios Iosifidis

Otto von Guericke University Magdeburg

Related publications

  • V. Iosifidis, E. Ntoutsi, "AdaFair: Cumulative fairness adaptive boosting", In Proc. of the 28th ACM Int. Conf. on Information and Knowledge Management (CIKM), 2019, Beijing, China. [local copy][bib]
  • V. Iosifidis, E. Ntoutsi, "Sentiment Analysis on Big Sparse Data Streams with Limited Labels", Knowledge and Information Systems (KAIS) journal, 2019. [local copy][bib]
  • T. Le Quy, W. Nejdl, M. Spiliopoulou, E. Ntoutsi, "A Neighborhood-augmented LSTM Model for Taxi-Passenger Demand Prediction", Workshop on Multiple-aspect Analysis of Semantic Trajectories (MASTER 2019), co-located with ECML PKDD 2019. [local copy]
  • V. Unnikrishnan, C. Beyer, U. Niemann, P. Matuszyk, R. Pryss, W. Schlee, E. Ntoutsi, M. Spiliopoulou, "Entity-level stream classification: exploiting entity similarity to label the future observations referring to an entity", International Journal of Data Science and Analytics, 1-15, 2019. [local copy][bib]
  • C. Beyer, V. Unnikrishnan, U. Niemann, P. Matuszyk, E. Ntoutsi, M. Spiliopoulou, "Exploiting Entity Information for Stream Classification over a Stream of Reviews", ACM SAC Data Streams track, 2019.
  • V. Iosifidis, T. N. Han Tran, E. Ntoutsi, "Fairness-enhancing interventions in stream classification", In Proc. of the 30th Int. Conf. on Database and Expert Systems Applications (DEXA), 2019, Linz, Austria. [local copy][bib]
  • P. Fafalios, V. Iosifidis, K. Stefanidis, E. Ntoutsi, "Tracking the History and Evolution of Entities: Entity-centric Temporal Analysis of Large Social Media Archives", Springer International Journal on Digital Libraries (IJDL), 1-13, 2018. [local copy] [bib]
  • P. Fafalios, V. Iosifidis, E. Ntoutsi, S. Dietze, "TweetsKB: A public and large-scale RDF corpus of annotated tweets", In Proc. of the 2018 European Semantic Web Conference (ESWC), 177-190. Springer, 2018. [local copy] [bib]
  • D. Melidis, M. Spiliopoulou, E. Ntoutsi, "Learning under Feature Drifts in Textual Streams", CIKM, 2018. [local copy][bib]
  • C. Blake, E. Ntoutsi, "Reinforcement Learning Based Decision Tree Induction over Data Streams with Concept Drifts", IEEE ICBK, 2018. Best Student Paper Award. [local copy] [bib]
  • V. Unnikrishnan, C. Beyer, U. Niemann, P. Matuszyk, R. Pryss, W. Schlee, E. Ntoutsi and M. Spiliopoulou, "Entity-Level Stream Classification: Exploiting Entity Similarity to Label the Next Observation Referring to an Entity", IEEE DSAA, 2018. [local copy][bib]
  • D. Melidis, A. Veizaga Campero, V. Iosifidis, E. Ntoutsi, M. Spiliopoulou, "Enriching Lexicons with Ephemeral Words for Sentiment Analysis in Social Streams", WIMS, Novi Sad, Serbia, 2018. [local copy]
  • C. Beyer, U. Niemann, V. Unnikrishnan, E. Ntoutsi, M. Spiliopoulou, "Predicting Document Polarities on a Stream without Reading their Contents", SAC Data Streams track, Pau, France, 2018. [local copy] [bib]
  • V. Iosifidis, E. Ntoutsi, "Large scale sentiment annotation with limited labels", KDD, Halifax, Canada, 2017. [local copy][bib]
  • V. Iosifidis, A. Oelschlager, E. Ntoutsi, "Sentiment Classification over Opinionated Data Streams through Informed Model Adaptation", TPDL, Thessaloniki, Greece, 2017. [local copy]

Other material

Poster