OSCAR: Opinion Stream Classification with Ensembles and Active leaRners
1. Project details
- Project full name: Opinion Stream Classification with Ensembles and Active leaRners
- Project acronym: OSCAR
- Funding period: 2017 - 2019
- Funding body: German Research Foundation (Deutsche Forschungsgemeinschaft - DFG)
2. Team
Leibniz University Hannover & L3S Research Center
- Prof. Dr. Eirini Ntoutsi
- M.Sc. Damianos Melidis
- M.Sc. Amir Abolfazli
- M.Sc. Vasileios Iosifidis
- M.Sc. Philip Naumann
Otto von Guericke University Magdeburg
- Prof. Dr. Myra Spiliopoulou
- M.Sc. Vishnu Unnikrishnan
3. Project overview
Motivation
“What other people think” has always been an important input to our decision-making. The Internet and the Web now let us find answers to this question beyond the circle of our personal acquaintances. Traditional sentiment mining techniques focus on static data. However, as opinions accumulate in social streams, changes can occur: the general sentiment towards a subject or towards specific facets of it may shift, and so may the words used to express sentiment. Subjects themselves also change over time. In OSCAR, we develop opinion stream mining methods that deal with such change and adapt the learned models continuously.
Challenges & Highlights
The first part of OSCAR leverages stream mining methods to deal with changes in the vocabulary, i.e. in the feature space. A change in the feature space means that the model built upon the old words must be updated. We will accumulate information on the usage and sentiment of each word to highlight the long-term interplay between word polarity and document polarity. Second, we will work on reducing the need for labeled documents. To this end, we will develop active learning methods that learn and adapt polarity models on an evolving feature space. Third, we will work on dealing with different types of change simultaneously. For this purpose, we will use ensembles: some ensemble members will be dedicated to identifying topic trends, others to changes in the vocabulary, and others to temporal changes, including periodic ones.
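As a rough illustration of the first two ideas (accumulating per-word sentiment statistics over an open vocabulary, and asking for labels only when the model is unsure), the following minimal Python sketch may help. It is not the method developed in OSCAR: the class name `WordPolarityTracker`, the smoothed log-odds score, and the `margin` threshold are all assumptions chosen for brevity.

```python
import math
from collections import defaultdict


class WordPolarityTracker:
    """Hypothetical sketch: per-word polarity counts over a document stream."""

    def __init__(self):
        # word -> [count in negative documents, count in positive documents]
        # New words simply get a fresh entry, so the vocabulary can grow.
        self.counts = defaultdict(lambda: [0, 0])

    def update(self, tokens, label):
        """Accumulate one labeled document (label: 1 = positive, 0 = negative)."""
        for word in set(tokens):
            self.counts[word][label] += 1

    def word_polarity(self, word):
        """Laplace-smoothed log-odds of the word; 0.0 for unseen words."""
        neg, pos = self.counts.get(word, (0, 0))
        return math.log((pos + 1) / (neg + 1))

    def score(self, tokens):
        """Document polarity as the sum of the polarities of its distinct words."""
        return sum(self.word_polarity(w) for w in set(tokens))

    def predict(self, tokens):
        return 1 if self.score(tokens) > 0 else 0

    def needs_label(self, tokens, margin=0.5):
        """Uncertainty-based query rule (assumed): ask an expert when the
        document score is close to the decision boundary."""
        return abs(self.score(tokens)) < margin


tracker = WordPolarityTracker()
tracker.update(["great", "phone"], 1)
tracker.update(["awful", "phone"], 0)

print(tracker.predict(["great", "screen"]))      # "screen" is new to the vocabulary
print(tracker.needs_label(["phone", "screen"]))  # no polar evidence, so a label is requested
```

In a stream setting, `update` would be called only for documents whose label was actually obtained, for example those for which `needs_label` returned true. Forgetting outdated word statistics, which matters under drift, is left out of this sketch.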
Potential applications & future issues
The output of OSCAR will be a complete framework, encompassing active ensemble learning methods that deal with different forms of change and learn with limited expert involvement. Such a framework can be used in other stream classification tasks, beyond sentiment analysis.
4. Publications
- Iosifidis, V., & Ntoutsi, E. (2019, November). AdaFair: Cumulative fairness adaptive boosting. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 781-790).
- Iosifidis, V., & Ntoutsi, E. (2020). Sentiment analysis on big sparse data streams with limited labels. Knowledge and Information Systems, 62(4), 1393-1432.
- Le Quy, T., Nejdl, W., Spiliopoulou, M., & Ntoutsi, E. (2019, September). A neighborhood-augmented LSTM model for taxi-passenger demand prediction. In International Workshop on Multiple-Aspect Analysis of Semantic Trajectories (pp. 100-116). Springer, Cham.
- Unnikrishnan, V., Beyer, C., Matuszyk, P., Niemann, U., Pryss, R., Schlee, W., … & Spiliopoulou, M. (2020). Entity-level stream classification: exploiting entity similarity to label the future observations referring to an entity. International Journal of Data Science and Analytics, 9(1), 1-15.
- Beyer, C., Unnikrishnan, V., Niemann, U., Matuszyk, P., Ntoutsi, E., & Spiliopoulou, M. (2019, April). Exploiting entity information for stream classification over a stream of reviews. In Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing (pp. 564-573).
- Iosifidis, V., Tran, T. N. H., & Ntoutsi, E. (2019, August). Fairness-enhancing interventions in stream classification. In International Conference on Database and Expert Systems Applications (pp. 261-276). Springer, Cham.
- Fafalios, P., Iosifidis, V., Stefanidis, K., & Ntoutsi, E. (2020). Tracking the history and evolution of entities: entity-centric temporal analysis of large social media archives. International Journal on Digital Libraries, 21(1), 5-17.
- Fafalios, P., Iosifidis, V., Ntoutsi, E., & Dietze, S. (2018, June). TweetsKB: A public and large-scale RDF corpus of annotated tweets. In European Semantic Web Conference (pp. 177-190). Springer, Cham.
- Melidis, D. P., Spiliopoulou, M., & Ntoutsi, E. (2018, October). Learning under feature drifts in textual streams. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 527-536).
- Blake, C., & Ntoutsi, E. (2018, November). Reinforcement learning based decision tree induction over data streams with concept drifts. In 2018 IEEE International Conference on Big Knowledge (ICBK) (pp. 328-335). IEEE. (Best Student Paper Award)
- Melidis, D. P., Campero, A. V., Iosifidis, V., Ntoutsi, E., & Spiliopoulou, M. (2018, June). Enriching lexicons with ephemeral words for sentiment analysis in social streams. In Proceedings of the 8th international conference on web intelligence, mining and semantics (pp. 1-8).
- Beyer, C., Niemann, U., Unnikrishnan, V., Ntoutsi, E., & Spiliopoulou, M. (2018). Predicting document polarities on a stream without reading their contents. In Proceedings of the Symposium on Applied Computing (SAC).
- Iosifidis, V., & Ntoutsi, E. (2017, August). Large scale sentiment learning with limited labels. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1823-1832).
- Iosifidis, V., Oelschlager, A., & Ntoutsi, E. (2017, September). Sentiment classification over opinionated data streams through informed model adaptation. In International conference on theory and practice of digital libraries (pp. 369-381). Springer, Cham.