Multilingual Media Summarization and Sentiment Analysis

Project description

News aggregators gather thousands of news articles every day from across the world and cluster them into news stories, each comprising a large number of articles about the same event. The volume is further increased by the growing amount of information in social media, where mass opinions about news events can be monitored. The language technologies known as multi-document text summarisation and sentiment analysis offer a promising way to reduce this bulk of highly redundant data. A major problem is that news articles are written in many languages, while current technology has mostly dealt with English; the question of how current research can be applied in a heavily multilingual context has barely been addressed.

Most multi-document summarisation research has focused on the news domain, and MediaGist will do likewise in order to build on existing techniques and resources. Summarising posts in social media, which represent complementary and unbiased information, will be considered as well. Media professionals, however, will want to go beyond summaries from sources in one language and consider how news events are reported in other countries and from other perspectives. Identifying differences in opinion towards entities and events may provide some clues as to the disagreements in reporting across languages. MediaGist will perform multilingual sentiment analysis and will thus make it possible to generate summaries that reveal these disagreements.

The goal of MediaGist is to make significant advances in multilingual research so as to extract and present the GIST (the main content and opinions) of online multilingual news and the corresponding content in social media.

Recipient: University of West Bohemia, Pilsen, Czech Republic
Funded by: European Commission
Programme: FP7 People Programme - Marie Curie Actions - Career Integration Grants
Project: MediaGist, no. 630786
Duration: 03-2014 - 02-2018

Publications

Gu, Y., Anderson, A., Steinberger, J., Celli, F., Strapparava, C. and Poesio, M. (2014): Using Brain Data for Sentiment Analysis. In: Journal for Language Technology and Computational Linguistics 29(1), pages 79-94, ISSN 2190-6858.
Brychcín, T., Konkol, M. and Steinberger, J. (2014): UWB: Machine Learning Approach to Aspect-Based Sentiment Analysis. In: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pages 817–822, ACL, Dublin, Ireland. ISBN 978-1-941643-24-2.
Steinberger, J., Brychcín, T. and Konkol, M. (2014): Aspect-Level Sentiment Analysis in Czech. In: Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 24-30, ACL, Baltimore, USA. ISBN 978-1-941643-11-2.
Habernal, I., Ptáček, T. and Steinberger, J. (2014): Supervised Sentiment Analysis in Czech Social Media. In: Information Processing & Management 50(5), pages 693–707, ISSN 0306-4573, DOI: 10.1016/j.ipm.2014.05.001, Elsevier.
Steinberger, J., Tanev, H., Zavarella, V., Steinberger, R. and Turchi, M. (2014): Aspects of Multilingual News Summarisation. In: Alessandro Fiori (ed.): Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding, Advances in Data Mining and Database Management series, pages 277-294, ISSN 2327-1981, ISBN 978-1-4666-5019-0, DOI: 10.4018/978-1-4666-5019-0.ch012, IGI Global.
Giannakopoulos, G., Kubina, J., Conroy, J., Steinberger, J., Favre, B., Kabadjov, M., Kruschwitz, U. and Poesio, M. (2015): MultiLing 2015: Multilingual Summarization of Single and Multi-Documents, On-line Fora, and Call-center Conversations. In: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SigDial'15), pages 270-274. Prague, Czech Republic, ACL, ISBN 978-1-941643-75-4.
Steinberger, J. and Tanev, H. (2015): Towards Multilingual Event Extraction Evaluation: A Case Study for the Czech Language. In: Proceedings of the 10th International Conference Recent Advances in Natural Language Processing (RANLP'15), pages 254-260. Hissar, Bulgaria, Incoma Ltd., ISSN 1313-8502.
Kabadjov, M., Steinberger, J., Barker, E., Kruschwitz, U. and Poesio, M. (2015): OnForumS: The Shared Task on Online Forum Summarisation at MultiLing’15. In: Proceedings of the Forum for Information Retrieval Evaluation (FIRE 2015), pages 21-26. ACM. ISBN 978-1-4503-4004-5.
Krejzl, P., Steinberger, J., Hercig, T. and Brychcín, T. (2016): Online Forum Summarization. In: Proceedings of Data a znalosti 2015, pages 85-88. VSB Ostrava. ISBN 978-80-248-3824-3.
Steinberger, J., Kabadjov, M. and Poesio, M. (2016): Coreference Applications to Summarization. To appear in: Massimo Poesio, Roland Stuckardt and Yannick Versley (eds.), Anaphora Resolution: Algorithms, Resources, and Evaluation, Chapter 15, Springer, ISBN 978-3-662-47908-7.
Kabadjov, M., Kruschwitz, U., Poesio, M., Steinberger, J. and Zaragoza, H. (2016): The OnForumS corpus from the Shared Task on Online Forum Summarisation at MultiLing 2015. To appear in: Proceedings of the 10th Language Resources and Evaluation Conference, Portorož, Slovenia, May 2016.
Josef Steinberger, jstein[at]kiv[dot]zcu[dot]cz

University of West Bohemia

Faculty of Applied Sciences

Department of Computer Science and Engineering

NTIS - New Technologies for the Information Society, Research Center

Supported by

European Commission

Seventh Framework Programme