Do we need Multimodality? Experiments with Tweets from European Union Executives
Author: Özdemir, Sina; Schwabl, Patrick
Language: English
Year of publication: 2022
DOI: 10.5281/zenodo.7308244
Description: Content analysis has always been one of the key methods in communication research, and advances in computational methods often deal with processing vast quantities of text. Yet communication rarely happens via a single modality. For example, one of the key political actors in Europe, the European Union, posts images in about 40% of its tweets (Özdemir & Rauh, 2022). Dictionary-based and shallow learning (SL) methods have a hard time incorporating multimodality into the analysis. Deep learning (DL) makes it possible to extend content analysis to multimodal materials, and previous studies have demonstrated the flexibility of embeddings for analyzing multimodal data (Li et al., 2022; Niu et al., 2019; Tseng et al., 2021; Wu & Mebane, 2022). In this paper, we evaluate, in a computational experiment, the feasibility of using multimodal DL embeddings to classify political messages that are delivered through a combination of visual and textual modalities. We build a series of unimodal SL models and multimodal DL embedding-based models to classify manually annotated tweets from European Union (EU) executives, and then compare the classification performance of these models. Our results indicate that multimodal signals are difficult to capture in a way that is meaningful to a classifier. Finally, we conclude with some recommendations for researchers who would like to use multimodal data in automated content analysis. Scripts can be found here: https://github.com/SinaOzdemir/ifkw_mmDL (a minimal illustrative sketch of the embedding approach follows the reference list below).

References:
Benoit, K., Watanabe, K., Wang, H., Nulty, P., Obeng, A., Müller, S., & Matsuo, A. (2018). quanteda: An R package for the quantitative analysis of textual data. Journal of Open Source Software, 3(30), 774. https://doi.org/10.21105/joss.00774
Chollet, F. (2015). Keras. GitHub. https://github.com/fchollet/keras
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs]. http://arxiv.org/abs/1810.04805
Kuhn, M. (2008). Building Predictive Models in R Using the caret Package. Journal of Statistical Software, 28(5). https://doi.org/10.18637/jss.v028.i05
Li, K., Zhang, Y., Li, K., Li, Y., & Fu, Y. (2022). Image-Text Embedding Learning via Visual and Textual Semantic Reasoning. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP. https://doi.org/10.1109/TPAMI.2022.3148470
Li, L. H., Yatskar, M., Yin, D., Hsieh, C.-J., & Chang, K.-W. (2019). VisualBERT: A Simple and Performant Baseline for Vision and Language. arXiv:1908.03557 [cs]. http://arxiv.org/abs/1908.03557
Niu, Y., Lu, Z., Wen, J.-R., Xiang, T., & Chang, S.-F. (2019). Multimodal Multi-Scale Deep Learning for Large-Scale Image Annotation. IEEE Transactions on Image Processing, 28(4), 1720–1731. https://doi.org/10.1109/TIP.2018.2881928
Özdemir, S., & Rauh, C. (2022). A Bird's Eye View: Supranational EU Actors on Twitter. Politics and Governance, 10(1), 133–145. https://doi.org/10.17645/pag.v10i1.4686
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. The Journal of Machine Learning Research, 12, 2825–2830.
Tseng, S.-Y., Narayanan, S., & Georgiou, P. (2021). Multimodal Embeddings From Language Models for Emotion Recognition in the Wild. IEEE Signal Processing Letters, 28, 608–612. https://doi.org/10.1109/LSP.2021.3065598
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Scao, T. L., Gugger, S., … Rush, A. M. (2020). HuggingFace's Transformers: State-of-the-art Natural Language Processing. arXiv:1910.03771 [cs]. http://arxiv.org/abs/1910.03771
Wu, P. Y., & Mebane, W. R. (2022). MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks. Computational Communication Research, 4(1). https://doi.org/10.5117/CCR2022.1.008.WU
Wu, Y., Kirillov, A., Massa, F., Lo, W.-Y., & Girshick, R. (2019). Detectron2. https://github.com/facebookresearch/detectron2
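To illustrate the embedding-based approach described above, here is a minimal sketch, not taken from the authors' scripts: it mean-pools BERT token embeddings obtained via HuggingFace Transformers (Devlin et al., 2019; Wolf et al., 2020) as the textual representation, concatenates them with placeholder image-feature vectors (a real pipeline would extract these with a vision model such as Detectron2; Wu et al., 2019), and fits a scikit-learn logistic regression (Pedregosa et al., 2011). The model names, the fusion-by-concatenation design, and the toy inputs are all assumptions made for illustration.

```python
# Minimal sketch of an early-fusion multimodal classifier: BERT text
# embeddings concatenated with image features, classified with
# scikit-learn. Illustrative only; not the authors' actual pipeline.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_model = AutoModel.from_pretrained("bert-base-uncased")

def embed_text(texts):
    """Return one mean-pooled BERT vector per tweet text."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = text_model(**enc)
    mask = enc["attention_mask"].unsqueeze(-1)          # (batch, seq, 1)
    summed = (out.last_hidden_state * mask).sum(dim=1)  # (batch, hidden)
    return (summed / mask.sum(dim=1)).numpy()           # mean over tokens

# Hypothetical toy data: two tweets, placeholder 512-dim image features,
# and binary manual annotations.
texts = ["The Commission proposes new climate targets.", "Happy Europe Day!"]
image_feats = np.random.default_rng(0).random((2, 512))
labels = [1, 0]

# Early fusion: concatenate text and image embeddings, then classify.
X = np.hstack([embed_text(texts), image_feats])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```

Concatenation is only the simplest fusion strategy; the multimodal embedding models compared in the paper may combine modalities differently, for example via joint vision-and-language encoders such as VisualBERT (Li et al., 2019) or MARMOT (Wu & Mebane, 2022).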
Database: OpenAIRE