This paper is available on arXiv under a CC 4.0 license.
Authors:
(1) Muzhaffar Hazman, University of Galway, Ireland;
(2) Susan McKeever, Technological University Dublin, Ireland;
(3) Josephine Griffith, University of Galway, Ireland.
6 Conclusion
In this work, we addressed the challenge of training multimodal meme sentiment classifiers on a limited number of labelled memes by incorporating unimodal sentiment analysis data. To do so, we proposed the first instance of STILT (Supplementary Training on Intermediate Labeled-data Tasks) that applies unimodal intermediate tasks to a multimodal target task. Specifically, we tested image-only and text-only sentiment classification as intermediate tasks in training a meme sentiment classifier. We showed that this approach works: the text-only intermediate task improved meme classification performance to a statistically significant degree. This novel approach allowed us to train a meme classifier that, with only 60% as many labelled meme samples, outperforms meme-only finetuning. As possible explanations for our observations, we discussed apparent similarities and differences in the roles of the image and text modalities between unimodal and multimodal sentiment analysis tasks.
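As an illustration of what a statistically significant difference between two classifiers evaluated on the same test memes can mean in practice, the following is a minimal sketch of an exact McNemar test on paired predictions. It is not taken from our experimental code: the function name `mcnemar_exact` and the counts are hypothetical and chosen only for illustration, and the sketch assumes that the comparison is summarised by the two discordant cells of a baseline-versus-STILT contingency table.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: test samples the baseline classified correctly and the STILT model did not.
    c: test samples the STILT model classified correctly and the baseline did not.
    (Both names are illustrative assumptions, not taken from the paper.)
    """
    n = b + c
    if n == 0:
        return 1.0  # the two models never disagree, so there is no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant sample is equally likely to
    # land in either cell, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, invented for illustration only.
p_value = mcnemar_exact(b=40, c=70)
print(f"p = {p_value:.4f}")
```

Only the samples on which the two models disagree enter the test; samples that both models classify correctly, or both misclassify, carry no information about which model is better.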
Acknowledgments
This work was conducted with the financial support of the Science Foundation Ireland Centre for Research Training in Digitally-Enhanced Reality (d-real) under Grant No. 18/CRT/6224; and the provision of computational facilities and support from the Irish Centre for High-End Computing (ICHEC).