UNESCO finds ‘pervasive’ gender bias in generative AI tools
Generative AI’s outputs still reflect considerable gender- and sexuality-based bias, associating feminine names with traditional gender roles, generating negative content about gay subjects and more, according to a new report from UNESCO’s International Research Centre on Artificial Intelligence.
The report, published today, draws on several individual studies of bias, which tested for associations between gendered names and careers and found that models frequently generated less positive responses to prompts related to LGBTQ+ individuals and women and assigned stereotyped professions to members of different genders and ethnic groups.
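To give a sense of what such an association probe can look like, here is a minimal sketch; it is not the report’s actual protocol. It samples completions from an off-the-shelf model (GPT-2 via Hugging Face’s transformers pipeline, chosen purely for illustration) for prompts built around feminine and masculine names and counts care-related versus career-related words in the output. The name lists and keyword sets are assumptions made up for this example.

```python
# Illustrative sketch only: a prompt-completion probe for name/career
# associations, loosely in the spirit of the tests the report describes.
# Model choice, names, and keyword lists are assumptions, not the study's protocol.
from collections import Counter

from transformers import pipeline  # Hugging Face Transformers

generator = pipeline("text-generation", model="gpt2")

NAMES = {"feminine": ["Mary", "Aisha", "Sofia"], "masculine": ["John", "Omar", "Luca"]}
CARE_WORDS = {"nurse", "teacher", "home", "family", "children"}
CAREER_WORDS = {"engineer", "doctor", "executive", "scientist", "lawyer"}

def probe(name: str, samples: int = 20) -> Counter:
    """Count care- vs. career-related words in sampled completions for one name."""
    counts = Counter()
    outputs = generator(
        f"{name} works as a",
        max_new_tokens=10,
        num_return_sequences=samples,
        do_sample=True,
        pad_token_id=50256,  # GPT-2's EOS token, silences the padding warning
    )
    for out in outputs:
        tokens = [t.strip(".,") for t in out["generated_text"].lower().split()]
        counts["care"] += sum(t in CARE_WORDS for t in tokens)
        counts["career"] += sum(t in CAREER_WORDS for t in tokens)
    return counts

for group, names in NAMES.items():
    totals = sum((probe(n) for n in names), Counter())
    print(group, dict(totals))
```

A real study would use far larger name lists, many more samples, and a validated scoring scheme rather than keyword counts, but the basic shape, prompt templates plus a comparison across demographic groups, is the same.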
The researchers identified three major categories of bias underlying generative AI technologies. The first is a data issue, in which a model sees too little training data from underrepresented groups or fails to account for differences in sex or ethnicity, which can lead to inaccuracies. The second stems from algorithm selection and can result in aggregation or learning bias; the classic example is an AI that ranks resumes from male job candidates as more desirable because of gender disparities already embedded in historical hiring data. Finally, the study identified deployment bias, where AI systems are applied in contexts different from the ones they were developed for, producing “improper” associations between psychiatric terms and specific ethnic groups or genders.
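The aggregation/learning-bias pattern behind the resume example is easy to reproduce on synthetic data. The sketch below is purely hypothetical and not drawn from the report: it trains a scikit-learn logistic regression on toy hiring decisions that carry a historical gender penalty, then shows the model scoring two otherwise identical candidates differently.

```python
# Toy illustration of aggregation/learning bias: a model trained on
# historically skewed hiring decisions reproduces the skew.
# All data here is synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
experience = rng.normal(5, 2, n)   # years of experience
gender = rng.integers(0, 2, n)     # 0 = female, 1 = male (toy encoding)

# Historical labels: driven by experience, but with an added gender bonus
# reflecting past discriminatory practice in the (synthetic) data.
hired = (experience + 1.5 * gender + rng.normal(0, 1, n)) > 6

X = np.column_stack([experience, gender])
model = LogisticRegression().fit(X, hired)

# Two candidates identical except for the gender feature:
candidates = np.array([[5.0, 0], [5.0, 1]])
print(model.predict_proba(candidates)[:, 1])  # the model has learned the disparity
```

Nothing in the pipeline is “broken” in an engineering sense; the model simply learns the pattern it was given, which is exactly why the report treats this as a distinct category from data gaps.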
Each form of bias present in the large language models (LLMs) underpinning modern AI systems reflects the texts on which those models are trained, the authors of the UNESCO report wrote in an introduction. Because those texts were written by humans, the models inevitably reflect human biases.
“Consequently, LLMs can reinforce stereotypes and biases against women and girls, [such as discriminatory] practices through biased AI recruitment tools, gender-biased decision-making in sectors like finance (where AI might influence credit scoring and loan approvals), or even medical or psychiatric misdiagnosis due to demographically biased models or norms,” they wrote.
The researchers noted that their study was not without its limitations, citing several challenges, including the shortcomings of implicit association tests, data contamination, deployment bias, language constraints, and the lack of intersectional analysis.