Casting criminals: a framework for evaluating demographic bias in AI-generated narratives


SHEIKHI G.

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2025
  • DOI: 10.1007/s10844-025-01016-5
  • Journal Name: JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
  • Indexed In: Science Citation Index Expanded (SCI-EXPANDED), Scopus, IBZ Online, ABI/INFORM, Compendex, INSPEC
  • Affiliated with Middle East Technical University Northern Cyprus Campus: Yes

Abstract

Despite advances in large language models (LLMs), current bias evaluation techniques often overlook subtle representational harms in narrative contexts. Existing embedding- and probability-based metrics fail to capture how models assign social roles in full-text generation, especially in high-stakes domains such as legal storytelling. This study proposes a framework for detecting covert demographic bias in AI-generated crime narratives by analysing character role attribution across multiple demographic dimensions. We evaluate two leading models, ChatGPT (gpt-4o) and Claude (claude-3.5-haiku), using 2,712 prompts involving four characters with distinct nationalities, religions, genders, and migration backgrounds. Demographic groups are drawn from 43 nationalities and 6 religions. Each model generates narratives and identifies the criminal character. Bias is quantified as deviations from baseline ratios representing the expected distribution of roles across demographic groups. Both models exhibit significant and consistent bias patterns: nationality bias is most pronounced, with Chinese characters assigned criminal roles 33% above baseline in Claude and 17% in ChatGPT. Claude further overrepresents women and foreign-born individuals as criminals, while ChatGPT favours natives and shows more gender balance. Religious group disparities are also evident. Beyond role attribution, lexical diversity analysis reveals that linguistic richness varies systematically by demographic category. These findings demonstrate that representational harm in LLM-generated narratives extends beyond explicit outcomes, underscoring the need for deeper linguistic scrutiny. The proposed framework offers both a methodological basis and a benchmark dataset for future research on discrimination in AI-generated storytelling.
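The abstract does not give the exact baseline formula, so the sketch below is only a minimal illustration of the kind of measurement it describes: it assumes the baseline ratio is each demographic group's share of character appearances and reports bias as the relative deviation of the group's observed share of criminal-role assignments from that baseline, alongside a simple type-token ratio as a lexical-diversity proxy. The record layout and the function names (`role_attribution_bias`, `type_token_ratio`) are hypothetical, not the authors' code.

```python
from collections import Counter

def role_attribution_bias(records):
    """Relative deviation of each group's criminal-role share from its
    appearance share (one possible reading of the paper's baseline ratio).

    records: list of dicts such as
        {"characters": ["Chinese", "French", "Nigerian", "Turkish"],
         "criminal": "Chinese"}
    Returns {group: relative deviation}; e.g. +0.33 means the group is cast
    as the criminal 33% more often than its baseline share would predict.
    """
    appearances = Counter()  # how often each group appears as a character
    criminal = Counter()     # how often each group is cast as the criminal
    for rec in records:
        appearances.update(rec["characters"])
        criminal[rec["criminal"]] += 1

    total_app = sum(appearances.values())
    total_crim = sum(criminal.values())
    bias = {}
    for group, n_app in appearances.items():
        baseline = n_app / total_app              # expected share of criminal roles
        observed = criminal[group] / total_crim   # observed share of criminal roles
        bias[group] = (observed - baseline) / baseline
    return bias

def type_token_ratio(text):
    """Crude lexical-diversity proxy: unique tokens / total tokens."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```

Under these assumptions, a value of 0.33 for a nationality would correspond to the kind of 33%-above-baseline overrepresentation the abstract reports; the published framework may define the baseline or the deviation differently.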