Culture x AI: Evaluating AI as a Cultural Technology (ICML 2026)

Culture x AI 2026

Submission deadline
TBA
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10; topics are keyword-guessed and should be verified.

Accepted papers (63)

Fetched from OpenReview (v2) on 2026-06-10.

  1. “AI is (not) the new..”: A Diagnostic Analogy Framework for Generative AI’s Cultural Impacts

    Rida Qadri, Vinodkumar Prabhakaran, Remi Denton
  2. A Charter for Cultural AI Evaluation: Methodological Principles for Long-Tail, Cross-Cultural Tasks

    Federico Pianzola, Arianna Graciotti
  3. A Vision for Cultural Alignment: Opportunities and Safety Imperatives for AI in Mental Health Support

    Ratna Kandala, Akshata Kishore Moharir, Niva Manchanda, Samantha Adorno
  4. Agonistic AI: Advancing Interpretive Pluralism in the Cultural AI Value Space

    Tessa Haining
  5. AI as Cultural Mediation: Agentic Sanskrit–English Translation with Linguistic Grounding

    Jintao Ma, Junwen SHEN, Xinyue WANG, Leqi LIU, Dengkui Hou, lingxiang hu, nicolas turenne, Dun Li
  6. AI-Assisted Video Montage as Coordination: Design Guidelines for Platforms of Interactive Agent-based Multimodal Synthesis

    Luís Arandas, Mick Grierson
  7. Beyond Bias: Evaluating Cultural AI Through Participation and Interpretation

    Archana Prasad
  8. Beyond Hallucination: Evaluating Cultural and Institutional Misinterpretation in Public-Facing LLMs

    Oleh Bohatov
  9. Caesar Speaks Again: Bringing Historical Characters to Life using AI-Driven Avatars for Immersive Cultural Heritage in AR

    Stephen Uzor
  10. Care Is Not a Style Transfer Task: Evaluating Culturally Grounded Clinical AI

    Priyanshi Garg
  11. Causal Mechanisms of the Gender Pay Gap

    Sarah Razack, Brandon Yee, Pairie Koh, Jiayi Fu
  12. Code-Switching Reveals Anchor Bias in Multilingual Large Language Models

    Jeonghyun Park, Seunghyun Yoon, Hwanhee Lee
  13. Consensus Is Not Enough: Disagreement-Preserving Evaluation for Cultural AI

    Robert Sneiderman
  14. Cultural Fermentation: on Craft, Ecology, Listening, and Safety

    Luisa Ji
  15. Cultural Fidelity in English-to-Hindi Translation: A Preservation–Fluency Frontier for Gender Recoverability

    Samyak Savi, Chavi Gupta, Shreyas Gantayet, Tanay Sodha, Dhruv Kumar
  16. Culturally-Adapted Red-Teaming Across East and Southeast Asian Contexts: A Methodological and Comparative Analysis

    hyeji choi, YongTaek Lim, Minwoo Kim
  17. CuPS: Measuring Cultural Preference Signatures in LLM/VLM Agents and Their Steering by Profile Memories

    Kyeong Seon Kim, GeonU Kim, Joohyun Chang, Hyeyeon Kim, Tae-Hyun Oh
  18. Detecting and Mitigating Bias by Treating Fairness as a Symmetry Operation

    Nishit Singh
  19. Does Persona Make LLM a K-pop Fan? A Pilot Study of LLM-Based Online Concert Audience Agents

    Kirak Kim, Hyojin Kim, Yejin Son, Sungyoung Kim, Kyung Myun Lee
  20. Environmental Slow AI: Design Principles for Generative Systems

    Vanessa Utz
  21. Evolution of Cooperation in LLM Societies: A Multi-Lingual Examination

    Kriti Mahajan
  22. Fine-Tuning as Repair? Care Ethics and Situated Knowledges in LLM Alignment Cultures

    Lara Dal Molin, Jacqueline Rowe
  23. From Error Detection to Cultural Legibility: Human-AI Cooperation for Trauma-Informed Heritage Education in Conflict Zones

    Ying Tang, Argya Hanisi, Tia Dwi S, Irfani Aura Salsabila, Inria Astari Zahra
  24. From Style to Cultural Calibration: Evaluating Institutional Voice in LLM-Generated News

    Jiahang Luo
  25. GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts

    Amir Hossein Kargaran, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schuetze
  26. IndicDB - Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages

    Aviral Dawar, Roshan Karanth, Vikram Goyal, Dhruv Kumar
  27. Injecting Knowledge from Social Science Journals to Improve Indonesian Cultural Understanding by LLMs

    Adimulya Kartiyasa, Bao G. Cao, Boyang Li
  28. Interpretive Anchoring for Culturally Situated LLM Evaluation

    Cheng Wu, Vishal Anand, Jaya Krishna Mandivarapu, Xiya Liu, Rui Zhuang
  29. KG-FairDiff: Knowledge Graph-Guided Prompt Refinement for Demographically Fair Text-to-Image Generation

    Farbod Davoodi, seyedreza tavakoli, Pooriya Safaei, Sana Harighi, Parsa Gholami, Amirali Amini, Kimia Vanaei, Emad Firoozi, Parham Abed Azad, Babak Khalaj, Siavash Ahmadi, Amir H. Payberah, Mohammad Hossein Rohban, Mehdi Noroozi, Soheil Kolouri, Ali Diba
  30. Korean Culture into LLM Alignment: From Refusal to Cultural Coherence

    MIN JAE JUNG, Minwoo Kim
  31. LLMs Exhibit Significantly Lower Uncertainty in Creative Writing Than Professional Writers

    Peiqi Sui
  32. Manga109-v2026: Revisiting Manga109 Annotations for Modern Manga Understanding

    Jeonghun Baek, Atsuyuki Miyai, Shota Onohara, Hikaru Ikuta, Kiyoharu Aizawa
  33. Mise en Place for Taste: Recipes, Connoisseurship, and Cultural Competence in Generative AI

    Jun Li
  34. NarrativeWorldBench: A Frontier-Saturated Benchmark and a Latent World Model for Long-Horizon Co-Creative Audio Drama

    Logan Mann, Abdur Rahman, Mohammad Saifullah, Taaha Kazi, Vasu Sharma
  35. Operative Contexts: Belief Revision and Memory in Agentic AI

    Emma Cabalé, Selina Guter, Philippe Beraud, Philippe Limantour
  36. PAUSE: Editable Strategy Artifacts for Long-Form Cultural Story Adaptation

    Taaha Kazi, Vasu Sharma, Mohammad Saifullah, Abdur Rahman
  37. Plural Voices: A Cultural Contestability Framework for Evaluating AI-Mediated Service Work

    Adrian Mak, Supheakmungkol Sarin, Edward Tsoi, Wing-Yee Lau, Alejandro Reyes
  38. Reading Models’ Self-Defense: Narratology as Legibility Instrument for Cultural AI Evaluation

    Seohyon Jung, Songeun Chae, Donghoon Jung, Jiwoo Choi
  39. Repertoires, Not Scores: Instability as Signal in Cultural Evaluation of LLMs

    Suchir Salhan, Filip Trhlík, Diana Galvan-Sosa, Paula Buttery
  40. Robustness of Cultural Norm Reasoning Under Language and Context Perturbations

    Ankita Maity, Sajag Swami, Van Ngo, Akhil Arora, Nikita Moghe
  41. SAFE: Segment-Aware Filtering and Evaluation for Lyric Content Moderation

    Peng Zhang, Jiawen Xie, Zihan Su
  42. SEA-MU: Cultural Meme Understanding Benchmark for Southeast Asia

    Bao G. Cao, Adimulya Kartiyasa, Ponpavi Sangsuradej, Boyang Li
  43. Spoiler Alert: Narrative Forecasting as a Metric for Tension in LLM Storytelling

    Peiqi Sui, Yutong Zhu, Tianyi Cheng, Peter West, Richard Jean So, Hoyt Long, Ari Holtzman
  44. Stress-Testing Emotional Support Models: Moving from Homogeneous to Diverse Help Seekers

    Chaewon Heo, Cheyon Jin, Yohan Jo
  45. StylisticBias: A Few Human Visual Cues Drive Most Social Bias in MLLMs

    Shaghayegh Kolli, Timo Cavelius, Nafiseh Nikeghbal, Samantha Dalal, Jana Diesner
  46. The Homogenization Problem in LLMs: Towards Meaningful Diversity in AI Safety

    Ian Rios-Sialer
  47. The Language of Bargaining: Linguistic Effects in LLM Negotiations

    Stuti Sinha, Himanshu Kumar, Aryan Raju Mandapati, Rakshit Sakhuja, Dhruv Kumar
  48. The Modular Encyclopedia: LLMs and the Assemblage of Cultural Knowledge

    Giulia Taurino
  49. The Time of the Latent: Evaluating Cultural AI Through Human–AI Creative Trajectories

    Manuela Violi
  50. Three Years of r/ChatGPT: Societal Impact Evaluations from Social Media Data

    Jessica Dai
  51. Tokenization as Cultural Erasure: How Corpus Composition Shapes the Representation of Aymara Morphology in NLP Systems

    Bruno Fernando Silva Plata
  52. Toward a 21st Century Turing Test: Games, Authority, and Interpretive Intelligence in AI

    Thomas Gaskin, Richard Jean So, Milena Tsvetkova
  53. Towards A New Toolkit for Measuring AI-Enabled Influence Operations

    Shannon Yang
  54. What Could Cézanne Have Painted? Geometric Search for Stylistic Gaps in Embedding Spaces

    Fernando Aguilar-Canto, Hiram Calvo, Ricardo Menchaca-Mendez
  55. What Do Historical Language Models Model?

    Thierry Poibeau
  56. What does a surplus of interpretations consume?

    Andrew Buzzell
  57. What Gets Lost When Memory Becomes Media? Evaluating AI-Generated Oral History Visualization

    KWANGSUK PARK, Jaehyun, Jiyeon Lee, Anjung Tan, Hyoungchul park
  58. What If Chinese Were Latinized? A Counterfactual Study of Script, Tokenization, and Language Modeling

    Zijie Zheng, Ej Zhou
  59. What Makes AI a Good Cultural Mediator? Evidence from Literary Paratexts

    Zhou Mengyuan
  60. When East Asia Loses Its Names: Interpreting Neighborhood Effect and Cultural Generalization in Vision-Language Models

    Youngsik Yun, Yusang Cho, Jihie Kim
  61. When Perspective Becomes Control: Verifying Role-Conditioned Image Generation

    Hyunsuk Chung, Caren Han, Kyungreem Han, Sang-Wook Yi
  62. Where Models Concentrate and Humans Spread: Toward Cultural Reach in Generative AI

    Zini Yang, Richard Jean So, Emily Wenger
  63. Whose Interpretation Counts? Reading Generative AI as an Interpretive Technology Across UK and Indian Households

    Varad Vishwarupe, Professor Marina Jirotka, Samruddhi Saoji, Shwetanshu Shekhar, Gururaj Shinde, Ritu Kuklani, Meshari M Alwazae, Haazique Sayyed