
ICML 2026 Workshop: Philosophy Meets Machine Learning

PhilML@ICML 2026

Submission deadline
TBA
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10. Topics are keyword-guessed and have not been verified.

Accepted papers (60)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Definition of Good Explanations and the Challenges Explaining LLM Outputs

    Louis Mahon, Elliot Ford, Callum Hackett
  2. A Relativistic Perspective of Reliability in Machine Learning

    Rajeev Verma
  3. AI Review Is a Systemic Risk to Peer Review: Toward a Blockchain-Supported Claim-Level Ledger for Accountability

    Yibo Miao, Yichi Zhang, Yinpeng Dong
  4. AI Wellbeing: Measuring and Improving the Functional Pleasure and Pain of AIs

    Richard Ren, Kunyang Li, Mantas Mazeika, Wenyu Zhang, Yury Orlovskiy, Rishub Tamirisa, Wenjie Jacky Mo, Thuy Dung Nguyen, Long Phan, Steven Basart, Austin Meek, Aditya Mehta, Oliver Ingebretsen, Alice Blair, Brianna Adewinmbi, Vy Phan, Alice Gatti, Adam Khoja, Jason Hausenloy, Devin Kim, Dan Hendrycks
  5. An Evolutionary Epistemology of Post-Training

    Nicholas Clark
  6. Articulate Intuition or Genuine Analysis? Benchmarking Epistemic Reliability in LLM-as-a-Judge Peer Review

    Nuo Chen, Bingsheng He
  7. Before Normative and Moral Alignment: Causal Contract Faithfulness as a Precondition for Trustworthy AI

    Amine M'Charrak, Thong Pham, Thomas Lukasiewicz, Yuxiao Dong, Shohei Shimizu
  8. Belief Without Justification: Sycophancy as a Single-Layer Truth–Compliance Tension in LLMs

    Valentin NOËL
  9. Beyond Accuracy: Epistemic Justification in Trustworthy Machine Learning

    Poojak Patel, Maneth Perera
  10. Can LLMs Navigate Beliefs and Facts? Depends on How You Phrase It

    Quang Minh Nguyen, Luis Frentzen Salim
  11. Can Standard MARL Metrics Distinguish Communicative from Strategic Action?

    Majid Ghasemi, Mark Crowley
  12. Constituting What Counts: A Phenomenological Approach to Human-AI Ontological Translation

    Prerna Luthra, Manojshyaam C J
  13. Counterfactuals Without Worlds: When ML Counterfactual Explanations Are Ill-Posed

    Muhammet Anil Yagiz
  14. DeepSWIP: Single-World Counterfactual Semantics for DeepProbLog

    Saimun Habib, Vaishak Belle, Fengxiang He
  15. Dignity as Answerability: How World-Model AI Reframes Human Moral Standing

    Junghoon Justin Park, Jiook Cha
  16. Do LLMs Really Represent the World? A Challenge from Teleosemantics

    Eliot Du Sordet
  17. Efficient Counterfactual Reasoning in ProbLog via Single-World Intervention Programs

    Saimun Habib, Vaishak Belle, Fengxiang He
  18. Epistemic Misalignment in Human-AI Systems: A Four-Quadrant Taxonomy of Uncertainty

    Mayank Kejriwal
  19. Explaining What Machine Learning Learns through Explainable AI

    Jinyeong Gim
  20. Explanation for Whom? Hospitable Interpretability for Machine Learning

    Abutalib Namazov
  21. Explanation in an Emerging Science of Large Language Models

    Ming Liang Ang
  22. Explanations are a Means to an End: Decision Theoretic Explanation Evaluation

    Ziyang Guo, Berk Ustun, Jessica Hullman
  23. Factuality Beyond Reference in LLMs

    Thierry Poibeau
  24. Fair Learning with Biased Labels: When Observed Accuracy Is the Wrong Target

    Heng-Chien Liou, I-Hsiang Wang
  25. Fictionalism about Personas: Folk Psychology as an Interpretability Strategy

    Weiming Sheng
  26. From Observation to Intervention: A Causal Audit of Expert Importance in Mixture-of-Experts Models

    Leonard Engmann, Christian Medeiros Adriano, Holger Giese
  27. From Prompts to Proof Obligations: Formal Sidecars as an Epistemic Interface for Trustworthy ML

    Junyu Ren
  28. Getting Monosemantic About Monosemanticity

    Raphaël Millière, Kola Ayonrinde
  29. Interpretability Should Prioritise Use-Inspired Basic Research for AI Safety

    Kola Ayonrinde
  30. Lifted Representation Hypothesis in Language Models

    Bumjin Park, Jaesik Choi
  31. Measuring the Ruler: Reading Benchmark Saturation as Evidence

    Sebastian M Schmon
  32. Mistakes as Epistemic Signatures: An Efficiency-Modulated Cumulative Error Framework for Comparison and Diagnosis of AI Errors

    Darshini N
  33. Noticing the Watcher: LLM Agents Can Infer CoT Monitoring from Blocking Feedback

    Thomas Jiralerspong, Flemming Kondrup, Yoshua Bengio
  34. On Epistemic Diversity in Large Language Models

    Elisabeth Kirsten, Nicole C. Krämer, Muhammad Bilal Zafar
  35. On the Detectability of LLM-Generated Text: What Exactly Is LLM-Generated Text?

    Mingmeng Geng, Thierry Poibeau
  36. Online Boundary-Aware Memory for Case-Based Reasoning Agents

    Zheng Dong, Luming Shang
  37. Operative Contexts: Belief Revision and Memory in Agentic AI

    Emma Cabalé, Selina Guter, Philippe Beraud, Philippe Limantour
  38. Privileged Self-Access Matters for Introspection in AI

    Siyuan Song, Harvey Lederman, Jennifer Hu, Kyle Mahowald
  39. Procedural Generalization: A Resource-Sensitive Account of Knowing-How

    Tomer Galanti, Saharsh Koganti, Priyadarsi Mishra, Pierfrancesco Beneventano
  40. Proleptic Epistemology for Societal Impacts of AGI

    Priyansh Singhal, Sandeep Kumar, Piyush Joshi
  41. Reality and Practice: A Relational Reading of the Platonic Representation Hypothesis

    Sebastian M Schmon
  42. Reconciling Causality and Non-Equilibrium Thermodynamics with Hamiltonian Causal Models

    Dario Rancati, Max Welling, Francesco Locatello
  43. Reliability, Faithfulness, and the Limits of Post-hoc Explanations of Opaque Scientific Models

    Nick Oh, Helen Jin
  44. Reliable for Whom? Directional Reliability in AI-Mediated Political Dialogue

    Jaeyoun You
  45. Savage Without Monotonicity

    Shuo Li Liu, Jingni Yang
  46. Self-Reports Do Not Identify Self-Models: An Identifiability Test for Counterfactual Reports

    Phongsakon Mark Konrad, Toygar Tanyel, Serkan Ayvaz
  47. The Concept of Representation in ML: Beyond Plato and Aristotle

    Gilad Landau, Aviv Keren
  48. The Hawk Effect: Why We Need a Two-Dimensional Measure of Machine Intelligence

    Fryderyk Kuzma
  49. The Opacity of Descent: Optimization, Epistemic Asymmetry, and the Semantics of Convergence in Deep Learning

    Mahdi Ghaznavi
  50. The Wrong Question? Artificial Consciousness and the Politics of AI Agency

    Thierry Poibeau
  51. Towards Automated Evaluation of Socio-Technical Harms in LLMs: A Normative Taxonomy and Multi-Turn Red-Teaming Framework

    Byeongho Lee, Hyundeuk Cheon
  52. Towards Formalizing Skepticism of Autoregressive Language Models: A Taxonomy in the Language of the Theory of Computation

    Michael Guerzhoy
  53. Trust as Predictive Precision: Reliability and Influence in Representation Alignment

    Hidenori Tanaka
  54. Trustworthiness and co-cognition in artificial intelligence systems

    Silvère Gangloff
  55. Uncertainty as Perceptual Testimony in Vision-Language Models

    Ahmad A Rushdi
  56. Unsafe Consensus in Diagnostic Deliberation

    Yuting Yan, Yinghao Fu, Haozhou Gao, Tianjian Zhang, Aoxi Liu, Shuang Li
  57. Vision-Language Asymmetry in Bistable Image Captioning

    Arohan Agate
  58. When Do Transformer Components Compose? Validating a Log-Pool Decomposition Criterion

    Junyu Ren, Su Hyeong Lee, Risi Kondor
  59. Where Does Prediction Error Come From When the Data Is Perfect? A Decomposition of the Model–World Gap in Predictive Uncertainty

    Johanna Einsiedler, Rosa Lavelle-Hill, Constantin T. A. Wiegand
  60. Why Sampling Is Not Choosing: Intentionality, Agency, and Moral Responsibility in Large Language Models

    Joseph Keshet