NeurIPS 2025 · Past · Healthcare & biology

The Second Workshop on GenAI for Health: Potential, Trust, and Policy Compliance

GenAI4Health 2025

Submission deadline: Sep 6, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview

Accepted papers (99)

Fetched from OpenReview (v2) on 2026-06-10.

3D Brain MRI Generation with a Clinically-Conditioned VAE-GAN and Diffusion-Driven Feature Sampling
Najmeh Mashhadi, Emmanouil Nikolakakis, Razvan Marinescu · PDF
An Interactive Framework for Generating Clinical Data with Human Feedback
Yu Yang, Jiafeng Song, Zhishuai Liu, Henry P Foote, Rishikesan Kamaleswaran, Pan Xu · PDF
Application of Whisper in Clinical Practice: the Post-Stroke Speech Assessment during a Naming Task
Milena Davudova, Ziyuan Cai, Valentina Giunchiglia, Dragos-Cristian Gruia, Giulia Sanguedolce, Adam Hampshire, Fatemeh Geranmayeh · PDF
ArtifactGen: Benchmarking WGAN-GP vs Diffusion for Label-Aware EEG Artifact Synthesis
Hritik Arasu, Faisal R Jahangiri · PDF
Automatic Correction of AI Reports using Fact-Checking Model-guided LLMs
Raziuddin Mahmood, Pingkun Yan, Tanveer F Syeda-Mahmood · PDF
Balancing Safety and Helpfulness in Healthcare AI Assistants through Iterative Preference Alignment
Huy Nghiem, Swetasudha Panda, Devashish Khatwani, Huy V. Nguyen, Krishnaram Kenthapadi, Hal Daumé III · PDF
Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL
Che Liu, Haozhe Wang, Jiazhen Pan, Zhongwei Wan, Yong Dai, Fangzhen Lin, Wenjia Bai, Daniel Rueckert, Rossella Arcucci · PDF
Beyond Overall Accuracy: A Psychometric Deep Dive into the Topic-Specific Medical Capabilities of 80 Large Language Models
Zhimeng Luo, Lixin Wu, Adam Frisch, Daqing He · PDF
Bridging Graph and State-Space Modeling for Intensive Care Unit Length of Stay Prediction
Shuqi Zi, Haitz Sáez de Ocáriz Borde, Emma Charlotte Rocheteau, Pietro Lio · PDF
Brittleness and Promise: Knowledge Graph–Based Reward Modeling for Diagnostic Reasoning
Saksham Khatwani, He Cheng, Majid Afshar, Dmitriy Dligach, Yanjun Gao · PDF
Can You Spot the Virtual Patient (VP)? Expert Evaluation, Turing Test, Linguistic Analysis, and Semantic Similarity Analysis
Reyhaneh Hosseinpourkhoshkbari, Wei-chen Huang, Suvel Muttreja, Richard M. Golden · PDF
CancerGUIDE: Cancer Guideline Understanding via Internal Disagreement Estimation
Alyssa Unell, Noel C Codella, J. Samuel Preston, Peniel Argaw, Wen-wai Yim, Zelalem Gero, Cliff Wong, Eric Horvitz, Amanda Hall, Rachel Ruican Zhong, Jiachen Li, Shrey Jain, Mu Wei, Matthew P. Lungren, Hoifung Poon · PDF
ChatThero: A Language Agent for Recovery Support
Junda Wang, Zonghai Yao, Lingxi Li, Junhui Qian, Zhichao Yang, Hong Yu · PDF
Clinically Grounded Agent-based Report Evaluation: An Interpretable Metric for Radiology Report Generation
Radhika Dua, Young Joon Fred Kwon, Siddhant Dogra, Daniel Freedman, Diana Ruan, Motaz Nashawaty, Danielle Rigau, Daniel Alexander Alber, Kang Zhang, Kyunghyun Cho, Eric Karl Oermann · PDF
Count-Based Approaches Remain Strong: A Benchmark Against Transformer and LLM Pipelines on Structured EHR
Jifan Gao, Michael Rosenthal, Brian Wolpin, Simona Cristea · PDF
Demo: An Agentic Multi-Persona Generative AI System for Mental Health Companionship
Yuekai Wang · PDF
Demo: Building Maternal Health LLMs for Low-Resource Settings
Lyvia Lusiji, Stanslaus Mwongela, Francesco Piccinno, Jay Patel, Stephen Obonyo, Ellen Sebastian, Annalisa Pawlosky, Mfoniso Ukwak, Kelvin Ndambuki, Sylvia Mbugua, Sathy Rajasekharan, Dennis Troper · PDF
Demo: Can Visual Stimulation Enhance Reminiscence-Therapy Chatbot?
Sofiia Kononovych, Ostap Kilbasovych, Kateryna Bilyk, Yuliia Vistak, Zhiwen Fan, Junyuan Hong · PDF
Demo: Clinically Diverse Chest X-ray Synthesis via Cross-Modal Conditioning
Hassan Hamidi, Salamata Konate, Sara Hassani, Andrew Sellergren, Ali Sadeghi-Naini, Laleh Seyyed-Kalantari · PDF
Demo: Customizing Open-Source LLMs for Quantitative Medication Attribute Extraction across Heterogeneous EHR Systems
Zhe Fei, Mehmet Yigit Turali, Shreyas Rajesh, Xinyang Dai, Huyen Pham, Pavan S Holur, Yuhui Zhu, Larissa J. Mooney, Yih-Ing Hser, Vwani Roychowdhury · PDF
Demo: Generative AI helps Radiotherapy Planning with User Preference
Riqiang Gao, Simon Arberet, Martin Kraus, Han Liu, Wilko F.A.R. Verbakel, Dorin Comaniciu, Florin-Cristian Ghesu, Ali Kamen · PDF
Demo: Guide-RAG: Evidence-Driven Corpus Curation for Retrieval-Augmented Generation in Long COVID
Philip DiGiacomo, Haoyang Wang, Jinrui Fang, Yan Leng, William Brode, Ying Ding · PDF
Demo: H2AI: A Framework for Experiential Learning and De-Risking Generative AI in Healthcare
Rohan M. Rani, Jacqueline Sandling, Francis Arellano, Nika Shroff, Yumin Gao, Raj Ratwani · PDF
Demo: Medbot, A Practical Tool Built by Clinicians for Clinicians to Leverage AI Agents to Enhance Clinical Practice
Li Lianjie Anthony, Antonio Bandeira, Niroj Bhandari, Rahul Gorijavolu, Aaron Teo Rui Kang, Jasmine Xiaojin Zhang, Alvin Aung Aung Hein, March I-Cheng Chen · PDF
Demo: Orchestrating Large Language Model Agents and Resources for Medical Deep Research
Yuan Li, Matthew Pan, Claire Liu, Hengjin Zhu, Xijing Wang, Yingtao Luo · PDF
Demo: PeerCoPilot: A Language Model-Powered Assistant for Behavioral Health Organizations
Gao Mo, Naveen Janaki Raman, Megan Chai, Cindy Peng, Shannon Pagdon, Nev Jones, Hong Shen, Fei Fang · PDF
Demo: PharmaData-Agent: A Specialized Agent for Pharmaceutical Data Analysis
Zihan Guan, Hanyin Wang, Zhongliang Zhou, Qiaohui Zhou, Peining Tao, Junshui Ma · PDF
Demo: Sanitizing Medical Documents with Differential Privacy using Large Language Models
Rushil Thareja, Gautam Gupta, Preslav Nakov, Praneeth Vepakomma, Nils Lukas · PDF
Demo: Statistically Significant Results on Biases and Errors of LLMs Do Not Guarantee Generalizable Results
Jonathan Liu, Haoling Qiu, Jonathan Lasko, Damianos Karakos, Mahsa Yarmohammadi, Mark Dredze · PDF
Demo: Streamlining Health Insurance Claims Verifications with AI-Blockchain Integration through AI+ROAX (Rod of Asclepius eXchange)
Li Lianjie Anthony, Kenneth Goh, Ansel Lim · PDF
Demo: Towards Generating Long-Sequence Sleep Heart Rate Signals with Conditional Diffusion
Manasa Mariam Mammen, Priyanka Mary Mammen, Emil Joswin · PDF
Detecting Synthetic Radiology Reports using Style Disentanglement
Tanvi Ranga, Arjun Ramesh Kaushik, Nalini K. Ratha · PDF
Editing with AI: How Doctors Refine LLM-Generated Answers to Patient Queries
Rahul Sharma, Pragnya Ramjee, Kaushik Murali, Mohit Jain · PDF
Enhancing Fine-Tuning-Free Clinical Reasoning via Test-Time Scaling
Ji Young Byun, Young-Jin Park, Navid Azizan, Rama Chellappa · PDF
Examining the Vulnerability of Multi-Agent Medical Systems to Human Interventions for Clinical Reasoning
Benjamin Liu, Dillon Mehta, Rishi Malhotra, Adam Zobian, Yong Ying Tan, Samir Chopra, Daniella Rand, Natalie Pang, Abhiram Gudimella, Raghav Thallapragada, Derek Jiu, Prisha Shah, Kevin Zhu · PDF
Explainable Insulin Pump Control with LLMs for Type 1 Diabetes
Maya Sarkar · PDF
Explaining Temporal Effects in Sepsis Prediction
Chaehyeon Kim, Eric Wong · PDF
FairGRPO: Towards Fair Reasoning Foundation Models for Clinical Diagnosis
Shiqi Dai, Wei Dai, Jiaee Cheong, Paul Pu Liang · PDF
Faithful or Just Plausible? Evaluating Faithfulness for Medical Reasoning in Closed-Source LLMs
Halimat Afolabi, Zainab Afolabi, Elizabeth Friel, Jude Roberts, Antonio Ji-Xu, Lloyd Chen, Egheosa Ogbomo, Emiliomo Imevbore, Phil Eneje, Wissal El Ouahidi, Aaron Sohal, Alisa Kennan, Shreya Srivastava, Anirudh Vairavan, Laura Napitu, Katie McClure · PDF
FedMentor: Domain-Aware Differential Privacy for Heterogeneous Federated LLMs in Mental Health
Nobin Sarwar, Shubhashis Roy Dipta · PDF
Foresight-England: Development of a National-Scale Generative AI Model of Patient Electronic Health Records for General Medical Event Prediction across the COVID-19 Pandemic
Simon Ellershaw, Christopher Tomlinson, Zeljko Kraljevic, Spiros Denaxas, Harry Hemingway, Anoop D. Shah, Richard Dobson · PDF
GRASP: Graph Reasoning Agents for Systems Pharmacology with Human-in-the-Loop
Omid Bazgir, Mohammad Jafarnejad, Vineeth Manthapuri, Ilia Rattsev · PDF
H-DDx: A Hierarchical Evaluation Framework for Differential Diagnosis
Seungseop Lim, Gibaeg Kim, Hyunkyung Lee, Wooseok Han, Jean Seo, Jaehyo Yoo, Eunho Yang · PDF
HealthSLM-Bench: Benchmarking Small Language Models for Mobile and Wearable Healthcare Monitoring
Xin Wang, Ting Dang, Xinyu Zhang, Vassilis Kostakos, Michael J. Witbrock, Hong Jia · PDF
Hearing Health in Home Healthcare: Leveraging LLMs for Illness Scoring and ALMs for Vocal Biomarker Extraction
Yu-Wen Chen, William Ho, Sasha M Vergez, Grace Flaherty, Pallavi Gupta, Zhihong Zhang, Maryam Zolnoori, Margaret V. McDonald, Maxim Topaz, Zoran Kostic, Julia Hirschberg · PDF
High-Fidelity Synthetic ECG Generation via Mel-Spectrogram Informed Diffusion Training
Zhuoyi Huang, Nutan Sahoo, Anamika Kumari, Girish Kumar, Kexuan Cai, Shixing Cao, Yue Kang, Tian Xia, Somya Chatterjee, Nicholas Hausman, Aidan Jay, Eric S. Rosenthal, Soundararajan Srinivasan, Sadid A. Hasan, Alex Fedorov, Sulaiman Vesal · PDF
Improvisational Reasoning with Vision-Language Models for Grounded Procedural Planning
Md Masudur Rahman, Yupeng Zhuo, Juan Wachs · PDF
K-Stain: Keypoint-Driven Correspondence for H&E-to-IHC Virtual Staining
Sicheng Yang, Zhaohu Xing, Haipeng Zhou, Lei Zhu · PDF
Large Language Models as Medical Codes Selectors: a benchmark using the International Classification of Primary Care
Vinicius Anjos de Almeida, Vinicius de Camargo, Raquel Gomez Bravo, Kees van Boven, Egbert van der Haring, Marcelo Finger, Luis Fernandez Lopez · PDF
Leaps Beyond the Seen: Reinforced Reasoning Augmented Generation for Clinical Notes
Lo Pang-Yun Ting, Chengshuai Zhao, Yu-Hua Zeng, Yuan Jee Lim, Kun-Ta Chuang, Huan Liu · PDF
m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models
Xiaoke Huang, Juncheng Wu, Hui Liu, Xianfeng Tang, Yuyin Zhou · PDF
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at Scale
Ran Xu, Yuchen Zhuang, Yishan Zhong, Yue Yu, Xiangru Tang, Hang Wu, May Dongmei Wang, Peifeng Ruan, Donghan Yang, Tao Wang, Guanghua Xiao, Carl Yang, Yang Xie, Wenqi Shi · PDF
MedBrowseComp: Benchmarking Medical Deep Research and Computer Use
Shan Chen, Pedro José Ferreira Moreira, Yuxin Xiao, Samuel Schmidgall, Jeremy L. Warner, Hugo Aerts, Thomas Hartvigsen, Jack Gallifant, Danielle Bitterman · PDF
MedGUIDE: Benchmarking Clinical Decision-Making in Large Language Models
Xiaomin Li, Mingye Gao, Yuexing Hao, Taoran Li, Guangya Wan, Zihan Wang, Yijun Wang, Xupeng Chen · PDF
Medical thinking with multiple images
Zonghai Yao, Benlu Wang, Yifan Zhang, Junda Wang, Iris Xia, Zhipeng Tang, Shuo Han, Feiyun Ouyang, Zhichao Yang, Arman Cohan, Hong Yu · PDF
MedVAL: Toward Expert-Level Medical Text Validation with Language Models
Asad Aali, Vasiliki Bikia, Maya Varma, Nicole Chiou, Sophie Ostmeier, Arnav Singhvi, Magdalini Paschali, Ashwin Kumar, Andrew Johnston, Karimar Amador-Martinez, Eduardo Juan Perez Guerrero, Paola Naovi Cruz Rivera, Sergios Gatidis, Christian Bluethgen, Eduardo Pontes Reis, Eddy D. Zandee van Rilland, Poonam Laxmappa Hosamani, Kevin R Keet, Minjoung Go, Evelyn Ling, David B. Larson, Curtis Langlotz, Roxana Daneshjou, Jason Hom, Sanmi Koyejo, Emily Alsentzer, Akshay S Chaudhari · PDF
MedVLThinker: Simple Baselines for Multimodal Medical Reasoning
Xiaoke Huang, Juncheng Wu, Hui Liu, Xianfeng Tang, Yuyin Zhou · PDF
Mending synthetic data with MAPS: Model Agnostic Post-hoc Synthetic Data Refinement Framework
Yan Li, Jennifer Bartell, Anders Krogh · PDF
Mind the Gap: Aligning Knowledge Bases with User Needs to Enhance Mental Health Retrieval
Amanda Chan, James Jiayu Liu, Kai He, Onno P. Kampman · PDF
Mixture-of-Experts Guided Multi-Omic Integration for Gastrointestinal Cancer Subtype Prediction
Sajib Acharjee Dip, Uddip Acharjee Shuvo, Dipanwita Mallick, Abrar Rahman Abir, Liqing Zhang · PDF
Modeling PTSD Trajectories with Conditional SVAEs and Synthetic Data Generation: Data-Efficient Prediction and Outcome-Specific Explainability
Mateus Guimarães Lima de Freitas, Alexander Rasgon, Shuangyu Li, Zhan Chen, João Paulo Abreu Maranhão, Dongjin Song, Yunyu Xiao, Ying Ding · PDF
Multi-Turn LLM Systems for Diagnostic Decision-Making: Considerations, Biases, and Challenges
Benjamin Liu, Sejong Kim, Drona Thoka, Varun Puttagunta, Kaylin Sheng, Mark Li, Thi Uyen Hanh Le, Sai Chidvilas Gudiboina, Ali Ugur, Kevin Zhu · PDF
Natural Language Grounded Reinforcement Learning for Clinical Decision-Making in Virtual Patient Simulations
Niyel Hassan, Benjamin Liu, Jason Tsai, Jeffrey K Jopling, Dana Lin, Edward Melcer, Cara Liebert · PDF
Ordinal Label-Distribution Learning with Constrained Asymmetric Priors for Imbalanced Retinal Grading
Nagur Shareef Shaik, Teja Krishna Cherukuri, Adnan Masood, Ehsan Adeli, Dong Hye Ye · PDF
PAME-AI: Patient Messaging Creation and Optimization using Agentic AI
Junjie Luo, Yihong Guo, Anqi Liu, Ritu Agarwal, Guodong Gordon Gao · PDF
Pandemic-Potential Viruses are a Blind Spot for Frontier Open-Source LLMs
Laura Luebbert, Yasha Ektefaie, Arya S. Rao, Colby Wilkason, Dolo Nosamiefan, Olivia Achonduh-Atijegbe, Harouna Soumare, Adefoye Precious Adebayo, Olufemi Olulaja, Judith Amadi, Nicholas Oyejide, Funmilayo Olayiwola, Etim Henshaw, Yusuf Okocha, Nkechinyere Nwachukwu, Elechi Friday Ewah, Sylvanus Okoro, Ebenezer Nwakpakpa, Peter Okokhere, Kelly Iraoyah, Joseph Okoeguale, Ireti Dada, Andy Burris, Karlie Zhao, Ellory Laning, Chase Van Amburg, Paul Cronan, Ben Fry, Christian Happi, Al Ozonoff, Pardis Sabeti · PDF
Physician Perceptions of Large Language Models in Clinical Practice: A Mixed-Methods Survey Study
Francis Arellano, Rohan M. Rani, Yumin Gao, Jacqueline Sandling, Emily Y Chen, Athena N. Nguyen, Deeya Garg, Zahra Ahmad, Nika Shroff, Katherine Barnes, Joshua M. Biro, Kristen E Miller · PDF
Position: Adjacent Technologies Are the Key Enablers of Scalable and Safe Clinical MLLM Deployment
Azmine Toushik Wasi, Md. Iqramul Hoque · PDF
Position: AI Will Transform Neuropsychology Through Mental Health Digital Twins for Dynamic Mental Health Care, Especially for ADHD
Neil Natarajan, Sruthi Viswanathan, Xavier Roberts-Gaal, Michelle Marie Martel · PDF
Position: AI-Driven Risk Stratification is Essential for Affordable Early Detection of Cancer
Asif Khan, Duncan Terell Forster, Moshir Harsh, Daniel Ritter, Chunlei Zheng, Rose Orenbuch, Abdullah Kuziez, Artem Gazizov, Debora Susan Marks, Nathanael R. Fillmore, Chris Sander · PDF
Position: CARE-RAG: Clinical Assessment and Reasoning in RAG
Deepthi Potluri, Aby Mammen Mathew, Alexander L. Rasgon, Jeffrey B Dewitt, Yide Hao, Joseph C McGrath, Junyuan Hong, Charles Barnet Nemeroff, Greg Muller, Ying Ding · PDF
Position: Communities of Practice can be used to Address Challenges to Regulation and Governance of Generative AI in South East Asian Countries
Li Lianjie Anthony, Premikha M, Ying Tze Siow, Taufeeq Wahab, Muhamad Noor Alfarizal, Ivan Koh, Vivek Jason Jayaraj, Clive Tan · PDF
Position: Ophthalmology as a Lens for Trustworthy GenAI in Europe---Uncertainty-Aware AI under the EU AI Act
Mariya Erokhina, Achref Doula, Lukas Bisorca-Gassendorf, Alejandro Sanchez Guinea · PDF
Position: Restricted Release of Advanced Biological Models Safeguards Biosecurity
Jonathan Feldman, Tal Feldman · PDF
Position: Specialty Society-Led Meta-Governance is Essential to Responsible Implementation of Generative AI in Cardiovascular Care
Yumin Gao, Jacqueline Sandling, Francis Arellano, Zahra Ahmad, Rohan M. Rani, Deeya Garg, Emily Y Chen, Aamir Javaid, Francoise Marvel, Seth Martin, Raj Ratwani, Charles German · PDF
Position: The Pitfalls of Over-Alignment: Overly Caution Health-Related Responses From LLMs are Unethical and Dangerous
Wenqi Marshall Guo, Yiyang Du, Heidi J.S. Tworek, Shan Du · PDF
Position: Thematic Analysis of Unstructured Clinical Transcripts with Large Language Models
Seungjun Yi, Joakim Nguyen, Terence Lim, Andrew Well, Joseph Skrovan, Mehak Beri, YongGeon Lee, Kavita Radhakrishnan, Liu Leqi, Mia Markey, Ying Ding · PDF
PRISM: Physician Rules Integrated with Small large language Models for probable diagnoses associated with Abdominal Pain
Gautam Ahuja, Ayush Agarwal, Hara Prasad Mishra, Samagra Agrawal, Rik Ganguly, Zonunmawia, Akshay Sharma, Vatsal Batra, Bableen Kaur, Siddhant Poudyal, Himani Balutia, Sagarika, Sanjana Ahuja, Kedar Natarajan, Partha Pratim Das, Ramesh Jain, Partha Pratim Chakrabarti, Anurag Agrawal, Govind Makharia, Rintu Kutum · PDF
QRad: Enhancing Radiology Report Generation by Captioning-to-VQA Reframing
Ying Jin, Noel C Codella, Yanbo Xu, Yu Gu, Mu Wei, Haoquan Fang, Thomas Lin, Paul Vozila, Jenq-Neng Hwang · PDF
Reliable or Risky? Assessing Diffusion Models for Biomedical Data Generation
Abdalrahman Alblwi, Qile Wang, Norbert Zolek, Matthew Louis Mauriello, Kenneth Barner · PDF
Robust or Suggestible? Exploring Non-Clinical Induction in LLM Drug-Safety Decisions
Siying Liu, Shisheng Zhang, Indu Bala · PDF
RPRO: Ranked Preference Reinforcement Optimization for Enhancing Medical QA and Diagnostic Reasoning
Chia-Hsuan Hsu, Jun-En Ding, Hsin-Ling Hsu, Chun-Chieh Liao, Fang-Ming Hung, Feng Liu · PDF
Scalable Whole-Slide Vision-Language Modeling with Learned Token Pruning
Ali Kerem Bozkurt, Baris Cem Bakay, Ibrahim Kulac, Cigdem Gunduz-Demir, Erkut Erdem, Aykut Erdem · PDF
SecureRAG: End-to-End Secure Retrieval-Augmented Generation
Amina Bassit, Vishnu Boddeti · PDF
Shallow Robustness, Deep Vulnerabilities: Multi-Turn Evaluation of Medical LLMs
Blazej Manczak, Eric Lin, Francisco Eiras, James O'Neill, Vaikkunth Mugunthan · PDF
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Wataru Kawakami, Keita Suzuki, Junichiro Iwasawa · PDF
SynLLM: A Comparative Analysis of Large Language Models for Medical Tabular Synthetic Data Generation via Prompt Engineering
Arshia Ilaty, Hossein Shirazi, Hajar Homayouni · PDF
The Biased Oracle: Assessing LLMs’ Understandability and Empathy in Medical Diagnoses
Jianzhou Yao, Shunchang Liu, Guillaume Drui, Rikard Pettersson, Alessandro Blasimme, Sara Kijewski · PDF
The Energy to Say No: Pre-Generation Abstention for Safety-Critical Medical RAG
Ravi Shankar, Sheng Fung Wong, Lin Li, Magdalena Bachmann, Alex Silverthorne, Beth Albert, Gabriel Davis Jones · PDF
Towards Application Aligned Synthetic Surgical Image Synthesis
Danush Kumar Venkatesh, Stefanie Speidel · PDF
Towards Memory-Efficient Foundation Models in Medical Imaging: A Federated Learning and Knowledge Distillation Approach
Afsaneh Mahanipour, Abdullah Imran, Hana Khamfroush · PDF
Towards Synthesizing Normative Data for Cognitive Assessments Using Generative Multimodal Large Language Models
Victoria Yan, Honor Chotkowski, Fengran Wang, Xinhui Li, Carl Yang, Jiaying Lu, Runze Yan, Xiao Hu, Alex Fedorov · PDF
Traj-CoA: Patient Trajectory Modeling via Chain-of-Agents for Lung Cancer Risk Prediction
Sihang Zeng, Yujuan Fu, Sitong Zhou, Zixuan Yu, Lucas Jing Liu, Jun Wen, Matthew Thompson, Ruth Etzioni, Meliha Yetisgen · PDF
Unanchoring the Mind: AI-Guided Counterfactual Reasoning for Rare Disease Diagnosis
Yuting Yan, Yinghao Fu, Wendi Ren, Shuang Li · PDF
Uncovering Intervention Opportunities for Suicide Prevention with Language Model Assistants
Jaspreet Ranjit, Hyundong Justin Cho, Claire J. Smerdon, Yoonsoo Nam, Myles Phung, Jonathan May, John R. Blosnich, Swabha Swayamdipta · PDF
Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis
May Lynn Reese, Markela Zeneli, Mindy Ng, Jacob Haimes, Andreea Damien, Elizabeth C. Stade · PDF
Vision-Language Reasoning for Burn Depth Assessment with Structured Diagnostic Hypotheses
Md Masudur Rahman, Mohamed El Masry, Kristo Nuutila, Gayle Gordillo, Juan Wachs · PDF
When the Domain Expert Has No Time and the LLM Developer Has No Clinical Expertise: Real-World Lessons from LLM Co-Design in a Safety-Net Hospital
Avni Kothari, Patrick Vossler, Jean Digitale, Seyed Mohammad Forouzannia, Melanie F. Molina, Elise Rosenberg, Michele Lee, Jenee Bryant, James D. Marks, Lucas Zier, Jean Feng · PDF
Zero-Shot Large Language Model Agents for Fully Automated Radiotherapy Treatment Planning
Dongrong Yang, Xin Wu, Yibo Xie, Xinyi Li, Qiuwen Wu, Jackie Wu, Yang Sheng · PDF
