ICML 2024 · Past · Large language models · Agents · Safety & alignment

Trustworthy Multi-modal Foundation Models and AI Agents (TiFA)

ICML 2024 TiFA Workshop


Submission deadline: May 31, 2024, 12:00 UTC
Deadline imported from OpenReview; check the official website for extensions.
Submission portal: OpenReview
Notes: Topics were auto-suggested and may be imprecise; edits welcome.

Accepted papers (19)

Fetched from OpenReview (v2) on 2026-06-10.

Bias Begets Bias: the Impact of Biased Embeddings on Diffusion Models
Sahil Kuchlous, Marvin Li, Jeffrey George Wang · PDF
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity
Zhuo Zhi, Ziquan Liu, Moe Elbadawi, Adam Daneshmend, Mine Orlu, Abdul W Basit, Andreas Demosthenous, Miguel R. D. Rodrigues · PDF
Can Editing LLMs Inject Harm?
Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu · PDF
Chained Tuning Leads to Biased Forgetting
Megan Ung, Alicia Yi Sun, Samuel Bell, Levent Sagun, Adina Williams · PDF
Decomposed evaluations of geographic disparities in text-to-image models
Abhishek Sureddy, Dishant Padalia, Nandhinee Periyakaruppan, Oindrila Saha, Adina Williams, Adriana Romero-Soriano, Megan Richards, Polina Kirichenko, Melissa Hall · PDF
Games for AI-Control: Models of Safety Evaluations of AI Deployment Protocols
Charlie Griffin, Buck Shlegeris, Alessandro Abate · PDF
MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants
John Heibel, Daniel Lowd · PDF
Models That Prove Their Own Correctness
Noga Amit, Shafi Goldwasser, Orr Paradise, Guy N. Rothblum · PDF
On the Difficulty of Faithful Chain-of-Thought Reasoning in Large Language Models
Sree Harsha Tanneru, Dan Ley, Chirag Agarwal, Himabindu Lakkaraju · PDF
On the Multi-modal Vulnerability of Diffusion Models
Dingcheng Yang, Yang Bai, Xiaojun Jia, Yang Liu, Xiaochun Cao, Wenjian Yu · PDF
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni, Rui Ye, Yuxi Wei, Zhen Xiang, Yanfeng Wang, Siheng Chen · PDF
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Rishika Bhagwatkar, Shravan Nayak, Reza Bayat, Alexis Roger, Daniel Z Kaplan, Pouya Bashivan, Irina Rish · PDF
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Wenyue Hua, Xianjun Yang, Mingyu Jin, Zelong Li, Wei Cheng, Ruixiang Tang, Yongfeng Zhang · PDF
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine · PDF
VACoDe: Visual Augmented Contrastive Decoding
Sihyeon Kim, Boryeong Cho, Sangmin Bae, Sumyeong Ahn, Se-Young Yun · PDF
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs
Jinmin Li, Kuofeng Gao, Yang Bai, Jingyun Zhang, Shu-Tao Xia · PDF
Wasserstein Modality Alignment Makes Your Multimodal Transformer More Robust
Zhuo Zhi, Ziquan Liu, Qiangqiang Wu, Miguel R. D. Rodrigues · PDF
WebCanvas: Benchmarking Web Agents in Online Environments
Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou, Tongshuang Wu, Zhengyang Wu · PDF
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
Rylan Schaeffer, Hailey Schoelkopf, Brando Miranda, Gabriel Mukobi, Varun Madan, Adam Ibrahim, Herbie Bradley, Stella Biderman, Sanmi Koyejo · PDF
