ICML 2024 · Past · Large language models · Theory

ICML 2024 Workshop on Theoretical Foundations of Foundation Models

TF2M 2024

Submission deadline: Jun 1, 2024, 11:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (58)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A deeper look at depth pruning of LLMs

    Shoaib Ahmed Siddiqui, Xin Dong, Greg Heinrich, Thomas Breuel, Jan Kautz, David Krueger, Pavlo Molchanov · PDF
  2. A Theoretical Understanding of Self-Correction through In-context Alignment

    Yifei Wang, Yuyang Wu, Zeming Wei, Stefanie Jegelka, Yisen Wang · PDF
  3. Active Preference Optimization for Sample Efficient RLHF

    Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury · PDF
  4. Attention Is All You Need But You Don’t Need All Of It For Inference of Large Language Models

    Georgy Tyukin, Gbetondji Jean-Sebastien Dovonon, Jean Kaddour, Pasquale Minervini · PDF
  5. Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement

    Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe · PDF
  6. Decoding-Time Language Model Alignment with Multiple Objectives

    Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon Shaolei Du · PDF
  7. Detrimental Memories in Transfer Learning

    Amal Alnouri, Timothy J Wroge, Bilal Alsallakh · PDF
  8. Do LLM Agents Have Regret? A Case Study in Online Learning and Games

    Chanwoo Park, Xiangyu Liu, Asuman E. Ozdaglar, Kaiqing Zhang · PDF
  9. Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers

    Yibo Jiang, Goutham Rajendran, Pradeep Kumar Ravikumar, Bryon Aragam · PDF
  10. Efficient Document Ranking with Learnable Late Interactions

    Himanshu Jain, Ziwei Ji, Sashank J. Reddi, Ankit Singh Rawat, Felix Yu, Aditya Krishna Menon, Sadeep Jayasumana · PDF
  11. Fast Machine Unlearning via Robust Training

    Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo · PDF
  12. Fine-Tuning Large Language Models with User-Level Differential Privacy

    Zachary Charles, Arun Ganesh, Ryan McKenna, Hugh Brendan McMahan, Nicole Elyse Mitchell, Krishna Pillutla, J Keith Rush · PDF
  13. Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models

    Adway Girish, Alliot Nagle, Ashok Vardhan Makkuva, Marco Bondaschi, Michael Gastpar, Hyeji Kim · PDF
  14. Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment

    Jiaxiang Li, Siliang Zeng, Hoi To Wai, Chenliang Li, Alfredo Garcia, Mingyi Hong · PDF
  15. Hallmarks of Optimization Trajectories in Neural Networks and LLMs: Directional Exploration and Redundancy

    Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf · PDF
  16. How Do Nonlinear Transformers Acquire Generalization-Guaranteed CoT Ability?

    Hongkang Li, Meng Wang, Songtao Lu, Xiaodong Cui, Pin-Yu Chen · PDF
  17. How Do Transformers Fill in the Blanks? A Case Study on Matrix Completion

    Pulkit Gopalani, Ekdeep Singh Lubana, Wei Hu · PDF
  18. How Transformers Learn Diverse Attention Correlations in Masked Vision Pretraining

    Yu Huang, Zixin Wen, Yuejie Chi, Yingbin Liang · PDF
  19. How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression

    Xingwu Chen, Lei Zhao, Difan Zou · PDF
  20. Implementability of Information Elicitation Mechanisms with Pre-Trained Language Models

    Zachary Robertson, Hannah Cha, Andrew Sheha, Sanmi Koyejo · PDF
  21. Implicit Optimization Bias of Next-token Prediction in Linear Models

    Christos Thrampoulidis · PDF
  22. Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems

    Bingcong Li, Liang Zhang, Niao He · PDF
  23. Importance Weighted Multi-Draft Speculative Sampling

    Ashish J Khisti, Arash Behravesh, Hassan Dbouk, Arash Behboodi, Roland Memisevic, Christos Louizos · PDF
  24. In-Context Learning from Training on Unstructured Data: The Role of Co-Occurrence, Positional Information, and Training Data Structure

    Kevin Christian Wibisono, Yixin Wang · PDF
  25. In-Context Learning with Representations: Contextual Generalization of Trained Transformers

    Tong Yang, Yu Huang, Yingbin Liang, Yuejie Chi · PDF
  26. Local to Global: Learning Dynamics and Effect of Initialization for Transformers

    Ashok Vardhan Makkuva, Marco Bondaschi, Chanakya Ekbote, Adway Girish, Alliot Nagle, Hyeji Kim, Michael Gastpar · PDF
  27. Meta-optimization for Deep Learning via Nonstochastic Control

    Xinyi Chen, Evan Dogariu, Zhou Lu, Elad Hazan · PDF
  28. Mission Impossible: A Statistical Perspective on Jailbreaking LLMs

    Jingtong Su, Julia Kempe, Karen Ullrich · PDF
  29. Modeling the Plurality of Human Preferences via Ideal Points

    Daiwei Chen, Yi Chen, Aniket Rege, Ramya Korlakai Vinayak · PDF
  30. Models That Prove Their Own Correctness

    Noga Amit, Shafi Goldwasser, Orr Paradise, Guy N. Rothblum · PDF
  31. MSAMamba: Adapting Subquadratic Models To Long-Context DNA MSA Analysis

    Vishrut Thoutam, Dina Ellsworth · PDF
  32. Multilingual Compression Parity: How Efficiently Large Language Models Represent Information Across Languages?

    Alexander Tsvetkov, Alon Kipnis · PDF
  33. On Provable Length and Compositional Generalization

    Kartik Ahuja, Amin Mansouri · PDF
  34. On the Power of Convolution Augmented Transformer

    Mingchen Li, Xuechen Zhang, Yixiao Huang, Samet Oymak · PDF
  35. One-Shot Safety Alignment for Large Language Models via Optimal Dualization

    Xinmeng Huang, Shuo Li, Edgar Dobriban, Osbert Bastani, Hamed Hassani, Dongsheng Ding · PDF
  36. PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models

    Junsong Chen, Simian Luo, Enze Xie · PDF
  37. Preference Learning Algorithms Do Not Learn Preference Rankings

    Angelica Chen, Sadhika Malladi, Lily H Zhang, Xinyi Chen, Qiuyi Zhang, Rajesh Ranganath, Kyunghyun Cho · PDF
  38. Progressive distillation improves feature learning via implicit curriculum

    Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel · PDF
  39. Rethinking Invariance in In-context Learning

    Lizhe Fang, Yifei Wang, Khashayar Gatmiry, Lei Fang, Yisen Wang · PDF
  40. RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation

    Chanwoo Park, Mingyang Liu, Dingwen Kong, Kaiqing Zhang, Asuman E. Ozdaglar · PDF
  41. SAIL: Self-improving Efficient Online Alignment of Large Language Models

    Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit Bedi, Furong Huang · PDF
  42. Self-Play Preference Optimization for Language Model Alignment

    Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu · PDF
  43. Setting the Record Straight on Transformer Oversmoothing

    Gbetondji Jean-Sebastien Dovonon, Michael M. Bronstein, Matt Kusner · PDF
  44. Sparse network initialization using deterministic Ramanujan graphs

    Arindam Biswas, Suryam Arnav Kalra, Pabitra Mitra, Biswajit Basu · PDF
  45. State Space Models are Comparable to Transformers in Estimating Functions with Dynamic Smoothness

    Naoki Nishikawa, Taiji Suzuki · PDF
  46. The Geometry of Categorical and Hierarchical Concepts in Large Language Models

    Kiho Park, Yo Joong Choe, Yibo Jiang, Victor Veitch · PDF
  47. Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates

    Afrin Dange, Vaibhav Raj, Praneeth Netrapalli, Sunita Sarawagi · PDF
  48. Transformer Efficiently Learns Low-dimensional Target Functions In-context

    Yujin Song, Denny Wu, Kazusato Oko, Taiji Suzuki · PDF
  49. Transformers are Minimax Optimal Nonparametric In-Context Learners

    Juno Kim, Tai Nakamaki, Taiji Suzuki · PDF
  50. Transformers need glasses! Information over-squashing in language tasks

    Federico Barbero, Andrea Banino, Steven Kapturowski, Dharshan Kumaran, João Guilherme Madeira Araújo, Alex Vitvitskyi, Razvan Pascanu, Petar Veličković · PDF
  51. Unavoidable Learning Constraints Alter the Foundations of Direct Preference Optimization

    David Wipf · PDF
  52. Understanding and Minimising Outlier Features in Neural Network Training

    Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann · PDF
  53. Understanding and Mitigating Tokenization Bias in Language Models

    Buu Phan, Marton Havasi, Matthew J. Muckley, Karen Ullrich · PDF
  54. Understanding the Role of Equivariance in Self-supervised Learning

    Yifei Wang, Kaiwen Hu, Sharut Gupta, Ziyu Ye, Yisen Wang, Stefanie Jegelka · PDF
  55. Unified Taxonomy in AI Safety: Watermarks, Adversarial Defenses, and Transferable Attacks

    Grzegorz Gluch, Sai Ganesh Nagarajan, Berkant Turan · PDF
  56. Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models

    Sanae Lotfi, Yilun Kuang, Marc Anton Finzi, Brandon Amos, Micah Goldblum, Andrew Gordon Wilson · PDF
  57. Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers

    Siyu Chen, Heejune Sheen, Tianhao Wang, Zhuoran Yang · PDF
  58. Zero-Shot Generalization of GNNs over Distinct Attribute Domains

    Yangyi Shen, Beatrice Bevilacqua, Joshua Robinson, Charilaos Kanatsoulis, Jure Leskovec, Bruno Ribeiro · PDF