ICML 2024 Past Large language models Theory
ICML 2024 Workshop on Theoretical Foundations of Foundation Models
TF2M 2024
- Submission deadline: Jun 1, 2024, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (58)
Fetched from OpenReview (v2) on 2026-06-10.
- A deeper look at depth pruning of LLMs
- A Theoretical Understanding of Self-Correction through In-context Alignment
- Active Preference Optimization for Sample Efficient RLHF
- Attention Is All You Need But You Don’t Need All Of It For Inference of Large Language Models
- Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement
- Decoding-Time Language Model Alignment with Multiple Objectives
- Detrimental Memories in Transfer Learning
- Do LLM Agents Have Regret? A Case Study in Online Learning and Games
- Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
- Efficient Document Ranking with Learnable Late Interactions
- Fast Machine Unlearning via Robust Training
- Fine-Tuning Large Language Models with User-Level Differential Privacy
- Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
- Hallmarks of Optimization Trajectories in Neural Networks and LLMs: Directional Exploration and Redundancy
- How Do Nonlinear Transformers Acquire Generalization-Guaranteed CoT Ability?
- How Do Transformers Fill in the Blanks? A Case Study on Matrix Completion
- How Transformers Learn Diverse Attention Correlations in Masked Vision Pretraining
- How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
- Implementability of Information Elicitation Mechanisms with Pre-Trained Language Models
- Implicit Optimization Bias of Next-token Prediction in Linear Models
- Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
- Importance Weighted Multi-Draft Speculative Sampling
- In-Context Learning from Training on Unstructured Data: The Role of Co-Occurrence, Positional Information, and Training Data Structure
- In-Context Learning with Representations: Contextual Generalization of Trained Transformers
- Local to Global: Learning Dynamics and Effect of Initialization for Transformers
- Meta-optimization for Deep Learning via Nonstochastic Control
- Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
- Modeling the Plurality of Human Preferences via Ideal Points
- Models That Prove Their Own Correctness
- MSAMamba: Adapting Subquadratic Models To Long-Context DNA MSA Analysis
- Multilingual Compression Parity: How Efficiently Large Language Models Represent Information Across Languages?
- On Provable Length and Compositional Generalization
- On the Power of Convolution Augmented Transformer
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization
- PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
- Preference Learning Algorithms Do Not Learn Preference Rankings
- Progressive distillation improves feature learning via implicit curriculum
- Rethinking Invariance in In-context Learning
- RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation
- SAIL: Self-improving Efficient Online Alignment of Large Language Models
- Self-Play Preference Optimization for Language Model Alignment
- Setting the Record Straight on Transformer Oversmoothing
- Sparse network initialization using deterministic Ramanujan graphs
- State Space Models are Comparable to Transformers in Estimating Functions with Dynamic Smoothness
- The Geometry of Categorical and Hierarchical Concepts in Large Language Models
- Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates
- Transformer Efficiently Learns Low-dimensional Target Functions In-context
- Transformers are Minimax Optimal Nonparametric In-Context Learners
- Transformers need glasses! Information over-squashing in language tasks
- Unavoidable Learning Constraints Alter the Foundations of Direct Preference Optimization
- Understanding and Minimising Outlier Features in Neural Network Training
- Understanding and Mitigating Tokenization Bias in Language Models
- Understanding the Role of Equivariance in Self-supervised Learning
- Unified Taxonomy in AI Safety: Watermarks, Adversarial Defenses, and Transferable Attacks
- Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models
- Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
- Zero-Shot Generalization of GNNs over Distinct Attribute Domains