ICLR 2025 · Past · Large language models · Efficiency · Optimization

First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models

SCOPE - ICLR 2025

Submission deadline: Feb 10, 2025, 12:05 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich (topics are keyword-guessed).

Accepted papers (56)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Unified Approach to Routing and Cascading for LLMs

    Jasper Dekoninck, Maximilian Baader, Martin Vechev · PDF
  2. Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention

    Zhendong Zhang · PDF
  3. Adaptive Length Image Tokenization via Recurrent Allocation

    Shivam Duggal, Phillip Isola, Antonio Torralba, William T. Freeman · PDF
  4. AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting

    Abdelhakim Benechehab, Vasilii Feofanov, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl · PDF
  5. AsymLoRA: Unlocking the Power of Multimodal LLMs via Asymmetric LoRA

    Xuyang Wei, Chunlin Tian, Li Li · PDF
  6. Attention Is All You Need For Mixture-of-Depths Routing

    Advait Gadhikar, Souptik Kumar Majumdar, Niclas Popp, Piyapat Saranrittichai, Martin Rapp, Lukas Schott · PDF
  7. ChameleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters

    Kamer Ali Yuksel, Hassan Sawaf · PDF
  8. Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models

    Andy Zhou, Ron Arel · PDF
  9. Conformal Transformations for Symmetric Power Transformers

    Saurabh Kumar, Jacob Buckman, Carles Gelada, Xiaowen Zhang · PDF
  10. Context Is All You Need: Efficient Retrieval Augmented Generation for Domain Specific AI

    Peixi Xiong, Chaunte W. Lacewell, Sameh Gobriel, Nilesh Jain · PDF
  11. DARS: Robust Sparse Fine-Tuning with Regularized Subspace Disalignment

    Sumin Park, Noseong Park · PDF
  12. DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs

    Zhen Tan, Daize Dong, Xinyu Zhao, Jianing Cai, Jie Peng, Yu Cheng, Tianlong Chen · PDF
  13. Domain-Invariant Prompt Learning for Vision-Language Models

    Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt · PDF
  14. Efficient Distributed Optimization under Heavy-Tailed Noise

    Su Hyeong Lee, Manzil Zaheer, Tian Li · PDF
  15. Efficient Open-set Test Time Adaptation of Vision Language Models

    Manogna Sreenivas, Soma Biswas · PDF
  16. Effortless Efficiency: Low-Cost Pruning of Diffusion Models

    Yang Zhang, Er Jin, Yanfei Dong, Ashkan Khakzar, Philip Torr, Johannes Stegmaier, Kenji Kawaguchi · PDF
  17. Enhanced Continual Learning of Vision-Language Models with Model Fusion

    Haoyuan Gao, Zicong Zhang, Yuqi Wei, Linglan Zhao, Guilin Li, Yexin Li, Linghe Kong, Weiran Huang · PDF
  18. Fast Gradient Computation for RoPE Attention in Almost Linear Time

    Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song · PDF
  19. FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models

    Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma · PDF
  20. Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations

    Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto · PDF
  21. Grams: Gradient Descent with Adaptive Momentum Scaling

    Yang Cao, Xiaoyu Li, Zhao Song · PDF
  22. Graph Low-Rank Adapters of High Regularity for Graph Neural Networks and Graph Transformers

    Pantelis Papageorgiou, Haitz Sáez de Ocáriz Borde, Anastasis Kratsios, Michael M. Bronstein · PDF
  23. In-batch Ensemble Drafting: Robust Speculative Decoding for LVLMs

    Minjae Lee, Wonjun Kang, Byeongkeun Ahn, Christian Classen, Minghao Yan, Hyung Il Koo, Kangwook Lee · PDF
  24. Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters

    Kevin Li, Sachin Goyal, João D. Semedo, J Zico Kolter · PDF
  25. Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning

    Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horváth, Praneeth Vepakomma · PDF
  26. KV Prediction for Improved Time to First Token

    Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi · PDF
  27. LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models

    Sihwan Park, Doohyuk Jang, Sung-Yub Kim, Souvik Kundu, Eunho Yang · PDF
  28. Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts

    Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu Cheng · PDF
  29. Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing

    Aviv Bick, Tobias Katsch, Nimit Sharad Sohoni, Arjun D Desai, Albert Gu · PDF
  30. Low-Rank Continual Personalization of Diffusion Models

    Łukasz Staniszewski, Katarzyna Zaleska, Kamil Deja · PDF
  31. M2R2: Efficient Transformers with Mixture of Multi-Rate Residuals

    Nikhil Bhendawade, Mahyar Najibi, Devang Naik, Irina Belousova · PDF
  32. Margin-aware Preference Optimization for Aligning Diffusion Models without Reference

    Jiwoo Hong, Sayak Paul, Noah Lee, Kashif Rasul, James Thorne, Jongheon Jeong · PDF
  33. MixER: Better Mixture of Experts Routing for Hierarchical Meta-Learning

    Roussel Desmond Nzoyem, Grant Stevens, Amarpal Sahota, David A.W. Barton, Tom Deakin · PDF
  34. Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity

    Weixin Liang, Junhong Shen, Genghan Zhang, Ning Dong, Luke Zettlemoyer, Lili Yu · PDF
  35. N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs

    Ilya Zisman, Alexander Nikulin, Viacheslav Sinii, Denis Tarasov, Lyubaykin Nikita, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov · PDF
  36. Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2

    Steven Abreu, Sumit Bam Shrestha, Rui-Jie Zhu, Jason Eshraghian · PDF
  37. On Vanishing Variance in Transformer Length Generalization

    Ruining Li, Gabrijel Boduljak, Jensen Zhou · PDF
  38. OPPA: OPtimizing PArallelism for Language Model Training

    Apivich Hemachandra, Yizhan Han, See-Kiong Ng, Bryan Kian Hsiang Low · PDF
  39. Overtrained Language Models Are Harder to Fine-Tune

    Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen, Tanishq Kumar, Xiang Yue, Sadhika Malladi, Graham Neubig, Aditi Raghunathan · PDF
  40. PENCIL: Long Thoughts with Short Memory

    Chenxiao Yang, Nathan Srebro, David McAllester, Zhiyuan Li · PDF
  41. QMambaExtend: Improving Long-Context Extension of Memory-Efficient Mamba Models

    Seyedarmin Azizi, Souvik Kundu, Mohammad Erfan Sadeghi, Massoud Pedram · PDF
  42. RecurFormer: Not All Transformer Heads Need Self-Attention

    Ruiqing Yan, Linghan Zheng, Xingbo Du, Han Zou, Yufeng Guo, Jianfei Yang · PDF
  43. Relevance Isn't All You Need: Scaling RAG Systems With Inference-Time Compute Via Multi-Criteria Reranking

    Will LeVine, Bijan Varjavand · PDF
  44. ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals

    Utkarsh Saxena, Sayeh Sharify, Kaushik Roy, Xin Wang · PDF
  45. Revisiting Associative Recall in Modern Recurrent Models

    Destiny Okpekpe, Antonio Orvieto · PDF
  46. SageAttention2: Efficient Attention with Smoothing Q and Per-thread Quantization

    Jintao Zhang, Haofeng Huang, Pengle Zhang, Jia Wei, Jun Zhu, Jianfei Chen · PDF
  47. SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

    Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu · PDF
  48. Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam

    Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xiang Li, Li Shen, Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu · PDF
  49. STIV: Scalable Text and Image Conditioned Video Generation

    Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang · PDF
  50. The Curse of Depth in Large Language Models

    Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu · PDF
  51. Towards Infinite-Long Prefix in Transformers

    Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang · PDF
  52. Training Domain Draft Models for Speculative Decoding: Best Practices and Insights

    Fenglu Hong, Ravi Shanker Raju, Jonathan Lingjie Li, Bo Li, Urmish Thakker, Avinash Ravichandran, Swayambhoo Jain, Changran Hu · PDF
  53. UniForm: A Reuse Attention Mechanism for Efficient Transformers on Resource-Constrained Edge Devices

    Seul-Ki Yeom, Tae-Ho Kim · PDF
  54. Universal LLM Routing with Correctness-Based Representation

    Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Jeevesh Juneja, Zifeng Wang, Chen-Yu Lee, Pradeep Shenoy, Rina Panigrahy, Aditya Krishna Menon, Sanjiv Kumar · PDF
  55. XAMBA: Enabling Efficient State Space Models on Resource-Constrained Neural Processing Units

    Arghadip Das, Arnab Raha, Shamik Kundu, Soumendu Kumar Ghosh, Deepak Mathaikutty, Vijay Raghunathan · PDF
  56. Yes, Q-learning Helps Offline In-Context RL

    Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Lyubaykin Nikita, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov · PDF