NeurIPS 2025 · Past · Math & reasoning · Efficiency

NeurIPS 2025 Workshop on Efficient Reasoning

NeurIPS 2025 ER Workshop

Submission deadline: Oct 2, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (223)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Cooperation Index for Model Pruning

    Do-Hoon Kim, Jay Myung, Yung-Kyun Noh · PDF
  2. A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models

    Junjie Zhang, Guozheng Ma, Shunyu Liu, Haoyu Wang, Jiaxing Huang, Ting-En Lin, Fei Huang, Yongbin Li, Dacheng Tao · PDF
  3. Activation Steering for Chain-of-Thought Compression

    Seyedarmin Azizi, Erfan Baghaei Potraghloo, Souvik Kundu, Massoud Pedram · PDF
  4. Active Inference Control: Steering, Not Just Scaling, Language Model Reasoning

    Josh Karthikeyan, Kai Fu, Derek Jiu, Ryan Lagasse, Kevin Zhu · PDF
  5. AdaptDistill: Improving Small Language Models with Skill-Aware Teaching

    Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora · PDF
  6. AdaptInfer: Adaptive Token Pruning for Vision–Language Model Inference with Dynamical Text Guidance

    Weichen Zhang, Zhui Zhu, Kebin Liu, Yunhao Liu · PDF
  7. Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning

    YuJian Zhang, Keyu Chen, Zhifeng Shen, Ruizhi Qiao, Xing Sun · PDF
  8. Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models

    Vaskar Nath, Elaine Lau, Anisha Gunjal, Manasi Sharma, Nikhil Barhate, Sean M. Hendryx · PDF
  9. Agentic NL2SQL to Reduce Computational Costs

    Dominik Jehle, Lennart Purucker, Frank Hutter · PDF
  10. AGENTIQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation

    Omid Reza Heidari, Siobhan Reid, Yassine Yaakoubi · PDF
  11. Amortized Latent Steering: Low-Cost Alternative to Test-Time Optimization

    Nathan Egbuna, Saatvik Gaur, Kevin Zhu, Sunishchal Dev, Ashwinee Panda, Maheep Chaudhary · PDF
  12. An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Multimodal Reasoning Models

    Changwoo Baek, Jouwon Song, Sohyeon Kim, Kyeongbo Kong · PDF
  13. Analysis of Emergence of Reasoning in Language Models: Factors, Thresholds and Interpretations

    Yen-Che Hsiao, Abhishek Dutta · PDF
  14. Are We Scaling the Right Thing? A System Perspective on Test-Time Scaling

    Youpeng Zhao, Jinpeng Lv, Di Wu, Jun Wang · PDF
  15. ARM: Adaptive Reasoning Model

    Siye Wu, Jian Xie, Yikai Zhang, Aili Chen, Kai Zhang, Yu Su, Yanghua Xiao · PDF
  16. Attention Guided Alignment in Efficient Vision-Language Models

    Shweta Mahajan, Hoang Le, Hyojin Park, Farzad Farhadzadeh, Munawar Hayat, Fatih Porikli · PDF
  17. AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

    Feng Luo, Yu-Neng Chuang, Guanchu Wang, Hoang Anh Duy Le, Shaochen Zhong, Hongyi Liu, Jiayi Yuan, Yang Sui, Vladimir Braverman, Vipin Chaudhary, Xia Hu · PDF
  18. Bayesian Social Deduction with Graph-Informed Language Models

    Shahab Rahimirad, Guven Gergerli, Lucia Romero, Angela Qian, Matthew Lyle Olson, Simon Stepputtis, Joseph Campbell · PDF
  19. BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation

    Eunsu Kim, Haneul Yoo, Guijin Son, Hitesh Laxmichand Patel, Amit Agarwal, Alice Oh · PDF
  20. Beyond Static Cutoffs: One-Shot Dynamic Thresholding for Diffusion Language Models

    Jucheng Shen, Yeonju Ro · PDF
  21. Boundary Guidance for Efficient 3D CT Vision–Language Reasoning

    Soo Yong Kim · PDF
  22. Breadcrumbs Reasoning: Memory-Efficient Reasoning with Compression Beacons

    Giovanni Monea, Yair Feldman, Shankar Padmanabhan, Kianté Brantley, Yoav Artzi · PDF
  23. Budget-aware Test-time Scaling via Discriminative Verification

    Kyle Montgomery, Sijun Tan, Yuqi Chen, Siyuan Zhuang, Tianjun Zhang, Raluca Ada Popa, Chenguang Wang · PDF
  24. Calibrated Reasoning: An Explanatory Verifier for Dynamic and Efficient Problem-Solving

    Anisha Garg, Engin Tekin, Yash More, David Bick, Ganesh Venkatesh · PDF
  25. Can Explanations Improve Recommendations? A Joint Optimization with LLM Reasoning

    Yuyan Wang, Pan Li, Minmin Chen · PDF
  26. CaRT: Teaching LLM Agents to Know When They Know Enough

    Grace Liu, Yuxiao Qu, Jeff Schneider, Aarti Singh, Aviral Kumar · PDF
  27. CATS: Category-Aware Token-level Steering for Training-Free Redundancy Reduction in Large Reasoning Models

    Zhang Mengfei, Zhenglin Wang · PDF
  28. Causal Reflection with Language Models

    Abi Aryan, Zac Yung-Chun Liu · PDF
  29. CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency

    Ehsan Aghazadeh, Ahmad Ghasemi, Hedyeh Beyhaghi, Hossein Pishro-Nik · PDF
  30. Chopping Trees: Semantic Similarity Based Dynamic Pruning for Tree-of-Thought Reasoning

    Joongho Kim, Xirui Huang, Zarreen Reza, Gabriel Grand, Kevin Zhu, Ryan Lagasse · PDF
  31. Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner

    Cai Zhou, Chenxiao Yang, Yi Hu, Chenyu Wang, Chubin Zhang, Muhan Zhang, Lester Mackey, Tommi Jaakkola, Stephen Bates, Dinghuai Zhang · PDF
  32. COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens

    Eugene Kwek, Wenpeng Yin · PDF
  33. Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision

    Dulhan Jayalath, Shashwat Goel, Thomas Foster, Parag Jain, Suchin Gururangan, Cheng Zhang, Anirudh Goyal, Alan Schelten · PDF
  34. Compute When Worth It: Risk Control for Reasoning on a Compute Budget

    Anushri Suresh, Alvin Zhang, Rishi More, William Jurayj, Benjamin Van Durme, Eric Nalisnick, Daniel Khashabi · PDF
  35. Confidence-Coverage Gating for Early Exit

    Aaroosh Rustagi, Hsien Xin Peng, Khushal Murthy, Attrey Koul, Ryan Lagasse, Kevin Zhu · PDF
  36. ConstrainedSQL: Training LLMs for Text2SQL via Constrained Reinforcement Learning

    Weiqin Chen, Nhan H Pham, Michael Glass, Long H. Vu, Gaetano Rossiello, Dharmashankar Subramanian, Santiago Paternain · PDF
  37. Correct Reasoning Paths Visit Shared Decision Pivots

    Dongkyu Cho, Amy B.Z. Zhang, Bilel Fehri, Sheng Wang, Rumi Chunara, Hengrui Cai, Rui Song · PDF
  38. DA-CoTD: Efficient Chain-of-Thought Reasoning with Difficulty-Aware CoT-Distillation

    Abdul Waheed, Chancharik Mitra, Laurie Z. Wang · PDF
  39. DAG-Math: Graph-Guided Mathematical Reasoning in LLMs

    Yuanhe Zhang, Ilja Kuzborskij, Jason D. Lee, Chenlei Leng, Fanghui Liu · PDF
  40. Data Diversification Methods In Alignment Enhance Math Performance In LLMs

    Berkan Dokmeci, Qingyang Wu, Ben Athiwaratkun, Ce Zhang, Shuaiwen Leon Song, James Zou · PDF
  41. Data Scaling Isn't Enough: Towards Improving Compositional Reasoning in Video-Language Models

    Kibum Kim, Kyle Min, Chanyoung Park · PDF
  42. Decomposing Reasoning Efficiency in Large Language Models

    Daniel Kaiser, Ali Ramezani-Kebrya, Arnoldo Frigessi, Benjamin Ricaud · PDF
  43. Deep Think with Confidence

    Yichao Fu, Xuewei Wang, Yuandong Tian, Jiawei Zhao · PDF
  44. Delta Activations: A Representation for Finetuned Large Language Models

    Zhiqiu Xu, Amish Sethi, Mayur Naik, Ser-Nam Lim · PDF
  45. Demystifying and Enhancing the Efficiency of Interleaved Reasoning-Search LLM Agents

    Tiannuo Yang, Zebin Yao, Bowen Jin, Lixiao Cui, Yusen Li, Gang Wang, Xiaoguang Liu, Willie Neiswanger · PDF
  46. Demystifying Delays in Reasoning: A Pilot Temporal and Token Analysis of Reasoning Systems

    Qi Qi, Reyna Abhyankar, Yiying Zhang · PDF
  47. DHP: Discrete Hierarchical Planning for HRL Agents

    Shashank Sharma, Janina Anna Hoffmann, Vinay P. Namboodiri · PDF
  48. DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning

    Hanyang Zhao, Dawen Liang, Wenpin Tang, David Yao, Nathan Kallus · PDF
  49. Diffusion Language Models Know the Answer Before Decoding

    Pengxiang Li, Yefan Zhou, Dilxat Muhtar, Lu Yin, Shilin Yan, Li Shen, Yi Liang, Soroush Vosoughi, Shiwei Liu · PDF
  50. DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

    Yuhang Zhou, Jing Zhu, Shengyi Qian, Zhuokai Zhao, Xiyao Wang, Xiaoyu Liu, Ming Li, Paiheng Xu, Wei Ai, Furong Huang · PDF
  51. Distilling Multi-modal Large Language Models for Autonomous Driving

    Deepti Hegde, Rajeev Yasarla, Hong Cai, Shizhong Han, Apratim Bhattacharyya, Shweta Mahajan, Litian Liu, Risheek Garrepalli, Vishal M. Patel, Fatih Porikli · PDF
  52. DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification

    Ziyi Wang, Siva Rajesh Kasa, Ankith M S, Santhosh Kumar Kasa, Jiaru Zou, Nan Jiang, Sumit Negi, Ruqi Zhang, Qifan Song · PDF
  53. DMORE: Differentiable Mixture-of-Reasoning-Experts with Uncertainty-Guided Multi-Level Routing

    Roman Sultimov, Aleksandr Volkov, Mariia Kovalchuk, Yury Maximov · PDF
  54. Don’t Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning

    Yuehan Qin, Li Li, Yi Nian, Xinyan Velocity Yu, Yue Zhao, Xuezhe Ma · PDF
  55. DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization

    Yan Chen, Gang Li · PDF
  56. DTS: Enhancing Large Reasoning Models via Decoding Tree Sketching

    Zicheng Xu, Guanchu Wang, Yu-Neng Chuang, Guangyao Zheng, Alex Szalay, Zirui Liu, Vladimir Braverman · PDF
  57. Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning

    Jillian Xu, Dylan Zhou, Vinay Shukla, Yang Yang, Junrui Ruan, Shuhuai Lin, Wenfei Zou, Yinxiao Liu, Karthik Lakshmanan · PDF
  58. e1: Learning Adaptive Control of Reasoning Effort

    Michael Kleinman, Matthew Trager, Alessandro Achille, Wei Xia, Stefano Soatto · PDF
  59. EcoSpa: Efficient Transformer Training with Coupled Sparsity

    Jinqi Xiao, Cheng Luo, Lingyi Huang, Cheng Yang, Yang Sui, Huy Phan, Xiao Zang, Yibiao Ying, Anima Anandkumar, Bo Yuan · PDF
  60. Efficient Long CoT Reasoning in Small Language Models

    Zhaoyang Wang, Jinqi Jiang, Tian Qiu, Hui Liu, Xianfeng Tang, Huaxiu Yao · PDF
  61. Efficient Parallel Samplers for Recurrent-Depth Models and Their Connections to Diffusion Language Models

    Jonas Geiping, Xinyu Yang, Guinan Su · PDF
  62. Efficient Post-Training for Industry-Specialized Reasoning in Small Language Models

    Bill Cai, Sheldon Liu, Tatsuo Azeyanagi, Tomal Deb · PDF
  63. Efficient Reasoning at Fixed Test-Time Cost via Length-Aware Attention Priors and Gain-Aware Training

    Rian Atri · PDF
  64. Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

    Taiwei Shi, Yiyang Wu, Linxin Song, Tianyi Zhou, Jieyu Zhao · PDF
  65. Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration

    Yan Sun, Jia Guo, Stanley Kok, Zihao Wang, Zujie Wen, Zhiqiang Zhang · PDF
  66. Efficient RL Training for Reasoning Models via Length-Aware Optimization

    Danlong Yuan, Tian Xie, Shaohan Huang, Zhuocheng Gong, Huishuai Zhang, Chong Luo, Furu Wei, Dongyan Zhao · PDF
  67. Efficient Sparse Decoding for Test-Time Scaling with KV Cache Disaggregation and Asynchronism

    Shuqing Luo, Yilin Guan, Hanrui Wang, Tianlong Chen · PDF
  68. Efficient Test-Time Scaling via Self-Calibration

    Chengsong Huang, Langlin Huang, Jixuan Leng, Jiacheng Liu, Jiaxin Huang · PDF
  69. Episode-Level Multimodal KV Caching for Embodied Question Answering

    Hyobin Ong, Minsu Jang · PDF
  70. Evaluating the Safety and Skill Reasoning of Large Reasoning Models Under Compute Constraints

    Adarsha Balaji, Le Chen, Rajeev Thakur, Franck Cappello, Sandeep Madireddy · PDF
  71. EWoRA: Expert Weighted Low-Rank Adaptation for Reasoning over Heterogeneous Data

    Harsh Kohli, Helian Feng, Lenon Minorics, Bhoomit Vasani, Xin He, Ali Kebarighotbi · PDF
  72. Extending AutoCompressors via Surprisal-Based Dynamic Segmentation

    Srivishnu Ramamurthi, Richard Xu, Raine Ma, Dawson Park, David Guo, Charles Duong, Vasu Sharma, Sean O'Brien, Kevin Zhu · PDF
  73. Feature-Level Knowledge Distillation from LMM for Enhanced Image Classification

    Bumsu Jang, Heechul Jung · PDF
  74. Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI

    Lorenzo Giusti, Ole Anton Werner, Riccardo Taiello, Matilde Carvalho Costa, Emre Tosun, Andrea Protani, Marc Molina, Rodrigo Lopes de Almeida, Paolo Cacace, Diogo Reis Santos, Luigi Serio · PDF
  75. Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection

    Jack Butler, Nikita Kozodoi, Zainab Afolabi · PDF
  76. Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

    Sheng Liu, Tianlang Chen, Pan Lu, Haotian Ye, Yizheng Chen, Lei Xing, James Zou · PDF
  77. Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models

    Shutong Wu, Jiawei Zhang · PDF
  78. From Evidence to Trajectory: Abductive Reasoning Path Synthesis for Retrieval-Augmented Generation Agents Development

    Muzhi Li, Jinhu Qi, Yihong Wu, Minghao Zhao, Liheng Ma, Yifan Li, Xinyu Wang, Yingxue Zhang, Ho-fung Leung, Irwin King · PDF
  79. From Long to Short: LLMs Excel at Trimming Own Reasoning Chains

    Wei Han, Geng Zhan, Sicheng Yu, Chenyu Wang, Bryan Hooi · PDF
  80. FrugalRAG: Learning to retrieve and reason for multi-hop QA

    Abhinav Java, Srivathsan Koundinyan, Nagarajan Natarajan, Amit Sharma · PDF
  81. GEAR-X: Expanders for Next-Gen KV Cache Compression

    Vivek Mirani, Garima Bansal, Pabitra Mitra, Arindam Biswas, Amaljith EV · PDF
  82. Generalized Parallel Scaling with Interdependent Generations

    Harry Dong, David Brandfonbrener, Eryk Helenowski, Yun He, Mrinal Kumar, Han Fang, Yuejie Chi, Karthik Abinav Sankararaman · PDF
  83. Generating Domain Specific Natural Language SAT Reasoning Datasets

    Sunandita Patra, Keshav Ramani, Daniel Borrajo, Sriram Gopalakrishnan · PDF
  84. Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets

    Benjamin Pikus, Pratyush Ranjan Tiwari, Burton Ye · PDF
  85. Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning

    Shuyao Xu, Cheng Peng, Jiangxuan Long, Weidi Xu, Wei Chu, Yuan Qi · PDF
  86. Hierarchical Planning Agent for Web-Browsing Tasks

    Elita Lobo, Xu Chen, Jingjing Meng, Nan Xi, Yang Jiao, Yanhui Guo, Zhishen Huang, Yan Gao · PDF
  87. Hold Onto That Thought: Assessing KV Cache Compression On Reasoning

    Minghui Liu, Aadi Palnitkar, Tahseen Rabbani, Hyunwoo Jae, Kyle Rui Sang, Dixi Yao, Shayan Shabihi, Fuheng Zhao, Tian Li, Ce Zhang, Furong Huang, Kunpeng Zhang · PDF
  88. How Does RL Induce Skill Composition? A Case Study Using Countdown

    Simon Park, Simran Kaur, Sanjeev Arora · PDF
  89. How Far Can SLMs Go Without 'Thinking' in the LLM-as-a-Judge Paradigm?

    Pratik Sridatt Jayarao, Himanshu Gupta, Neeraj Varshney, Chaitanya Dwivedi · PDF
  90. How Weight Pruning Destroys Chain-of-Thought Reasoning in Language Reasoning Models: A Model Similarity and Faithfulness Correlation Analysis

    Avinash Kumar Sharma, Tushar Shinde · PDF
  91. HybridCoT: Interleaving Latent and Text Chain-of-Thought for Efficient Reasoning

    Shannon Zejiang Shen, Rulin Shao, Chenyu Wang, Songlin Yang, Vincent-Pierre Berges, Gargi Ghosh, Pang Wei Koh, Luke Zettlemoyer, Yoon Kim, Jason E Weston, David Sontag, Wen-tau Yih · PDF
  92. Hydra: A Modular Architecture for Efficient Long-Context Reasoning

    Siddharth Chaudhary, Dev Patel, Maheep Chaudhary, Bennett Browning · PDF
  93. Improving LLM Reasoning under Uncertainty with Coach-Player Multi-agent

    Heewon Park, Minhae Kwon · PDF
  94. In Good GRACEs: Principled Teacher Selection for Knowledge Distillation

    Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Sham M. Kakade, Surbhi Goel · PDF
  95. In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

    Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, Pan Lu · PDF
  96. Inference-Time Chain-of-Thought Pruning with Latent Informativeness Signals

    Sophie Li, Nicholas Huang, Nayan Saxena, Nina Luo, Vincent Lin, Kevin Zhu, Sunishchal Dev · PDF
  97. Influence Functions for Efficient Data Selection in Reasoning

    Prateek Humane, Paolo Cudrano, Daniel Z Kaplan, Matteo Matteucci, Supriyo Chakraborty, Irina Rish · PDF
  98. Information-Theoretic Bounds on Multi-Step Reasoning: When is Chain-of-Thought Provably Necessary?

    Karthik Srikumar · PDF
  99. Inpainting-Guided Policy Optimization for Diffusion Large Language Models

    Siyan Zhao, Mengchen Liu, Jing Huang, Miao Liu, Chenyu Wang, Bo Liu, Yuandong Tian, Guan Pang, Sean Bell, Aditya Grover, Feiyu Chen · PDF
  100. Instance-Adaptive Inference-Time Scaling with Calibrated Process Reward Models

    Young-Jin Park, Kristjan Greenewald, Kaveh Alim, Hao Wang, Navid Azizan · PDF
  101. Internal Value Functions: Leveraging Hidden States for Efficient Test-Time Scaling in Large Reasoning Models

    Khiem Pham, Sai Muralidhar Jayanthi, Saket Dingliwal, Bhavana Ganesh, Karthik Valmeekam, Xiangchen Song, Vivek Govindan, Beidi Chen, Sravan Babu Bodapati, Aram Galstyan · PDF
  102. iOS as Acceleration

    Alexander Kai Chen · PDF
  103. It Takes Two: Your GRPO Is Secretly DPO

    Yihong Wu, Liheng Ma, Lei Ding, Muzhi Li, Xinyu Wang, Kejia Chen, Zhan Su, Zhanguang Zhang, Chenyang Huang, Yingxue Zhang, Mark Coates, Jian-Yun Nie · PDF
  104. Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning

    Violet Xiang, Chase Blagden, Rafael Rafailov, Nathan Lile, Sang T. Truong, Chelsea Finn, Nick Haber · PDF
  105. Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents

    Rikhil Tanugula, Dheeraj Chintapalli, Sunkalp Chandra · PDF
  106. LayerMerge: Modality-Agnostic Depth Pruning for Efficient Foundation Model Deployment

    Arjun Choudhry, Chang Liu, Nina Żukowska, Yifu Cai, Mononito Goswami, Artur Dubrawski · PDF
  107. Learnable Adaptive KV-cache Compression

    Erik Arakelyan, Boris Ginsburg · PDF
  108. Learning to Reason Across Parallel Samples for LLM Reasoning

    Jianing Qi, Xi Ye, Hao Tang, Zhigang Zhu, Eunsol Choi · PDF
  109. Learning to Reason via Mixture-of-Thought for Logical Reasoning

    Tong Zheng, Lichang Chen, Simeng Han, R. Thomas McCoy, Heng Huang · PDF
  110. Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training

    Junlin Han, Shengbang Tong, David Fan, Yufan Ren, Koustuv Sinha, Philip Torr, Filippos Kokkinos · PDF
  111. Less is Not Worse: Effective Reasoning Without Complete Reasoning Chains

    Jaehui Hwang, Sangdoo Yun, Byeongho Heo, Dongyoon Han · PDF
  112. Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains

    Soumya Rani Samineni, Durgesh Kalwar, Vardaan Gangal, Siddhant Bhambri, Subbarao Kambhampati · PDF
  113. LOGCA: Layer-Optimized GPU-CPU Allocation for Efficient Resource Management in Large-Scale Models

    Zichen Song · PDF
  114. Logit–Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning

    Mohammad Atif Quamar, Mohammad Areeb · PDF
  115. Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs

    Siheng Xiong, Joe Zou, Faramarz Fekri, Yae Jee Cho · PDF
  116. LoRA-Guided PPO for Cost-Aware and Compute-Efficient Agent Orchestration

    Aneesh Durai, Joshua Cong Hu, Kevaan Buch, Kevin Zhu, Vasu Sharma, Aishwarya Balwani · PDF
  117. LSPO: Length-aware Dynamic Sampling for Policy Optimization in LLM Reasoning

    Weizhe Chen, Sven Koenig, Bistra Dilkina · PDF
  118. M-GRPO: Stabilizing Self-Supervised Reinforcement Learning for Large Language Models with Momentum-Anchored Policy Optimization

    Bizhe Bai, Hongming Wu, Peng Ye, Tao Chen · PDF
  119. M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

    Junxiong Wang, Wen-Ding Li, Daniele Paliotta, Daniel Ritter, Alexander M Rush, Tri Dao · PDF
  120. Mechanistic Interpretability of GPT-2: Lexical and Contextual Layers in Sentiment Analysis

    Amartya Hatua · PDF
  121. MetroRLHF: Enabling Memory-Effective Training for On-Policy RLHF via Adaptive Sequence Streaming

    Wei Cui · PDF
  122. Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery

    Jiaqi Liu, Songning Lai, Pengze Li, Di Yu, Zhou Wenjie, Yiyang Zhou, Peng Xia, Zijun Wang, Xi Chen, Shixiang Tang, Lei Bai, Wanli Ouyang, Mingyu Ding, Huaxiu Yao, Aoran Wang · PDF
  123. Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing

    Piotr Piękos, Róbert Csordás, Jürgen Schmidhuber · PDF
  124. MLM: Multi-linguistic LoRA Merging

    Jung Lee, Taero Kim, Nikhil Verma · PDF
  125. Mode-conditioning unlocks superior test-time compute scaling

    Chen Henry Wu, Sachin Goyal, Aditi Raghunathan · PDF
  126. Multi-Head Low-Rank Attention

    Songtao Liu, Hongwu Peng, Zhiwei Zhang, Zhengyu Chen, Yue Guo · PDF
  127. MultiGA: Leveraging Multi-Source Seeding in Genetic Algorithms

    Isabelle Diana May-Xin Ng, Tharindu Cyril Weerasooriya, Haitao Zhu, Wei Wei · PDF
  128. Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models

    Tan-Hanh Pham, Chris Ngo · PDF
  129. Muon: Training and Trade-offs with Latent Attention and MoE

    Sushant Mehta, Raj Dandekar, Rajat Dandekar, Sreedath Panat · PDF
  130. NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks

    Yang Li, Youssef Emad, Karthik Padthe, Jack Lanchantin, Weizhe Yuan, Thao Nguyen, Jason E Weston, Shang-Wen Li, Dong Wang, Ilia Kulikov, Xian Li · PDF
  131. Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

    Maxime Heuillet, Yufei Cui, Boxing Chen, Audrey Durand, Prasanna Parthasarathi · PDF
  132. No Question, No Passage, No Problem: Investigating Artifact Exploitation and Reasoning in Multiple-Choice Reading Comprehension

    Anthony Cui, Rohan Raj Butani, Theodore Oltean · PDF
  133. Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models

    Junhyuck Kim, Ethan Ewer, Taehong Moon, Jongho Park, Dimitris Papailiopoulos · PDF
  134. Not All Thoughts Matter: Selective Attention for Efficient Reasoning

    Hao Tang, Guoqing Zheng, Kanishk Gandhi, Harkirat Behl, Vaishnavi Shrivastava, Mojan Javaheripi, Kevin Ellis, Shivam Garg, Dimitris Papailiopoulos · PDF
  135. OckBench: Tokens are Not to Be Multiplied without Necessity

    Zheng Du, Hao Kang, Song Han, Tushar Krishna, Ligeng Zhu · PDF
  136. Off-Trajectory Reasoning: Can LRMs Collaborate on Reasoning Trajectory?

    Aochong Oliver Li, Tanya Goyal · PDF
  137. On the Role of Temperature Sampling in Test-Time Scaling

    Yuheng Wu, Thierry Tambe · PDF
  138. On the Rollout-Training Mismatch in Modern RL Systems

    Feng Yao, Liyuan Liu, Dinghuai Zhang, Chengyu Dong, Jingbo Shang, Jianfeng Gao · PDF
  139. One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

    Yiyuan Li, Zhen Huang, Yanan Wu, Weixun Wang, Xuefeng Li, Yijia Luo, Pengfei Liu, Wenbo Su, Bo Zheng · PDF
  140. One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning

    Ritesh Goru, Shanay Mehta, Prateek Jain · PDF
  141. Optimal Self-Consistency for Efficient Reasoning with Large Language Models

    Austin Feng, Marius Alonso, Ambroise Odonnat, Vasilii Feofanov, Ievgen Redko · PDF
  142. OptimalThinkingBench: Evaluating Over and Underthinking in LLMs

    Pranjal Aggarwal, Seungone Kim, Jack Lanchantin, Sean Welleck, Jason E Weston, Ilia Kulikov, Swarnadeep Saha · PDF
  143. Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

    Bo Zhao, Berkcan Kapusuzoglu, Kartik Balasubramaniam, Sambit Sahu, Supriyo Chakraborty, Genta Indra Winata · PDF
  144. ORPO-Distill: Mixed-Policy Preference Optimization for Cross-Architecture LLM Distillation

    Aasheesh Singh, Vishal Vaddina, Dagnachew Birru · PDF
  145. Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

    Tong Zheng, Hongming Zhang, Wenhao Yu, Xiaoyang Wang, He Xing, Runpeng Dai, Rui Liu, Huiwen Bao, Chengsong Huang, Heng Huang, Dong Yu · PDF
  146. Pay-Per-Search Models are Abstention Models

    Mustafa Omer Gul, Claire Cardie, Tanya Goyal · PDF
  147. Performative Thinking? The Brittle Correlation Between CoT Length and Problem Complexity

    Vardhan Palod, Karthik Valmeekam, Kaya Stechly, Subbarao Kambhampati · PDF
  148. PHLoRA: data-free Post-hoc Low-Rank Adapter extraction from full-rank checkpoint

    Bhoomit Vasani, Jack FitzGerald, Anjie Fang, Sushmit Vaish · PDF
  149. PosS: Position Specialist Generates Better Draft for Speculative Decoding

    Langlin Huang, Chengsong Huang, Jixuan Leng, Di Huang, Jiaxin Huang · PDF
  150. PREMISE: Scalable and Strategic Prompt Optimization for Efficient Mathematical Reasoning in Large Reasoning Models

    Ye Yu, Yaoning Yu, Haibo Jin, Haohan Wang · PDF
  151. Probe-Rewrite-Evaluate: A Workflow for Reliable Benchmarks and Quantifying Evaluation Awareness

    Lang Xiong, Nishant Bhargava, Jeremy Chang, Jianhang Hong, Haihao Liu, Vasu Sharma, Kevin Zhu · PDF
  152. ProofSketch: Efficient Verified Reasoning for Large Language Models

    Disha Sheshanarayana, Tanishka Magar · PDF
  153. ProRefine: Inference-time Prompt Refinement with Textual Feedback

    Deepak Pandita, Tharindu Cyril Weerasooriya, Ankit Shah, Isabelle Diana May-Xin Ng, Christopher M Homan, Wei Wei · PDF
  154. Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?

    Haizhong Zheng, Jiawei Zhao, Beidi Chen · PDF
  155. ProtFunAgent: Agentic LLM Cascades for Low-Resource Protein Function Gap-Filling via Homology RAG and Ontology-Constrained Decoding

    Sajib Acharjee Dip, John S. Choy, Liqing Zhang · PDF
  156. Pull Requests with Bugs: Benchmarking Model Reasoning for Code Reviews

    Laurence Liang · PDF
  157. RaanA: A Fast, Flexible, and Data-Efficient Post-Training Quantization Algorithm

    Yongyi Yang, Jianyang Gao, Wei Hu · PDF
  158. RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling

    Xiuying Wei, Anunay Yadav, Razvan Pascanu, Caglar Gulcehre · PDF
  159. Re-FORC: Adaptive Reward Prediction for Efficient Chain-of-Thought Reasoning

    Renos Zabounidis, Aditya Golatkar, Michael Kleinman, Alessandro Achille, Wei Xia, Stefano Soatto · PDF
  160. Reasoning Elicitation is Scale-Dependent

    Jake Ward · PDF
  161. Reasoning Models Better Express Their Confidence

    Dongkeun Yoon, Seungone Kim, Sohee Yang, Sunkyoung Kim, Soyeon Kim, Yongil Kim, Eunbi Choi, Yireun Kim, Minjoon Seo · PDF
  162. Reasoning Models Can Be Accurately Pruned via Chain-of-Thought Reconstruction

    Ryan Lucas, Kayhan Behdin, Zhipeng Wang, Qingquan Song, Shao Tang, Rahul Mazumder · PDF
  163. Reasoning Models Reason Inefficiently

    Dipika Khullar, Ashwinee Panda · PDF
  164. Reasoning Under Pressure: LLMs in Competitive Pokémon Battles

    Tadisetty Sai Yashwanth, Dhatri C · PDF
  165. Reasoning with Fewer Eyes: Efficient Visual Token Withdrawal for Multimodal Reasoning

    Andrea Ramazzina, Tobias Haab, David Fitzek, Stefano Gasperini, Jonas Uhrig, Mario Bijelic · PDF
  166. Reasoning-Focused Evaluation of Efficient Long-Context Inference Techniques

    Joie Zhang, Qiyao Wei, Howard Yen, Xi Ye, Danqi Chen · PDF
  167. Reasoning-Intensive Regression

    Diane Tchuindjo, Omar Khattab · PDF
  168. Reject Only Critical Tokens: Pivot-Aware Speculative Decoding

    Amir Ziashahabi, Yavuz Faruk Bakman, Duygu Nur Yaldiz, Mostafa El-Khamy, Sai Praneeth Karimireddy, Salman Avestimehr · PDF
  169. Resa: Transparent Reasoning Models via SAEs

    Shangshang Wang, Julian Asilis, Ömer Faruk Akgül, Enes Burak Bilgin, Ollie Liu, Deqing Fu, Willie Neiswanger · PDF
  170. Reuse, Don't Recompute: Efficient Large Reasoning Model Inference via Memory Orchestration

    Daivik Patel, Shrenik Patel · PDF
  171. Reversal Is Structural: Concept-Aware Post-Training Recovers Rare, Deep Mathematical Skills

    Yassir Laaouach · PDF
  172. RoiRL: Efficient, Self-Supervised Reasoning with Offline Iterative Reinforcement Learning

    Aleksei Arzhantsev, Otmane Sakhi, Flavian Vasile · PDF
  173. Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains

    Anisha Gunjal, Anthony Wang, Elaine Lau, Vaskar Nath, Yunzhong He, Bing Liu, Sean M. Hendryx · PDF
  174. Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs

    Sayan Ghosh, Shahzaib Saqib Warraich, Dhruv Tarsadiya, Gregory Yauney, Swabha Swayamdipta · PDF
  175. SATBench: Benchmarking LLMs Logical Reasoning via Automated Puzzle Generation from SAT Formulas

    Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken · PDF
  176. Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems

    Stephen Miner, Yoshiki Takashima, Simeng Han, Sam Kouteili, Ferhat Erata, Ruzica Piskac, Scott J Shapiro · PDF
  177. Scratchpad Thinking: Alternation Between Storage and Computation in Latent Reasoning Models

    Sayam Goyal, Brad Peters, María Emilia Granda, Akshath Vijayakumar Narmadha, Dharunish Yugeswardeenoo, Callum Stuart McDougall, Sean O'Brien, Ashwinee Panda, Kevin Zhu, Cole Blondin · PDF
  178. SeqFusion: Scalable Long-Context Reasoning through Parallel Fragment Fusion and Memory-Augmented Attention

    Yanxuan Yu, Dong Liu · PDF
  179. SGD-KV: Summarization Guided KV Cache Compression

    Zeyu Liu, Woomin Song, Xuandi Fu, Sai Muralidhar Jayanthi, Vivek Govindan, Aram Galstyan, Sravan Babu Bodapati, Srikanth Ronanki · PDF
  180. Short-to-Long Distillation: Learning Long-Context Policies from Short-Context Supervision

    Yuejiang Liu, Yuxi Qian, Yilun Du, Chelsea Finn · PDF
  181. SituationalPriv: A Context-Aware Framework for Privacy Detection and Protection in Vision-Language Models

    Zhaotian Weng, Haoxuan Li, Jieyu Zhao · PDF
  182. Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments

    Timothé Boulet, Xavier Hinaut, Clément Moulin-Frier · PDF
  183. SparseVILA-R1: Decoupling Visual Sparsity for Efficient VLM Reasoning

    Samir Khaki, Junxian Guo, Jiaming Tang, Shang Yang, Yukang Chen, Konstantinos N. Plataniotis, Yao Lu, Song Han, Zhijian Liu · PDF
  184. SpatialTraceGen: High-Fidelity Traces for Efficient VLM Spatial Reasoning Distillation

    Gio Huh, Dhruv Sheth, Rayhan Zirvi, Frank Xiao · PDF
  185. SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning

    Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali · PDF
  186. SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

    Chenyu Wang, Paria Rashidinejad, DiJia Su, Song Jiang, Sid Wang, Siyan Zhao, Cai Zhou, Shannon Zejiang Shen, Feiyu Chen, Tommi Jaakkola, Yuandong Tian, Bo Liu · PDF
  187. SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache

    Chi-Chih Chang, Siqi Zhu, Zhichen Zeng, Haibin Lin, Xin Liu, Jiaxuan You, Mohamed S. Abdelfattah, Ziheng Jiang, Xuehai Qian · PDF
  188. Stable Reinforcement Learning for Efficient Reasoning

    Mz Dai, Shixuan Liu, Qingyi Si · PDF
  189. Statistical Early Stopping for Reasoning Models

    Yangxinyu Xie, Tao Wang, Soham Mallick, Yan Sun, Georgy Noarov, Mengxin Yu, Tanwi Mallick, Weijie J Su, Edgar Dobriban · PDF
  190. Superposition Reasoning Model

    Zheyang Xiong, Shivam Garg, Vaishnavi Shrivastava, Haoyu Zhao, Anastasios Kyrillidis, Dimitris Papailiopoulos · PDF
  191. SwiftSolve: A Self-Iterative, Complexity-Aware Multi-Agent Framework for Competitive Programming

    Jonas Rohweder, Adhyayan Veer Singh, Aaron Shen, Brian Law, Ahmed Ismail, Sean O'Brien, Kevin Zhu · PDF
  192. Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence

    Sean Michael McLeish, Ang Li, John Kirchenbauer, Dayal Singh Kalra, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Jonas Geiping, Micah Goldblum, Tom Goldstein · PDF
  193. The Conductor and the Engine: A Path Towards Co-Designed Reasoning

    Yuanxin Wang, Pawel Filipczuk, Anisha Garg, Amaan Dhada, Mohammad Hassanpour, David Bick, Ganesh Venkatesh · PDF
  194. The Effect of Dataset Diversification on Mathematical Problem Solving Performance

    Jason Yuan · PDF
  195. The Impact of Quantization on Large Reasoning Model Reinforcement Learning

    Medha Kumar, Zifei Xu, Xin Wang, Tristan J Webb · PDF
  196. The Path Not Taken: RLVR Provably Learns Off the Principals

    Hanqing Zhu, Zhenyu Zhang, Hanxian Huang, DiJia Su, Zechun Liu, Jiawei Zhao, Igor Fedorov, Hamed Pirsiavash, Jinwon Lee, David Z. Pan, Zhangyang Wang, Yuandong Tian, Kai Sheng Tai · PDF
  197. The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute

    Aman Sharma, Paras Chopra · PDF
  198. The Virtues of Brevity: Avoid Overthinking in Parallel Test-Time Reasoning

    Raul Cavalcante Dinardi, Bruno Yamamoto, Anna Helena Reali Costa, Artur Jordao · PDF
  199. The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models

    Yuqiao Tan, Shizhu He, Kang Liu, Jun Zhao · PDF
  200. Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG

    Jihwan Bang, Juntae Lee, Seunghan Yang, Sungha Choi · PDF
  201. ThinkBrake: Mitigating Overthinking in Tool Reasoning

    Minjae Oh, Sangjun Song, Seungkyu Lee, Sungmin Jo, Yohan Jo · PDF
  202. Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data

    Zishan Ahmad, Saisubramaniam Gopalakrishnan · PDF
  203. TimeAlign: Contamination-Aware Evaluation for Resource-Constrained Foundation Models

    Jasraj Budigam · PDF
  204. To See or To Read: User Behavior Reasoning in Multimodal LLMs

    Tianning Dong, Luyi Ma, Varun Vasudevan, Jason Cho, Sushant Kumar, Kannan Achan · PDF
  205. Towards a Mechanistic Understanding of Robustness in Finetuned Reasoning Models

    Aashiq Muhamed, Xuandong Zhao, Mona T. Diab, Virginia Smith, Dawn Song · PDF
  206. Towards Label-Free Biological Reasoning Synthetic Dataset Creation via Uncertainty Filtering

    Josefa Lia Stoisser, Lawrence Phillips, Aditya Misra, Tom A. Lamb, Philip Torr, Marc Boubnovski Martell, Julien Fauqueur, Kaspar Märtens · PDF
  207. Towards Quantifying Bias in Large Language Models

    Ali Nosratifiroozsalari, Alireza Afzal Aghaei, Ronald H Davies, Rajiv Ramnath · PDF
  208. TRACE: Transparent Reasoning and Attribution Chains for Extended Multimodal Contexts

    Adithya S Kolavi · PDF
  209. Training Dynamics Impact Quantization Degradation

    Albert Catalan-Tatjer, Niccolò Ajroldi, Jonas Geiping · PDF
  210. Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing

    Xinnan Dai, Chung-Hsiang Lo, Kai Guo, Shenglai Zeng, Dongsheng Luo, Jiliang Tang · PDF
  211. Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time

    Zhenyu Zhang, Xiaoxia Wu, Zhongzhu Zhou, Qingyang Wu, Yineng Zhang, Pragaash Ponnusamy, Harikaran Subbaraj, Jue Wang, Shuaiwen Leon Song, Ben Athiwaratkun · PDF
  212. UniFormer: Unified and Efficient Transformer for Reasoning Across General and Custom Computing

    Zhuoheng Ran, Chong Wu, Renjie Xu, Maolin Che, Hong Yan · PDF
  213. Universal Properties of Activation Sparsity in Modern Large Language Models

    Filip Szatkowski, Patryk Będkowski, Alessio Devoto, Jan Dubiński, Pasquale Minervini, Mikołaj Piórczyński, Simone Scardapane, Bartosz Wójcik · PDF
  214. Verbalized Algorithms

    Supriya Lall, Christian Farrell, Hari Pathanjaly, Marko Pavic, Sarvesh Chezhian, Masataro Asai · PDF
  215. What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT

    Yunzhen Feng, Julia Kempe, Cheng Zhang, Parag Jain, Anthony Hartshorn · PDF
  216. What’s Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning

    Zhaotian Weng, Haoxuan Li, Xin Eric Wang, Kuan-Hao Huang, Jieyu Zhao · PDF
  217. When Do Symbolic Solvers Enhance Reasoning in Large Language Models?

    Zhiyuan He, Dingmin Wang · PDF
  218. When Reasoning Meets Its Laws

    Junyu Zhang, Yifan Sun, Tianang Leng, Jingyan Shen, Liu Ziyin, Paul Pu Liang, Huan Zhang · PDF
  219. When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought

    Yiyang Zhou, Haoqin Tu, Zijun Wang, Zeyu Wang, Niklas Muennighoff, Fan Nie, Chaorui Deng, Shen Yan, Haoqi Fan, Yejin Choi, James Zou, Cihang Xie, Huaxiu Yao, Qinghao Ye · PDF
  220. Where do Reasoning Models make a Difference? Follow the Reasoning Leader for Efficient Decoding

    Ming Li, Tianyi Zhou · PDF
  221. Why GRPO Needs Normalization: A Local-Curvature Perspective on Adaptive Gradients

    Cheng Ge, Caitlyn Heqi Yin, Hao Liang, Jiawei Zhang · PDF
  222. Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLM

    Feng Hong, Geng Yu, Yushi Ye, Haicheng Huang, Huangjie Zheng, Ya Zhang, Yanfeng Wang, Jiangchao Yao · PDF
  223. WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning

    Haosen Ge, Shuo Li, Lianghuan Huang · PDF