NeurIPS 2025 · Past · Math & reasoning · Efficiency
NeurIPS 2025 Workshop on Efficient Reasoning
NeurIPS 2025 ER Workshop
- Submission deadline: Oct 2, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (223)
Fetched from OpenReview (v2) on 2026-06-10.
- A Cooperation Index for Model Pruning
- A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models
- Activation Steering for Chain-of-Thought Compression
- Active Inference Control: Steering, Not Just Scaling, Language Model Reasoning
- AdaptDistill: Improving Small Language Models with Skill-Aware Teaching
- AdaptInfer: Adaptive Token Pruning for Vision–Language Model Inference with Dynamical Text Guidance
- Adaptive Dual Reasoner: Large Reasoning Models Can Think Efficiently by Hybrid Reasoning
- Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models
- Agentic NL2SQL to Reduce Computational Costs
- AGENTIQL: An Agent-Inspired Multi-Expert Framework for Text-to-SQL Generation
- Amortized Latent Steering: Low-Cost Alternative to Test-Time Optimization
- An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Multimodal Reasoning Models
- Analysis of Emergence of Reasoning in Language Models: Factors, Thresholds and Interpretations
- Are We Scaling the Right Thing? A System Perspective on Test-Time Scaling
- ARM: Adaptive Reasoning Model
- Attention Guided Alignment in Efficient Vision-Language Models
- AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models
- Bayesian Social Deduction with Graph-Informed Language Models
- BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation
- Beyond Static Cutoffs: One-Shot Dynamic Thresholding for Diffusion Language Models
- Boundary Guidance for Efficient 3D CT Vision–Language Reasoning
- Breadcrumbs Reasoning: Memory-Efficient Reasoning with Compression Beacons
- Budget-aware Test-time Scaling via Discriminative Verification
- Calibrated Reasoning: An Explanatory Verifier for Dynamic and Efficient Problem-Solving
- Can Explanations Improve Recommendations? A Joint Optimization with LLM Reasoning
- CaRT: Teaching LLM Agents to Know When They Know Enough
- CATS: Category-Aware Token-level Steering for Training-Free Redundancy Reduction in Large Reasoning Models
- Causal Reflection with Language Models
- CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency
- Chopping Trees: Semantic Similarity Based Dynamic Pruning for Tree-of-Thought Reasoning
- Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner
- COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
- Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
- Compute When Worth It: Risk Control for Reasoning on a Compute Budget
- Confidence-Coverage Gating for Early Exit
- ConstrainedSQL: Training LLMs for Text2SQL via Constrained Reinforcement Learning
- Correct Reasoning Paths Visit Shared Decision Pivots
- DA-CoTD: Efficient Chain-of-Thought Reasoning with Difficulty-Aware CoT-Distillation
- DAG-Math: Graph-Guided Mathematical Reasoning in LLMs
- Data Diversification Methods In Alignment Enhance Math Performance In LLMs
- Data Scaling Isn't Enough: Towards Improving Compositional Reasoning in Video-Language Models
- Decomposing Reasoning Efficiency in Large Language Models
- Deep Think with Confidence
- Delta Activations: A Representation for Finetuned Large Language Models
- Demystifying and Enhancing the Efficiency of Interleaved Reasoning-Search LLM Agents
- Demystifying Delays in Reasoning: A Pilot Temporal and Token Analysis of Reasoning Systems
- DHP: Discrete Hierarchical Planning for HRL Agents
- DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning
- Diffusion Language Models Know the Answer Before Decoding
- DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
- Distilling Multi-modal Large Language Models for Autonomous Driving
- DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
- DMORE: Differentiable Mixture-of-Reasoning-Experts with Uncertainty-Guided Multi-Level Routing
- Don’t Let It Hallucinate: Premise Verification via Retrieval-Augmented Logical Reasoning
- DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization
- DTS: Enhancing Large Reasoning Models via Decoding Tree Sketching
- Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning
- e1: Learning Adaptive Control of Reasoning Effort
- EcoSpa: Efficient Transformer Training with Coupled Sparsity
- Efficient Long CoT Reasoning in Small Language Models
- Efficient Parallel Samplers for Recurrent-Depth Models and Their Connections to Diffusion Language Models
- Efficient Post-Training for Industry-Specialized Reasoning in Small Language Models
- Efficient Reasoning at Fixed Test-Time Cost via Length-Aware Attention Priors and Gain-Aware Training
- Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
- Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
- Efficient RL Training for Reasoning Models via Length-Aware Optimization
- Efficient Sparse Decoding for Test-Time Scaling with KV Cache Disaggregation and Asynchronism
- Efficient Test-Time Scaling via Self-Calibration
- Episode-Level Multimodal KV Caching for Embodied Question Answering
- Evaluating the Safety and Skill Reasoning of Large Reasoning Models Under Compute Constraints
- EWoRA: Expert Weighted Low-Rank Adaptation for Reasoning over Heterogeneous Data
- Extending AutoCompressors via Surprisal-Based Dynamic Segmentation
- Feature-Level Knowledge Distillation from LMM for Enhanced Image Classification
- Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
- Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection
- Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute
- Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
- From Evidence to Trajectory: Abductive Reasoning Path Synthesis for Retrieval-Augmented Generation Agents Development
- From Long to Short: LLMs Excel at Trimming Own Reasoning Chains
- FrugalRAG: Learning to retrieve and reason for multi-hop QA
- GEAR-X: Expanders for Next-Gen KV Cache Compression
- Generalized Parallel Scaling with Interdependent Generations
- Generating Domain Specific Natural Language SAT Reasoning Datasets
- Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets
- Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning
- Hierarchical Planning Agent for Web-Browsing Tasks
- Hold Onto That Thought: Assessing KV Cache Compression On Reasoning
- How Does RL Induce Skill Composition? A Case Study Using Countdown
- How Far Can SLMs Go Without 'Thinking' in the LLM-as-a-Judge Paradigm?
- How Weight Pruning Destroys Chain-of-Thought Reasoning in Language Reasoning Models: A Model Similarity and Faithfulness Correlation Analysis
- HybridCoT: Interleaving Latent and Text Chain-of-Thought for Efficient Reasoning
- Hydra: A Modular Architecture for Efficient Long-Context Reasoning
- Improving LLM Reasoning under Uncertainty with Coach-Player Multi-agent
- In Good GRACEs: Principled Teacher Selection for Knowledge Distillation
- In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
- Inference-Time Chain-of-Thought Pruning with Latent Informativeness Signals
- Influence Functions for Efficient Data Selection in Reasoning
- Information-Theoretic Bounds on Multi-Step Reasoning: When is Chain-of-Thought Provably Necessary?
- Inpainting-Guided Policy Optimization for Diffusion Large Language Models
- Instance-Adaptive Inference-Time Scaling with Calibrated Process Reward Models
- Internal Value Functions: Leveraging Hidden States for Efficient Test-Time Scaling in Large Reasoning Models
- iOS as Acceleration
- It Takes Two: Your GRPO Is Secretly DPO
- Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning
- Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents
- LayerMerge: Modality-Agnostic Depth Pruning for Efficient Foundation Model Deployment
- Learnable Adaptive KV-cache Compression
- Learning to Reason Across Parallel Samples for LLM Reasoning
- Learning to Reason via Mixture-of-Thought for Logical Reasoning
- Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
- Less is Not Worse: Effective Reasoning Without Complete Reasoning Chains
- Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains
- LOGCA: Layer-Optimized GPU-CPU Allocation for Efficient Resource Management in Large-Scale Models
- Logit–Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning
- Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
- LoRA-Guided PPO for Cost-Aware and Compute-Efficient Agent Orchestration
- LSPO: Length-aware Dynamic Sampling for Policy Optimization in LLM Reasoning
- M-GRPO: Stabilizing Self-Supervised Reinforcement Learning for Large Language Models with Momentum-Anchored Policy Optimization
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
- Mechanistic Interpretability of GPT-2: Lexical and Contextual Layers in Sentiment Analysis
- MetroRLHF: Enabling Memory-Effective Training for On-Policy RLHF via Adaptive Sequence Streaming
- Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery
- Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
- MLM: Multi-linguistic LoRA Merging
- Mode-conditioning unlocks superior test-time compute scaling
- Multi-Head Low-Rank Attention
- MultiGA: Leveraging Multi-Source Seeding in Genetic Algorithms
- Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models
- Muon: Training and Trade-offs with Latent Attention and MoE
- NaturalThoughts: Selecting and Distilling Reasoning Traces for General Reasoning Tasks
- Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
- No Question, No Passage, No Problem: Investigating Artifact Exploitation and Reasoning in Multiple-Choice Reading Comprehension
- Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models
- Not All Thoughts Matter: Selective Attention for Efficient Reasoning
- OckBench: Tokens are Not to Be Multiplied without Necessity
- Off-Trajectory Reasoning: Can LRMs Collaborate on Reasoning Trajectory?
- On the Role of Temperature Sampling in Test-Time Scaling
- On the Rollout-Training Mismatch in Modern RL Systems
- One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling
- One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning
- Optimal Self-Consistency for Efficient Reasoning with Large Language Models
- OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
- Optimizing Reasoning Efficiency through Prompt Difficulty Prediction
- ORPO-Distill: Mixed-Policy Preference Optimization for Cross-Architecture LLM Distillation
- Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
- Pay-Per-Search Models are Abstention Models
- Performative Thinking? The Brittle Correlation Between CoT Length and Problem Complexity
- PHLoRA: data-free Post-hoc Low-Rank Adapter extraction from full-rank checkpoint
- PosS: Position Specialist Generates Better Draft for Speculative Decoding
- PREMISE: Scalable and Strategic Prompt Optimization for Efficient Mathematical Reasoning in Large Reasoning Models
- Probe-Rewrite-Evaluate: A Workflow for Reliable Benchmarks and Quantifying Evaluation Awareness
- ProofSketch: Efficient Verified Reasoning for Large Language Models
- ProRefine: Inference-time Prompt Refinement with Textual Feedback
- Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?
- ProtFunAgent: Agentic LLM Cascades for Low-Resource Protein Function Gap-Filling via Homology RAG and Ontology-Constrained Decoding
- Pull Requests with Bugs: Benchmarking Model Reasoning for Code Reviews
- RaanA: A Fast, Flexible, and Data-Efficient Post-Training Quantization Algorithm
- RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling
- Re-FORC: Adaptive Reward Prediction for Efficient Chain-of-Thought Reasoning
- Reasoning Elicitation is Scale-Dependent
- Reasoning Models Better Express Their Confidence
- Reasoning Models Can Be Accurately Pruned via Chain-of-Thought Reconstruction
- Reasoning Models Reason Inefficiently
- Reasoning Under Pressure: LLMs in Competitive Pokémon Battles
- Reasoning with Fewer Eyes: Efficient Visual Token Withdrawal for Multimodal Reasoning
- Reasoning-Focused Evaluation of Efficient Long-Context Inference Techniques
- Reasoning-Intensive Regression
- Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
- Resa: Transparent Reasoning Models via SAEs
- Reuse, Don't Recompute: Efficient Large Reasoning Model Inference via Memory Orchestration
- Reversal Is Structural: Concept-Aware Post-Training Recovers Rare, Deep Mathematical Skills
- RoiRL: Efficient, Self-Supervised Reasoning with Offline Iterative Reinforcement Learning
- Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
- Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs
- SATBench: Benchmarking LLMs Logical Reasoning via Automated Puzzle Generation from SAT Formulas
- Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems
- Scratchpad Thinking: Alternation Between Storage and Computation in Latent Reasoning Models
- SeqFusion: Scalable Long-Context Reasoning through Parallel Fragment Fusion and Memory-Augmented Attention
- SGD-KV: Summarization Guided KV Cache Compression
- Short-to-Long Distillation: Learning Long-Context Policies from Short-Context Supervision
- SituationalPriv: A Context-Aware Framework for Privacy Detection and Protection in Vision-Language Models
- Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments
- SparseVILA-R1: Decoupling Visual Sparsity for Efficient VLM Reasoning
- SpatialTraceGen: High-Fidelity Traces for Efficient VLM Spatial Reasoning Distillation
- SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
- SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
- SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache
- Stable Reinforcement Learning for Efficient Reasoning
- Statistical Early Stopping for Reasoning Models
- Superposition Reasoning Model
- SwiftSolve: A Self-Iterative, Complexity-Aware Multi-Agent Framework for Competitive Programming
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
- The Conductor and the Engine: A Path Towards Co-Designed Reasoning
- The Effect of Dataset Diversification on Mathematical Problem Solving Performance
- The Impact of Quantization on Large Reasoning Model Reinforcement Learning
- The Path Not Taken: RLVR Provably Learns Off the Principals
- The Sequential Edge: Inverse-Entropy Voting Beats Parallel Self-Consistency at Matched Compute
- The Virtues of Brevity: Avoid Overthinking in Parallel Test-Time Reasoning
- The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models
- Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG
- ThinkBrake: Mitigating Overthinking in Tool Reasoning
- Thinking in Many Modes: How Composite Reasoning Elevates Large Language Model Performance with Limited Data
- TimeAlign: Contamination-Aware Evaluation for Resource-Constrained Foundation Models
- To See or To Read: User Behavior Reasoning in Multimodal LLMs
- Towards a Mechanistic Understanding of Robustness in Finetuned Reasoning Models
- Towards Label-Free Biological Reasoning Synthetic Dataset Creation via Uncertainty Filtering
- Towards Quantifying Bias in Large Language Models
- TRACE: Transparent Reasoning and Attribution Chains for Extended Multimodal Contexts
- Training Dynamics Impact Quantization Degradation
- Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing
- Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time
- UniFormer: Unified and Efficient Transformer for Reasoning Across General and Custom Computing
- Universal Properties of Activation Sparsity in Modern Large Language Models
- Verbalized Algorithms
- What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
- What’s Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
- When Do Symbolic Solvers Enhance Reasoning in Large Language Models?
- When Reasoning Meets Its Laws
- When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
- Where do Reasoning Models make a Difference? Follow the Reasoning Leader for Efficient Decoding
- Why GRPO Needs Normalization: A Local-Curvature Perspective on Adaptive Gradients
- Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLM
- WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning