Tiny Titans: The next wave of On-Device Learning for Foundational Models (TTODLer-FM)
TTODLer-FM @ ICML 2025
- Submission deadline: May 27, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (32)
Fetched from OpenReview (v2) on 2026-06-10.
- Addition is almost all you need: Compressing neural networks with double binary factorization
- Capability Transfer from Large to Small Models with Synthetically-Generated Data
- Compression of Large Language Models by Condensed Weight Representation
- DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion
- Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies
- Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search
- FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training
- FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression
- First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
- Gatekeeper: Improving Model Cascades Through Confidence Tuning
- Higher Acceptance Rates for Speculative Decoding with Randomised Drafting
- Kinetics: Rethinking Test-Time Scaling Laws
- Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning
- Lion Cub: Minimizing Communication Overhead in Distributed Lion
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning
- MatMuls are Enough for Efficient and Performant Linear-Time Attention
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement
- Overcoming label shift in targeted federated learning
- Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models
- Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning
- SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning
- Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding
- TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices
- Token-Efficient RL for LLM Reasoning
- Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers
- Towards understanding of orthogonalization in Muon
- Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights
- WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
- Zeroth-Order Optimization is Secretly Single-Step Policy Optimization
- Zoop it! Efficient Zero-Order Optimization with Output Perturbation