NeurIPS 2025 Past ML systems
Machine Learning for Systems 2025
MLForSys2025
- Submission deadline: Aug 30, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (41)
Fetched from OpenReview (v2) on 2026-06-10.
- A Data-driven ML Approach for Maximizing Performance in LLM-Adapter Serving
- A Joint Learning Approach to Hardware Caching and Prefetching
- Advancing Routing-Awareness in Analog ICs Floorplanning
- Adversarial Query Synthesis via Bayesian Optimization
- Agentic Bridge Framework: Closing the Gap Between Agentic Capability and Performance Benchmarks
- An Early Exploration of Deep-Learning-Driven Prefetching for Far Memory
- An Expert in Residence: LLM Agents for Always-On Operating System Tuning
- APCE: Adaptive Progressive Context Expansion for Long Context Processing
- ASAP: an Agentic Solution to Auto-optimize Performance of Large-Scale LLM Training
- Attention-Informed Surrogates for Navigating Power-Performance Trade-offs in HPC
- Automated Multi-Agent Workflows for RTL Design
- Carbon-Aware RL-LLM Control for Energy-Efficient Liquid-Cooled HPC Data Centers
- DataSwift: Smart Choices for Safe Query Optimization
- Forecasting machine degradation of GPU Clusters
- GraphFaaS: Serverless GNN Inference for Burst-Resilient, Real-Time Intrusion Detection
- How Should We Evaluate Data Deletion in Graph-Based ANN Indexes?
- InfraGym: Empowering LLM Agents for Real-World Computer System Optimization
- Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference
- Leveraging Large Language Models to Enhance Machine-Learning-Driven HPC Job Scheduling
- LLM-Box : An Agentic Framework for Guided Black-Box Optimization in Mapping LLMs onto Specialized Hardware Accelerators
- LLM-Guided Autoscheduling for Large-Scale Sparse Machine Learning
- LLMVisor: A Real-Time Latency Attribution Model for Multi-Tenant LLM Serving
- Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents
- ML-Guided Cold Plate Design and Thermal Analysis for Liquid-Cooled HPC Servers
- MoE-GPS: Guidlines for Prediction Strategy with Expert Duplication in MoE Load Balancing
- MXNorm: Reusing block scales for efficient tensor normalisation
- NetGent : Agent-Based Automation of Network Application Workflows
- NeuSym-HLS: Learning-Driven Symbolic Distillation in High-Level Synthesis of Hardware Accelerators
- Optimized Learned Count-Min Sketch
- OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
- PORT: Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
- QAQ: Query-adaptive Mixed-precision Quantization for Large Language Models
- Retrieval on Verilog Repositories: A Knowledge-Graph Based Solution
- Small Language Models as Compiler Experts: Auto-Parallelization for Heterogeneous Systems
- Small, Fast, and Certain: Developing a Specialized Verilog Code Completion Solution for the Enterprise
- Sustainable Control of Geo-Distributed Datacenters by Distilling Numerical Experts into Adaptive LLM Agents
- SwizzlePerf: Hardware-Aware LLMs for GPU Kernel Performance Optimization
- Towards Agentic OS: An LLM Agent Framework for Linux Schedulers
- Towards Automatically Optimizing Retrieval Augmented AI Systems
- Ultra-Efficient Decoding for End-to-End Neural Compression and Reconstruction
- When to Reason: Semantic Router for vLLM