ICLR 2026

Catch, Adapt, and Operate: Monitoring ML Models Under Drift Workshop

CAO

Submission deadline
Feb 11, 2026, 13:01 UTC
imported from OpenReview — check the website for extensions
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).

Accepted papers (74)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Credal-Set Perspective on Task-Induced Distributional Drift in Text Generation

    Esteban Garces Arias · PDF
  2. A Geometry-Based View of Mahalanobis OOD Detection

    Denis Janiak, Jakub Binkowski, Tomasz Jan Kajdanowicz · PDF
  3. Adaptive Quasimetric Mapping: Principled Topological Abstraction for Robust Offline Goal-Conditioned Navigation

    Anthony Kobanda, Waris Radji, Odalric-Ambrym Maillard, Rémy Portelas · PDF
  4. Approximating Function Space Distance for Continual Learning in Transformers

    Nikita Dhawan, Felix Dangel, Roger Baker Grosse · PDF
  5. Beyond Accuracy: Evaluating Visual Grounding in Multimodal Medical Reasoning

    Anas Zafar, Leema Krishna Murali, Ashish Vashist · PDF
  6. CAdam: Confidence-Based Optimization for Online Learning

    Shaowen Wang, ANAN LIU, Jian Xiao, Yuekui Yang, Huan Liu, Suncong Zheng, Wei Zhang, Cong Xu, Di Wang, Huan Yu, Jie Jiang, Jian Li · PDF
  7. Can Linear Probes Effectively Measure LLM Uncertainty?

    Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji · PDF
  8. CAO-LLM: Catching, Adapting and Operating Under Distribution Drift for Large Language Models

    Nitin Vetcha · PDF
  9. Capacity and Redundancy Trade-offs in Multi-Task Learning

    Asif Khan · PDF
  10. CATS: Conformalized Adaptive Test-Time Scaling

    Mohammad Sadegh Akhondzadeh, Soroush H. Zargarbashi, Simone Antonelli, Aleksandar Bojchevski · PDF
  11. Compress to Impress: Efficient LLM Adaptation Using a Single Gradient Step on 100 Samples

    Shiva Sreeram, Alaa Maalouf, Pratyusha Sharma, Daniela Rus · PDF
  12. Cross-Lingual Fairness Drift in LLM Moral Reasoning

    Ethan Xie, Aidan Chang-Lee, Avyukth Harish, Archana Vaidheeswaran · PDF
  13. Detecting Distributional Drift in Transformers Through Representation Dynamics

    Aakash Patil, Mrunmayee Shende · PDF
  14. DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

    Alexander Rubinstein, Benjamin Raible, Martin Gubri, Seong Joon Oh · PDF
  15. Drift ≠ Error: Reliability Analysis of Agricultural Foundation Models Under Distribution Shift

    Shayan Nejadshamsi, Vahab Khoshdel, Brock Porth, Shadi Zaki, Yuanyuan Zhang, Lysa Porth · PDF
  16. Drift-Aware Uncertainty Quantification via a Functional Spectral-Newton Method

    Thiago Ramos, Alek Fröhlich, Daniel Perazzo, Massimiliano Pontil · PDF
  17. Drift-to-Action Controllers: Budgeted Interventions with Online Risk Certificates

    Ismail Lamaakal, Chaymae Yahyati, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh · PDF
  18. Duration Aware Scheduling for ASR Serving Under Workload Drift

    Darshan Makwana, Yash Jogi, Harsh Kotta, Aayush Kubba · PDF
  19. Efficient Dataset Selection for Continual Adaptation of Generative Recommenders

    Cathy Jiao, Juan Elenter, Praveen Chandar, Bernd Huber, Joseph Cauteruccio, Todd Wasson, Timothy Christopher Heath, Chenyan Xiong, Mounia Lalmas, Paul N. Bennett · PDF
  20. Emergent Misalignment: Tracking the Emergence and Evolution of Misaligned traits throughout Model Training

    Geunwoo Park, Pranay Chauhan, Haihao Liu · PDF
  21. Evaluating Domain-Shift Generalization of Liquid Neural Networks in Autonomous Driving

    Mihaela-Larisa Clement, Mónika Farsang, Mihai-Teodor Stanusoiu, Ramin Hasani, Daniela Rus, Radu Grosu, Ezio Bartocci · PDF
  22. Evaluating Performance Drift from Model Switching in Multi-Turn LLM Systems

    Raad Khraishi, Iman Zafar, Katie Myles, Greig A Cowan · PDF
  23. Evi-BALD: Bayesian Active Learning by Disagreement via Evidential Deep Learning

    Minghao Li, Weishi Shi · PDF
  24. Explainability of predictive uncertainty models under drift in the telecom domain

    Nagesh Walchatwar, Alberto Hata, Ajay Kattepur · PDF
  25. FedAgree: Leveraging Federated Checkpoints for Label-Free OOD Evaluation via Agreement

    Giuseppe Serra, Ben Werner, Florian Buettner · PDF
  26. Hidden-Layer Self-Distillation Yields Drift-Resilient Visual Representations

    Scott C. Lowe, Anthony Fuller, Sageev Oore, Graham W. Taylor, Evan Shelhamer · PDF
  27. Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings

    Yuning Wu, Ke Wang, Devin Chen, Kai Wei · PDF
  28. Hyperspherical Filtering for Online Classification under Drift

    David Boekestijn, Mona Schirmer · PDF
  29. In-Context Adaptation

    Yongqiang Chen, Chenxi Liu, Qingyi Guo, Bo Han, Kun Zhang · PDF
  30. Layer by layer, module by module: Choose both for optimal OOD probing of ViT

    Ambroise Odonnat, Vasilii Feofanov, Laetitia Chapel, Romain Tavenard, Ievgen Redko · PDF
  31. Lifting the Veil of Non-Stationarity in Financial Market

    Vincent Fu, Xinxin Xu, Weichen Xu, Ruilong Ren, Bowen Deng, Xinyu Zhao, Jian Cao, Xixin Cao · PDF
  32. Localized Dynamics-Aware Domain Adaption for Off-Dynamics Offline Reinforcement Learning

    Zhangjie Xia, Yu Yang, Pan Xu · PDF
  33. Locally Adaptive Multi-Objective Learning

    Jivat Neet Kaur, Isaac Gibbs, Michael I. Jordan · PDF
  34. LogitScope: A Framework for Analyzing LLM Uncertainty Through Information Metrics

    Farhan Ahmed, Yuya Jeremy Ong, Chad DeLuca · PDF
  35. LookSharp: Attention Entropy Minimization for Test-Time Adaptation

    Yash Mali, Evan Shelhamer · PDF
  36. Loss Smoothing for Continual Adaptation

    Darshan Patil, Ekaterina Lobacheva, Razvan Pascanu, Sarath Chandar · PDF
  37. Manifold-Aware Temporal Domain Generalization for Large Language Models

    Yiheng YAO, Zekun Cai, Xinyuan Song, Hiroki Hill Kobayashi, Xuan Song, Ryosuke Shibasaki, Liang Zhao · PDF
  38. Measuring Control Intervention Awareness Across Frontier LLMs

    Joachim Schaeffer, Thomas Jiralerspong, Alexander Panfilov, Roland S. Zimmermann · PDF
  39. Network System Forecasting Despite Topology Shift

    Ramzi Dakhmouche, Ivan Lunati, Hossein Gorji · PDF
  40. Noise-Response Calibration: A Causal Intervention Protocol for LLM-Judges

    Maxim Khomiakov, Jes Frellsen · PDF
  41. Not All Clients Are Equal: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

    Minhyuk Seo, Taeheon Kim, Hankook Lee, Jonghyun Choi, Tinne Tuytelaars · PDF
  42. Not All Queries Need Rewriting: When Prompt-Only LLM Refinement Helps and Hurts Dense Retrieval

    Varun Kotte · PDF
  43. Noticing the Watcher: LLM Agents Can Infer CoT Monitoring from Blocking Feedback

    Thomas Jiralerspong, Flemming Kondrup, Yoshua Bengio · PDF
  44. OASIS: Online Sample Selection for Continual Instruction Tuning

    Minjae Lee, Minhyuk Seo, Tingyu Qu, Tinne Tuytelaars, Jonghyun Choi · PDF
  45. On the Identifiability of Steering Vectors in Large Language Models

    Sohan Venkatesh, Ashish Mahendran Kurapath · PDF
  46. Online Fine-Tuning of Pretrained Controllers for Autonomous Driving via Real-Time Recurrent RL

    Julian Lemmel, Felix Resch, Mónika Farsang, Ramin Hasani, Daniela Rus, Radu Grosu · PDF
  47. Out-of-Support Generalisation via Weight-Space Sequence Modelling

    Roussel Desmond Nzoyem · PDF
  48. Paranoid Monitors: How Long Context Breaks LLM Agent Supervision

    Alicia Yang, Aashiq Muhamed, Mona T. Diab, Virginia Smith · PDF
  49. PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective

    Yangyi Huang, Ruotian Peng, Zeju Qiu, Jiale Kang, Yandong Wen, Bernhard Schölkopf, Weiyang Liu · PDF
  50. Pitfalls of Unlabeled Disagreement-Based Drift Detection in Streaming Tree Ensembles

    Lara Sá Neves, Afonso Lourenço, Lizy Kurian John, Goreti Marreiros · PDF
  51. Prior Distribution and Model Confidence

    Maksim Kazanskii, Artem Kasianov · PDF
  52. Prompt-Level Drift as an Operational Monitoring Problem: Schema Failure Cliffs and Judge-Version Risk in Artifact-Grounded Evaluation

    Yuchen Zhu · PDF
  53. Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling

    Natalia Frumkin, Diana Marculescu · PDF
  54. QueST: Persistent Queries as Semantic Monitors for Drift Suppression in Long-Horizon Tracking

    Mayank Anand, Mohammad Saqlain, Kyan Mahajan, Priya Shukla, Andrew Melnik, Gora Chand Nandi · PDF
  55. RDUMB++: Drift-Aware Continual Test-Time Adaptation

    Himanshu Mishra · PDF
  56. Reasoning Is Not Free: Robust Adaptive Cost-Efficient Router for LLM-as-a-Judge

    Wenbo Zhang, Lijinghua Zhang, Liner Xiang, Hengrui Cai · PDF
  57. Reliability-Aware Environment Discovery: Leveraging Feature Entanglement for Subpopulation Robustness

    Harim Lee, Dong-Kyu Chae · PDF
  58. Rethinking Layer Relevance in Large Language Models Beyond Cosine Similarity

    Cristian Hinostroza, Rodrigo Toro Icarte, Christ Devia, Andres Carvallo De Ferari, Eugenio Herrera-Berg, Denis Parra, Jorge F Silva · PDF
  59. Right Regions, Wrong Labels: Semantic Label Flips in Segmentation under Correlation Shift

    Akshit Achara, Yovin Ransika Yahathugoda, Nick Byrne, Michela Antonelli, Esther Puyol Anton, Alexander Hammers, Andrew P. King · PDF
  60. Risk-Averse Learning with Nonstationary Distribution

    Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl Henrik Johansson, Sandra Hirche · PDF
  61. Robust LLM Performance Certification via Constrained Maximum Likelihood Estimation

    Minghe Shen, Ananth Balashankar, Adam Fisch, David Madras, Miguel R. D. Rodrigues · PDF
  62. Structured Event Logging for Tracking Model Behavior Under Distributional Drift

    Amrutha Muralidhar, Yathindra Lakkanna · PDF
  63. SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks

    Elizabeth S.Z. Tan, Adil Soubki, Miles Cranmer · PDF
  64. TamperBench: A Systematic Framework to Stress-Test LLM Safety Under Fine-Tuning and Tampering

    Saad Hossain, Tom Tseng, Punya Syon Pandey, Samanvay Vajpayee, Matthew Kowal, Nayeema Nonta, Samuel Simko, Stephen Casper, Zhijing Jin, Kellin Pelrine, Sirisha Rambhatla · PDF
  65. TamperTest: A Framework for Testing Tamper Resistance in Open-Weight LLMs

    Isabel Dahlgren, Aashiq Muhamed · PDF
  66. Test-Time Adaptation for Event Prediction via Lightweight Adapters

    Shivam Grover, Hossein Hajimirsadeghi, Zhitian Zhang, Edward J. Smith, Alexander Pashevich · PDF
  67. The Magic Correlations: Understanding Knowledge Transfer from Pretraining to Supervised Fine-Tuning

    Simin Fan, Dimitris Paparas, Natasha Noy, Binbin Xiong, Noveen Sachdeva, Berivan Isik · PDF
  68. TRUST: Trajectory-guided State-Space Temporal Test-Time Adaptation

    Fardad Dadboud, Hamid Azad, Miodrag Bolic, Iraj Mantegh · PDF
  69. Understanding Reasoning Collapse in Multi-Turn Agent Reinforcement Learning

    Zihan Wang, Chi Gui, Xing Jin, Qineng Wang, Licheng Liu, Kangrui Wang, Shiqi Chen, Linjie Li, Zhengyuan Yang, Pingyue Zhang, Yiping Lu, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li · PDF
  70. Value Drifts: Tracing Value Alignment During LLM Post-Training

    Mehar Bhatia, Shravan Nayak, Gaurav Kamath, Marius Mosbach, Karolina Stanczak, Vered Shwartz, Siva Reddy · PDF
  71. Weighted Partial Optimal Transport for Multi-Source Partial Domain Adaptation

    Jayadev Naram, Ziming Wang, Rebecka Jörnsten, Giuseppe Durisi · PDF
  72. When Drift Detectors Cry Wolf: False Alarm Rates in Continuous ML Monitoring

    Raj Shekhar Singh · PDF
  73. When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift

    Kevin Vogt-Lowell, Theodoros Tsiligkaridis, Rodney Lafuente-Mercado, Shanghua Gao, Surabhi Ghatti, Marinka Zitnik, Daniela Rus · PDF
  74. White-Box Monitoring for Personality Mirroring in Conversational AI

    Eitan Sprejer, Agustín E. Martínez-Suñé, Bruno Bianchi · PDF