NeurIPS 2025 Workshop: Reliable ML from Unreliable Data

Submission deadline: Aug 30, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (149)

Fetched from OpenReview (v2) on 2026-06-10.

  1. $\texttt{strategic-fl-sim}$: An Extensible Package for Simulating Strategic Behavior in Federated Learning

    Dimitar A. Chakarov, Nikola Konstantinov · PDF
  2. A Few Bad Neurons: Isolating and Surgically Correcting Sycophancy

    Claire O'Brien, Jessica Seto, Dristi Roy, Aditya Dwivedi, Sunishchal Dev, Kevin Zhu, Sean O'Brien, Ryan Lagasse · PDF
  3. A Guide to Robust Generalization: The Impact of Architecture, Pre-training, and Optimization Strategy

    Maxime Heuillet, Rishika Bhagwatkar, Jonas Ngnawe, Yann Pequignot, Alexandre Larouche, Christian Gagné, Irina Rish, Ola Ahmad, Audrey Durand · PDF
  4. A Multi-Method Interpretability Framework for Probing Cognitive Processing in Deep Neural Networks across Vision and Biomedical Domains

    Harshini Suresha, Kavitha S H · PDF
  5. Active Slice Discovery in Large Language Models

    Minhui Zhang, Prahar Injer, Yoav Wald, Elliot Creager · PDF
  6. Adaptive Norm Selection Prevents Catastrophic Overfitting in Fast Adversarial Training

    Fares B. Mehouachi, Saif Jabari · PDF
  7. Adversarial Attacks against Context-dependent Visual Association in Referring Multi-Object Tracking Systems

    Halima Bouzidi, Haoyu Liu, Mohammad Al Faruque · PDF
  8. Adversarially-robust probes for Deep Networks

    Simran Ketha, Nuthan Mummani, Niranjan Rajesh, Venkatakrishnan Ramaswamy · PDF
  9. Aggregated Individual Reporting for Post-Deployment Evaluation: Mechanism Design & Modeling Considerations

    Jessica Dai, Nika Haghtalab, Jamie Heather Morgenstern · PDF
  10. Ambient Diffusion Omni

    Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans, Antonio Torralba, Constantinos Costis Daskalakis · PDF
  11. Ambient Proteins: Training Diffusion Models on Low Quality Structures

    Giannis Daras, Jeffrey Ouyang-Zhang, Krithika Ravishankar, William Daspit, Constantinos Costis Daskalakis, Qiang Liu, Adam Klivans, Daniel Jesus Diaz · PDF
  12. An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation

    Uzair Akbar, Niki Kilbertus, Hao Shen, Krikamol Muandet, Bo Dai · PDF
  13. Approximate Leave-One-Out Cross Validation for Robust Scatter Matrix Estimation

    Karim Abou-Moustafa · PDF
  14. Approximating Human Preferences Using a Multi-Judge Learned System

    Fernando Avalos, Eitan Sprejer, Augusto Mariano Bernardi, José Pedro Brito de Azevedo Faustino, Jacob Haimes, Narmeen Fatimah Oozeer · PDF
  15. AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin

    Shuo Yang, Qihui Zhang, Yuyang Liu, Yue Huang, Xiaojun Jia, Kun-Peng Ning, Jia-Yu Yao, Jigang Wang, Dai Hailiang, Yibing Song, Li Yuan · PDF
  16. Automated Generation of Multilingual Jailbreak Prompts

    Jonathan Ding, Will Cai, Khanak Jain, Dhruv Nair, Aditya Naha, Kevin Zhu, Vasu Sharma · PDF
  17. Batch-Adaptive Annotations for Causal Inference with Complex-Embedded Outcomes

    Ezinne Nwankwo, Lauri Goldkind, Angela Zhou · PDF
  18. Bayesian Decision Making around Experts

    Daniel Jarne Ornia, Joel Dyer, Nicholas George Bishop, Anisoara Calinescu, Michael J. Wooldridge · PDF
  19. Better Data for Satellite Super Resolution

    Miguel Castells, Jules Salzinger, Oliver Zendel · PDF
  20. Beyond Per-Question Privacy: Multi-Query Differential Privacy for RAG Systems

    Ruihan Wu, Erchi Wang, Yu-Xiang Wang · PDF
  21. Beyond Static Bias: Quantifying Fairness Variability in CheXpert

    Ines Ayed, Gabriel Moyà Alcover, Fernando Alonso-Fernandez, Antoni Jaume-i-Capó · PDF
  22. Beyond Text: Multimodal Jailbreaking of Vision-Language and Audio Models through Perceptually-Aware Transformations

    Divyanshu Kumar, Shreyas Jena, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi · PDF
  23. Breaking Bad: Exploring the Dangers of LLM-generated Misinformation from Fringe Social Media

    Han Kyul Kim, Hansea Kim, Eunjeong Joo, Andy Skumanich · PDF
  24. Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

    Dani Roytburg, Matthew Nguyen, Matthew Bozoukov, Hongyu Fu, Jou Barzdukas, Narmeen Fatimah Oozeer · PDF
  25. BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection

    Yihan Wang, Yiwei Lu, Xiao-Shan Gao, Gautam Kamath, Yaoliang Yu · PDF
  26. Certified Adversarial Robustness via Mixture-of-Gaussians Randomized Smoothing

    Vaughn Rostermundt, Brendon G. Anderson · PDF
  27. Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety

    Vamshi Krishna Bonagiri, Ponnurangam Kumaraguru, Khanh Xuan Nguyen, Benjamin Plaut · PDF
  28. Clean-Label Physical Backdoor Attacks with Data Distillation

    Thinh Dao, Khoa D Doan, Kok-Seng Wong · PDF
  29. COIR: Chain-of-Intention Reasoning Elicits Defense in Multimodal Large Language Models

    Gyuwon Choi, Donggon Jang, Daeshik Kim · PDF
  30. Complementing Self-Consistency with Cross-Model Disagreement for Uncertainty Quantification

    Kimia Hamidieh, Veronika Thost, Walter Gerych, Mikhail Yurochkin, Marzyeh Ghassemi · PDF
  31. Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks

    Ayushi Mehrotra, Derek Peng, Dipkamal Bhusal, Nidhi Rastogi · PDF
  32. Conformal Prediction for Molecular Properties under Label Shift

    Hyeonsu Lee, Juyeon Kim, Erkhembayar Jadamba, Seungjin Choi, Hyunjin Shin · PDF
  33. Corruption-Tolerant Asynchronous Q-Learning with Near-Optimal Rates

    Sreejeet Maity, Aritra Mitra · PDF
  34. Cost Efficient Fairness Audit Under Partial Feedback

    Nirjhar Das, Mohit Sharma, Praharsh Nanavati, Kirankumar Shiragur, Amit Deshpande · PDF
  35. CroPA++: Exposing Vulnerabilities in Vision Language Models and Enhancing Adversarial Transferability of Cross-Prompt Attacks

    Agam Pandey, Amritanshu Tiwari, Atharv Mittal, Sukrit Jindal, Swadesh Swain · PDF
  36. Cross-Lingual Multimodal Retrieval-Augmented Generation for Open Question Answering in Tamil and Yoruba

    Kiran Raja, Mobareji Abejide, Arya Ram, Utkarsh Sharma, Benjamin Liu, Kevin Zhu · PDF
  37. Curvature Tuning: Provable Training-free Model Steering From a Single Parameter

    Leyang Hu, Matteo Gamba, Randall Balestriero · PDF
  38. Data Decomposition beyond Splitting for Causal Estimation

    Xuelin Yang, Dhruv Singal, Rina Friedberg, Michael I. Jordan, Niloy Biswas · PDF
  39. Data-Efficient and Robust Coreset Selection via Sparse Adversarial Perturbations

    Tushar Shinde, Manasa Madabhushi · PDF
  40. Deep Research Brings Deeper Harm

    Shuo Chen, Zonggen Li, Zhen Han, Bailan He, Tong Liu, Haokun Chen, Georg Groh, Philip Torr, Volker Tresp, Jindong Gu · PDF
  41. Diffusion-supplemented Implicit Layers: Operator Smoothing for better Implicit Solvers

    Dinislam Gabitov, Bader Rasheed, Anastasia Antsiferova, Dmitriy S. Vatolin · PDF
  42. Disarming Strategic Text: Span-Aware Counterfactuals for Robust Content Moderation

    Hardik Meisheri, Muhammad Zaid Hassan, Swati Tiwari, Puneet Mangla, Samarth Bharadwaj, Karthik Sankaranarayanan, Amit S · PDF
  43. Ditch the Denoiser: Emergence of Noise Robustness in Self-Supervised Learning from Data Curriculum

    Wenquan Lu, Jiaqi Zhang, Hugues Van Assel, Randall Balestriero · PDF
  44. Do Internal Layers of LLMs Reveal Patterns for Jailbreak Detection?

    Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis · PDF
  45. Domain Generalization: A Tale of Two ERMs

    Yilun Zhu, Naihao Deng, Naichen Shi, Aditya Gangrade, Clayton Scott · PDF
  46. Don’t Make It Up: Preserving Ignorance Awareness in LLM Fine-Tuning

    William F. Shen, Xinchi Qiu, Nicola Cancedda, Nicholas D. Lane · PDF
  47. Double Machine Learning Evaluation Under Distribution Shift and Selection Bias

    Annie S Ulichney, Amanda Lee Coston · PDF
  48. Drawing Reliable Conclusions with Imperfect Synthetic Data

    Yewon Byun, Shantanu Gupta, Zachary Chase Lipton, Rachel Leah Childers, Bryan Wilder · PDF
  49. DynamiX: Dynamic Resource eXploration for Personalized Ad-Recommendations

    Adam Holeman, Sohini Roychowdhury, Mohammad Amin, Feng Wei, Bhaskar Mehta, Sri Reddy · PDF
  50. Efficiently Robust In-Context Reinforcement Learning with Adversarial Generalization and Adaptation

    Juncheng Dong, Hao-Lun Hsu, Miroslav Pajic, Vahid Tarokh · PDF
  51. Energy-Shaped Manifold Projections Enable Adversarial Detection

    Artem Matevosian, Bader Rasheed · PDF
  52. ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

    Haziq Mohammad Khalid, Athikash Jeyaganthan, Timothy Do, Yicheng Fu, Vasu Sharma, Sean O'Brien, Kevin Zhu · PDF
  53. Evaluating robustness of tabular models under meta-features based shifts

    Irina Deeva, Nargiza Amerkhanova, Alena Kropacheva · PDF
  54. Evaluating the Quality of AI-Generated Resolutions from Conversational vs Structured Sources: Implications for Enterprise Knowledge Automation

    Archan Dutta, Vinay Raj Sisodiya, Hardik Airen, Phani Nivarthi · PDF
  55. Extracting Latent Generalization from Models Trained with Noisy Labels

    Simran Ketha, Venkatakrishnan Ramaswamy · PDF
  56. Failure Prediction Is a Better Performance Proxy for Early-Exit Networks Than Calibration

    Piotr Kubaty, Filip Szatkowski, Metod Jazbec, Bartosz Wójcik · PDF
  57. FairContrast: Enhancing Fairness through Contrastive learning and Customized Augmenting Methods on Tabular Data

    Aida Tayebi, Ali Khodabandeh Yalabadi, Mehdi Yazdani-Jahromi, Ozlem Garibay · PDF
  58. Fairness Implications of GNN-to-MLP Knowledge Distillation

    Margaret Capetz, Yizhou Sun, Arjun Subramonian · PDF
  59. Fairness Through Independence via Cramér-von Mises Regularization

    Albert Gimó Contreras, Mariia Vladimirova, Federico Pavone, Reda Chhaibi · PDF
  60. False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize

    Cheng Wang, Zeming Wei, Qin Liu, Wenxuan Zhou, Muhao Chen · PDF
  61. FAVAE-Effective Frequency Aware Latent Tokenizer

    Tejaswini Medi, Hsien-Yi Wang, Arianna Rampini, Margret Keuper · PDF
  62. Few-Shot Knowledge Distillation for Language Models via Counterfactual Explanations

    Faisal Hamman, Pasan Dissanayake, Yanjun Fu, Sanghamitra Dutta · PDF
  63. Fine-Grained Uncertainty Decomposition in Large Language Models: A Spectral Approach

    Nassim Walha, Sebastian G. Gruber, Thomas Decker, Yinchong Yang, Alireza Javanmardi, Eyke Hüllermeier, Florian Buettner · PDF
  64. Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

    Lama Alssum, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Juan C Leon Alcazar, Bernard Ghanem · PDF
  65. From Clutter to Clarity: Visual Recognition through Foveated Object-Centric Learning (FocL)

    Amitangshu Mukherjee, Deepak Ravikumar, Kaushik Roy · PDF
  66. From Evidence to Knowledge: A Hierarchical Probabilistic Model of the Scientific Knowledge Landscape at Web Scale

    Yaniv Slor Futterman, Binyamin Perets, Mark Kozdoba, Shie Mannor · PDF
  67. From Many Voices to One: A Statistically Principled Aggregation of LLM Judges

    Jitian Zhao, Changho Shin, Tzu-Heng Huang, Satya Sai Srinath Namburi GNVV, Frederic Sala · PDF
  68. From Search to Decision: A Framework for Adversarially Robust Approximate Nearest Neighbor Search

    Alexandr Andoni, Themistoklis Haris, Esty Kelman, Krzysztof Onak · PDF
  69. From Semantics to Symbols: A Two-Stage Framework for Deconstructing LLM Reasoning into Concepts and Rules

    Yanchen Yin · PDF
  70. Generalizing Robustness from $\ell_p$ to Unforeseen Attack via Calibrated Adversarial Sampling

    Rui Wang, Zeming Wei, Xiyue Zhang, Meng Sun · PDF
  71. GUARD: Guiding Unbiased Alignment through Reward Debiasing

    Advay Samnerkar, Doelle Bhattacharya, Kailash Ranganathan, Kevin Zhu, Ashwinee Panda · PDF
  72. Human Uncertainty-Aware Reliable Data Selection and Efficient Annotation for Visual Question Answering

    Jian Lan, Zhicheng Liu, Thomas Seidl · PDF
  73. Improving Consistency in Retrieval-Augmented Systems with Group Similarity Rewards

    Faisal Hamman, Chenyang Zhu, Anoop Kumar, Xujun Peng, Sanghamitra Dutta, Daben Liu, Alfy Samuel · PDF
  74. Inducing Uncertainty on Open-Weight Models for Test-Time Privacy in Image Recognition

    Muhammad H. Ashiq, Peter Triantafillou, Hung Yun Tseng, Grigorios Chrysos · PDF
  75. Influence Functions for Preference Dataset Pruning

    Daniel Fein, Gabriela Aránguiz Dias · PDF
  76. Information-Theoretic Conditions for Chain-of-Thought Monitorability and Methods for Improving It

    Usman Anwar, Tim Bakker, Cristina Pinneri, Dana Kianfar, Christos Louizos · PDF
  77. Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models

    Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Sangwu Park, Kibum Kim, Chanyoung Park · PDF
  78. It is Hard to Unlearn Dogged Backdoor Samples in Diffusion Models

    An Huang, Zuobin Xiong, Muchao Ye, Junggab Son · PDF
  79. KAIROS: Scalable Model-Agnostic Data Valuation

    Jiongli Zhu, Parjanya Prajakta Prashant, Alex Cloninger, Babak Salimi · PDF
  80. Keep It Real: Challenges in Attacking Compression-Based Adversarial Purification

    Samuel Räber, Till Aczel, Andreas Plesner, Roger Wattenhofer · PDF
  81. Learning reliably under adversarial attacks, distribution shifts and strategic agents

    Maria Florina Balcan, Dravyansh Sharma · PDF
  82. Lightweight Robust Direct Preference Optimization

    Cheol Woo Kim, Shresth Verma, Mauricio Tec, Milind Tambe · PDF
  83. LoCaTE: A Local and Training Dynamics Perspective at Detecting Label Noise in Deep Classification

    A. Anas Chentouf, Haoran Zhang, Marzyeh Ghassemi · PDF
  84. Locks Tested Without Burglars: Using Coding Assistants to Break Prompt Injection Defenses

    Atharv Singh Patlan, Pramod Viswanath, Prateek Mittal · PDF
  85. Minimal Repairs for Learning Over Incomplete Data

    Cheng Zhen, Nischal Aryal, Arash Termehchy, Prayoga, Garrett Biwer · PDF
  86. MPSelectTune: Prompt-type Selection for Fine-tuning improves Concept Unlearning in LLMs

    Shubhadip Nag, Srinjoy Das, Agniva Saha, Anushree Ghosh, Soumi Das, Tarun Kumar, Suparna Bhattacharya, Sourangshu Bhattacharya · PDF
  87. Near-Optimal Reinforcement Learning for Linear Distributionally Robust Markov Decision Processes

    Zhishuai Liu, Weixin Wang, Pan Xu · PDF
  88. Not All Samples Are Equal: Quantifying Instance-level Difficulty in Targeted Data Poisoning

    William Xu, Yiwei Lu, Yihan Wang, Matthew Y. R. Yang, Zuoqiu Liu, Gautam Kamath, Yaoliang Yu · PDF
  89. Not All Splits Are Equal: Rethinking Attribute Generalization Across Unrelated Categories

    Firca Liviu Nicolae, Elena Burceanu, Antonio Barbalau, Dan Oneata · PDF
  90. Obscurable Fishermen

    Ekaterina Fedorova, Chara Podimata, Constantinos Costis Daskalakis · PDF
  91. On Fairness of Task Arithmetic: The Role of Task Vectors

    Laura Gomezjurado Gonzalez, Hiroki Naganuma, Kotaro Yoshida, Takafumi Horie, Yuji Naraki, Ryotaro Shimizu · PDF
  92. On the Interaction of Compressibility and Adversarial Robustness

    Melih Barsbey, Antônio H. Ribeiro, Umut Simsekli, Tolga Birdal · PDF
  93. Optimal Fair Learning Robust to Adversarial Distribution Shift

    Sushant Agarwal, Amit Deshpande, Rajmohan Rajaraman, Ravi Sundaram · PDF
  94. Optimal Lower Bounds and New Upper Bounds for Sequential Prediction with Abstention

    Ezra Edelman, Surbhi Goel · PDF
  95. Persistent and Stealthy Backdoor Attacks in Federated Learning via Layerwise Model Poisoning

    Nader Bouacida, Jayneel Vora, Prasant Mohapatra · PDF
  96. Positive-Unlabeled Learning for Control Group Construction in Observational Causal Inference

    Ilias Tsoumas, Dimitrios Bormpoudakis, Vasileios Sitokonstantinou, Athanasios Askitopoulos, Andreas Kalogeras, Charalampos (Haris) Kontoes, Ioannis N. Athanasiadis · PDF
  97. Probabilistic Framework for Robustness of Counterfactual Explanations Under Data Shifts

    Xuan Zhao, Lena Krieger, Zhuo Cao, Arya Bangun, Hanno Scharr, Ira Assent · PDF
  98. Quantifying CBRN Risk in Frontier Models

    Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi · PDF
  99. Reasoning as an Adaptive Defense for Safety

    Taeyoun Kim, Fahim Tajwar, Aditi Raghunathan, Aviral Kumar · PDF
  100. Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding

    Marlies Hafer, Alexander Marx · PDF
  101. Regularized Robustly Reliable Learners and Instance Targeted Attacks

    Avrim Blum, Donya Saless · PDF
  102. Reliable Active Learning from Unreliable Labels via Neural Collapse Geometry

    Atharv Goel, Sharat Agarwal, Saket Anand, Chetan Arora · PDF
  103. Reliable Compositional Editing with Overlap-Aware Attention in Diffusion Models

    Salamata Konate, Hassan Hamidi, Elham Dolatabadi, Frank Rudzicz, Laleh Seyyed-Kalantari · PDF
  104. Reliable Models via Responsiveness Verification

    Meredith Stewart, Seung Hyun Cheon, Bogdan Kulynych, Tsui-Wei Weng, Berk Ustun · PDF
  105. Reliable Unlearning Harmful Information in LLMs with Metamorphosis Representation Projection

    Chengcan Wu, Zeming Wei, Huanran Chen, Yinpeng Dong, Meng Sun · PDF
  106. Responsible Imputation of User Behavior Surveys via Mask-Aware Transformers

    Aman Shukla, Rishabh Kumar, Daniel Patrick Scantlebury · PDF
  107. Rethinking Sparse Autoencoders: Select-and-Project for Fairness and Control from Encoder Features Alone

    Antonio Barbalau, Cristian Daniel Paduraru, Teodor Poncu, Alexandru Tifrea, Elena Burceanu · PDF
  108. Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning

    Ruhan Wang, Yu Yang, Zhishuai Liu, Dongruo Zhou, Pan Xu · PDF
  109. Reweighted Flow Matching via Unbalanced Optimal Transport for Long-tailed Generation

    Hyunsoo Song, Minjung Gim, Jaewoong Choi · PDF
  110. RL-Guided Data Selection for Language Model Finetuning

    Animesh Jha, Ananjan Nandi, Harshit Gupta · PDF
  111. Robust Adversarial Reinforcement Learning in Stochastic Games via Sequence Modeling

    Xiaohang Tang, Zhuowen Cheng, Satyabrat Kumar · PDF
  112. Robust Federated Learning under Heterogeneous Data with Generalized Heavy-Ball Momentum

    Riccardo Zaccone, Sai Praneeth Karimireddy, Carlo Masone, Marco Ciccone · PDF
  113. Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

    Jonas Ngnawe, Maxime Heuillet, Sabyasachi Sahoo, Yann Pequignot, Frederic Precioso, Christian Gagné · PDF
  114. Robust Multi-task Modeling for Bayesian Optimization via In-Context Learning

    Yucen Lily Li, Samuel Daulton, Samuel Müller, Andrew Gordon Wilson, Eytan Bakshy · PDF
  115. Safety by Design: High-Probability Constrained Contextual Bandits

    Spyros Dragazis, Aldo Pacchiano · PDF
  116. SAGE: Streaming, Agreement-driven Gradient Sketches for Representative Subset Selection

    Ashish Jha, Salman Ahmadi-Asl · PDF
  117. Sandbagging in a Simple Survival Bandit Problem

    Joel Dyer, Daniel Jarne Ornia, Nicholas George Bishop, Anisoara Calinescu, Michael J. Wooldridge · PDF
  118. Selective Cost-Aware Random Forests for Unreliable Data

    Sarwesh Rauniyar · PDF
  119. Selective Preference Aggregation

    Shreyas Kadekodi, Hayden McTavish, Berk Ustun · PDF
  120. SIVA: Self-Improving Vulnerability Agent

    Valentin Walischewski, Giulio Zizzo, Kevin N. Webster · PDF
  121. Sparse Parameter Adaptation for Fair Model Transfer Across Domains

    Sina Baharlouei, Minoo Ahmadi · PDF
  122. Spectral Regularization as a Safety-Critical Inductive Bias

    Shivam Dubey · PDF
  123. StealthEval: A Probe-Rewrite-Evaluate Workflow for Reliable Benchmarks

    Lang Xiong, Nishant Bhargava, Jeremy Chang, Jianhang Hong, Haihao Liu, Kevin Zhu · PDF
  124. Strategic Feature Selection

    Jivat Neet Kaur, Divya M Shanmugam, Emma Pierson, Michael I. Jordan, Nika Haghtalab, Ahmed Alaa, Serena Lutong Wang · PDF
  125. Stress-Testing Byzantine Defenses under Data Heterogeneity

    Latifa Errami, Hajar El Hammouti, El Houcine Bergou · PDF
  126. Stylistic Shifts in Human–LLM Conversations: Challenges and Adaptation

    Fulei Zhang, Zhou Yu · PDF
  127. Tackling the Noisy Elephant in the Room: Label Noise-robust Out-of-Distribution Detection via Loss Correction and Low-rank Decomposition

    Tarhib Al Azad, Shahana Ibrahim · PDF
  128. Taming the Noisy Oracle: Robust Entity-Centric Question Answering via Learning from Imperfect Feedback

    Binyamin Perets, Zohar Shnaider, Dvir Aran, Shie Mannor · PDF
  129. Task Priors: Enhancing Model Evaluation by Considering the Entire Space of Downstream Tasks

    Niket Patel, Randall Balestriero · PDF
  130. Teaming LLMs to Detect and Mitigate Hallucinations

    Demian Till, John Gordon Smeaton, Peter Haubrick, Mohammed Gouse Subhan Saheb, Florian Graef, David Berman · PDF
  131. Temp-SCONE: A Novel Out-of-Distribution Detection and Domain Generalization Framework for Wild Data with Temporal Shift

    Aditi Naiknaware, Sanchit Singh, Hajar Homayouni, Salimeh Sekeh · PDF
  132. Testing Noise Assumptions of Learning Algorithms

    Surbhi Goel, Adam Klivans, Konstantinos Stavropoulos, Arsen Vasilyan · PDF
  133. Text-Guided Data Attribution: Attributing the Influence of Simplicity Bias to Dataset

    Kumar Shubham, Pranav Sastry, Prathosh AP · PDF
  134. The Impact of Training Data on Adversarial Robustness

    Marco Zimmerli, Andreas Plesner, Till Aczel, Roger Wattenhofer · PDF
  135. The Silent Judge: Unacknowledged Shortcut Bias in LLM-as-a-Judge

    Arash Marioriyad, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah · PDF
  136. The Statistical Fairness-Accuracy Frontier

    Alireza Fallah, Michael I. Jordan, Annie S Ulichney · PDF
  137. Towards Context-Aware Domain Generalization: Understanding the Benefits and Limits of Marginal Transfer Learning

    Jens Müller, Lars Kühmichel, Martin Rohbeck, Stefan T. Radev, Ullrich Koethe · PDF
  138. Towards Trustworthy Amortized Bayesian Model Comparison

    Šimon Kucharský, Aayush Mishra, Daniel Habermann, Stefan T. Radev, Paul-Christian Bürkner · PDF
  139. Trust, But Attribute: Tracing Impact of Data on Trustworthiness in Supervised LLM Fine-Tuning

    Kumar Shubham, Nishant Sharma, Karn Tiwari, Prathosh AP · PDF
  140. Uncertainty as Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering

    Yavuz Faruk Bakman, Zhiqi Huang, Chenyang Zhu, Anoop Kumar, Alfy Samuel, Daben Liu · PDF
  141. Uncertainty-Aware LLMs Fail to Flag Misleading Contexts

    Tianyi Zhou, Johanne Medina, Sanjay Chawla · PDF
  142. Unlocking Transfer Learning for Open-World Few-Shot Recognition

    Byeonggeun Kim, Juntae Lee, Kyuhong Shim, Simyung Chang · PDF
  143. Unspoken Hints: Accuracy Without Acknowledgement in LLM Reasoning

    Arash Marioriyad, Shaygan Adim, Nima Alighardashi, Mahdieh Soleymani Baghshah, Mohammad Hossein Rohban · PDF
  144. WASP: A Weight-Space Approach to Detecting Learned Spuriousness

    Cristian Daniel Paduraru, Antonio Barbalau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu · PDF
  145. Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs

    Ziqian Zhong, Aditi Raghunathan · PDF
  146. When "Competency" in Reasoning Opens the Door to Vulnerability: Jailbreaking LLMs via Novel Ciphers

    Divij Handa, Zehua Zhang, Amir Saeidi, Shrinidhi Kumbhar, Md Nayem Uddin, Aswin RRV, Chitta Baral · PDF
  147. Why is Your Language Model a Poor Implicit Reward Model?

    Noam Razin, Yong Lin, Jiarui Yao, Sanjeev Arora · PDF
  148. Wrong Model, Right Uncertainty: Spatial Associations for Discrete Data with Misspecification

    David R. Burt, Renato Berlinghieri, Tamara Broderick · PDF
  149. Zero-Shot Robustness of Vision Language Models Via Confidence-Aware Weighting

    Nikoo Naghavian, Mostafa Tavassolipour · PDF