ICLR 2026 · Past · Agents · Safety & alignment

ICLR 2026 Workshop on Lifelong Agents: Learning, Aligning, Evolving

LLA 2026

Submission deadline: Feb 16, 2026, 23:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (104)

Fetched from OpenReview (v2) on 2026-06-10.

  1. AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization

    Genghan Zhang, Shaowei Zhu, Anjiang Wei, Zhenyu Song, Allen Nie, Zhen Jia, Nandita Vijaykumar, Yida Wang, Kunle Olukotun · PDF
  2. ACON: Optimizing Context Compression for Long-horizon LLM Agents

    Minki Kang, Wei-Ning Chen, Dongge Han, Huseyin A Inan, Lukas Wutschitz, Yanzhi Chen, Robert Sim, Saravan Rajmohan · PDF
  3. Actor-Curator: A Scalable RL Post-training Framework with Co-adaptive Curricula

    Zhengyao Gu, Jonathan Light, Raul Astudillo, Ziyu Ye, Langzhou He, Wei Cheng, Santiago Paternain, Philip S. Yu, Yisong Yue · PDF
  4. Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

    Jiaqi Liu, Kaiwen Xiong, Peng Xia, Yiyang Zhou, Haonian Ji, Lu Feng, Siwei Han, Mingyu Ding, Huaxiu Yao · PDF
  5. Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

    Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, Huaxiu Yao · PDF
  6. AgentGym-RL: An Open-Source Framework to Train LLM Agents for Long-Horizon Decision Making via Multi-Turn RL

    Zhiheng Xi, Jixuan Huang, Chenyang Liao, Baodai Huang, Jiaqi Liu, Honglin Guo, yajie yang, Rui Zheng, Junjie Ye, Jiazheng Zhang, Wenxiang Chen, Wei He, Yiwen Ding, Guanyu Li, Zehui Chen, Zhengyin Du, Xuesong Yao, Yufei Xu, Jiecao Chen, Tao Gui, Zuxuan Wu, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang · PDF
  7. Agentic Cognitive Profiling: Realigning Automated Alzheimer’s Disease Detection with Clinical Construct Validity

    Jiawen Kang, Kun LI, Dongrui Han, Jinchao Li, Junan Li, Lingwei Meng, Xixin Wu, Helen M. Meng · PDF
  8. Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

    Qizheng Zhang, Changran Hu, Shubhangi Upasani, Boyuan Ma, Fenglu Hong, Vamsidhar Kamanuru, Jay Rainton, Chen Wu, Mengmeng Ji, Hanchen Li, Urmish Thakker, James Zou, Kunle Olukotun · PDF
  9. AIF-GEN: Open-Source Platform and Synthetic Dataset Suite for Reinforcement Learning on Large Language Models

    Shahrad Mohammadzadeh, Jacob Chmura, Ivan Anokhin, Jacob-Junqi Tian, Mandana Samiei, Taz Scott-Talib, Irina Rish, Doina Precup, Reihaneh Rabbany, Nishanth Anand · PDF
  10. Aligning LLMs Toward Multi-Turn Conversational Outcomes Using Iterative RLHF

    Daniel Jiang, Ankur Samanta, Yukai Yang, Jalaj Bhandari, Rémi Munos, Tyler Lu · PDF
  11. Alignment Propagation: Spreading Cooperative Behaviors in Multi-Agent Systems through Seed Agents

    Asuka Yuxi Zheng, Nicole Hsing, Yi Zhao, Haoqin Tu, Jen-tse Huang · PDF
  12. AlphaApollo: A System for Deep Agentic Reasoning

    Zhanke Zhou, Chentao Cao, Xiao Feng, Xuan Li, Zongze Li, Xiangyu Lu, Jiangchao Yao, Weikai Huang, Tian Cheng, Jianghangfan Zhang, Tangyu Jiang, Linrui Xu, Yiming Zheng, Brando Miranda, Tongliang Liu, Sanmi Koyejo, Masashi Sugiyama, Bo Han · PDF
  13. Asymmetric Goal Drift in Coding Agents Under Value Conflict

    Magnus Saebo, Spencer Gibson, Tyler Crosse, Achyutha Menon, Eyon Jang, Diogo Cruz · PDF
  14. Benchmarking Continual Agent Memory for Online Learning, Transfer, and Forgetting

    Zihang Ma, Jinyi Liu, Hongyao Tang, Yi Ma, Ruitao Wang, Yifu Yuan, YAN ZHENG, Jianye HAO · PDF
  15. Beyond Reward Maximization: Evaluating the Diversity of Trajectories in Reinforcement Learning with Temporal Vendi Score

    Stanic Tom, Marco Jiralerspong, Zhang Xiaofeng, Danilo Vucetic, Gauthier Gidel · PDF
  16. Beyond Syntax: Action Semantics Learning for App Agents

    Dezhao Luo, Bohan Tang, Jianheng Liu, Jingxuan Chen, Shaogang Gong, Jianye HAO, Jun Wang, Kun Shao · PDF
  17. BioProAgent: Neuro-Symbolic Grounding for Constrained Scientific Planning

    Yuyang Liu, Jingya Wang, Liuzhenghao Lv, Yonghong Tian · PDF
  18. BrowseConf: Confidence-Guided Test-Time Scaling for Web Agents

    Litu Ou, Kuan Li, Huifeng Yin, Liwen Zhang, Zhongwang Zhang, Xixi Wu, Rui Ye, Zile Qiao, Pengjun Xie, Jingren Zhou, Yong Jiang · PDF
  19. Can We Predict Before Executing Machine Learning Agents?

    Jingsheng Zheng, Jintian Zhang, Yujie Luo, Yuren Mao, Yunjun Gao, Lun Du, Huajun Chen, Ningyu Zhang · PDF
  20. CAP: A Scalable Benchmark for Evaluating Cross-Site Browser Agents with Complex Actions and Perception

    XuZejun, Taiyi Chen, Jin Li, Yongtong Gu, QiCheng, Lvaixuan, Zhu shuai, ZhuPengfei, Kaichen Yang, Sun Boyu, YixianYang, Mulong Xie, Xiaoteng Ma, Hongru WANG · PDF
  21. CF-Router: Closed-Form Solution for Expert Selection in Multimodal Agent Lifelong Learning

    Jiaxu Li, Zhijie Zheng, Jianyu Qi, Rongchang Zhao · PDF
  22. CoDaPO: Confidence and Difficulty-Adaptive Policy Optimization for LLM Reasoning

    Zhanke Zhou, Xiangyu Lu, Chentao Cao, Brando Miranda, Tongliang Liu, Bo Han, Sanmi Koyejo · PDF
  23. Cold-Start Personalization via Training-Free Priors from Structured World Models

    Avinandan Bose, Shuyue Stella Li, Faeze Brahman, Pang Wei Koh, Simon Shaolei Du, Yulia Tsvetkov, Maryam Fazel, Lin Xiao, Asli Celikyilmaz · PDF
  24. Constructive Specification for Plug-and-Play Learnware Agents

    Jian-Dong Liu, Zi-Chen Zhao, Hao Sun, Lin-Xing Wu, Huan Zhang, Pengyuan Wang, ZhaoMing, Xinyu Chu, Shu Yan, Yongbei Zhu, Weijun Zhong, Zhi-Hao Tan, SHANG JING, Yang Yu, Zhi-Hua Zhou · PDF
  25. Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

    Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Hangrui Zhou, Alvin Cheung, Joseph E. Gonzalez, Ion Stoica · PDF
  26. CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

    Xiangru Jian, Shravan Nayak, Kevin Qinghong Lin, Aarash Feizi, Kaixin Li, Patrice Bechard, Spandana Gella, Sai Rajeswar · PDF
  27. DéjàQ: Open-Ended Evolution of Diverse, Learnable and Verifiable Problems

    Willem Röpke, Samuel Coward, Andrei Lupu, Thomas Foster, Tim Rocktäschel, Jakob Nicolaus Foerster · PDF
  28. DETACH: Cross-domain Learning for Long-Horizon Tasks via Mixture of Disentangled Experts

    Yutong Shen, Hangxu Liu, Lei Zhang, Penghui Liu, Ruizhe Xia, Tongtong Feng · PDF
  29. DomusMind: A Benchmark for Evaluating Lifelong Smart Home Agents Under Drift

    Rong Xu, Yinxin Wan, Xiaochan Xue · PDF
  30. DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal

    Peixuan Han, YingJie Yu, Jingjun Xu, Jiaxuan You · PDF
  31. DSGym: A Standardized and Holistic Framework for Advancing Data Science Agents

    Fan Nie, Junlin Wang, Harper Hua, Federico Bianchi, Yongchan Kwon, Zhenting Qi, Owen Queen, Shang Zhu, James Zou · PDF
  32. Efficient Tree-Structured Deep Research with Adaptive Resource Allocation

    Lunyiu Nie, Nedim Lipka, Ryan A. Rossi, Swarat Chaudhuri · PDF
  33. Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

    Wenqi Zhang, Mengna Wang, Gangao Liu, Huixin Xu, Yiwei Jiang, Yongliang Shen, Guiyang Hou, Zhe Zheng, Hang Zhang, Xin Li, Jiajun Liu, Weiming Lu, Peng Li, Yueting Zhuang · PDF
  34. ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

    Qineng Wang, Wenlong Huang, Yu Zhou, Hang Yin, Tianwei Bao, Jianwen Lyu, Weiyu Liu, Ruohan Zhang, Jiajun Wu, Li Fei-Fei, Manling Li · PDF
  35. EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

    Shiva Krishna Reddy Malay, Shravan Nayak, Jishnu Sethumadhavan Nair, Sagar Davasam, Aman Tiwari, Sathwik Tejaswi Madhusudhan, Sridhar Krishna Nemala, Srinivas Sunkara, Sai Rajeswar · PDF
  36. EvoTac: A Self-Evolving LLM Agent for Eliciting Reusable Tacit Negotiation Heuristics from Terminal Outcomes

    Runjie Shen, Zhilong Li, Bingzhe Wu · PDF
  37. ExecTune: Effective Steering of Black-Box LLMs with Guide Models

    Vijay Lingam, Aditya Golatkar, Anwesan Pal, Ben Vo, Narayanan Sadagopan, Alessandro Achille, Jun Huan, Anoop Deoras, Stefano Soatto · PDF
  38. Expanding the Capabilities of Reinforcement Learning via Text Feedback

    Yuda Song, Lili Chen, Fahim Tajwar, Rémi Munos, Deepak Pathak, Drew Bagnell, Aarti Singh, Andrea Zanette · PDF
  39. Experiential Reinforcement Learning

    Taiwei Shi, Sihao Chen, Bowen Jiang, Linxin Song, Longqi Yang, Jieyu Zhao · PDF
  40. Federated Agent Reinforcement Learning

    Canyu Chen, Kangyu Zhu, Zhaorun Chen, Zhanhui Zhou, Shizhe Diao, Yiping Lu, Tian Li, Manling Li, Dawn Song · PDF
  41. FocusAgent: Simple Yet Effective Ways Of Trimming The Large Context of Web Agents

    Imene Kerboua, Sahar Omidi Shayegan, Xing Han Lù, Léo Boisvert, Megh Thakkar, Massimo Caccia, Jérémy Espinas, Alex Aussem, Véronique Eglin, Alexandre Lacoste · PDF
  42. From Word to World: Can Large Language Models be Implicit Text-based World Models?

    Yixia Li, Hongru WANG, Jiahao Qiu, Zhenfei Yin, Dongdong Zhang, Cheng Qian, Zeping Li, Xiaoteng Ma, Guanhua Chen, Heng Ji · PDF
  43. GASP: Guided Asymmetric Self-Play For Coding LLMs

    Swadesh Jana, Cansu Sancaktar, Tomáš Daniš, Georg Martius, Antonio Orvieto, Pavel Kolev · PDF
  44. GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

    Jiacheng Guo, Ling Yang, Peter Chen, Qixin Xiao, Yinjie Wang, Xinzhe Juan, Jiahao Qiu, Ke Shen, Mengdi Wang · PDF
  45. Generative Control as Optimization: Time Unconditional Flow Matching for Adaptive and Robust Robotic Control

    Zunzhe Zhang, Runhan Huang, Yicheng Liu, Shaoting Zhu, Linzhan Mou, Hang Zhao · PDF
  46. Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing

    Zhaotian Weng, Antonis Antoniades, Deepak Nathani, Zhen Zhang, Sophia Xiao Pu, Xin Eric Wang · PDF
  47. Hierarchical Agenda Reasoning for Strategic Multi-Turn Dialogue Agents

    Marwa Abdulhai, Ryan Cheng, Aryansh Shrivastava, Aviral Kumar, Sergey Levine · PDF
  48. Human-Guided Harm Recovery for Computer Use Agents

    Christy Li, Sky CH-Wang, Andi Peng, Andreea Bobu · PDF
  49. InfoPO: Information-Driven Policy Optimization for User-Centric Agents

    Fanqi Kong, Jiayi Zhang, Mingyi Deng, Chenglin Wu, Yuyu Luo, Bang Liu · PDF
  50. Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals

    Achyutha Menon, Magnus Saebo, Tyler Crosse, Spencer Gibson, Eyon Jang, Diogo Cruz · PDF
  51. Intrinsic Credit Assignment for Long Horizon Interaction

    Ilze Amanda Auzina, Joschka Strüber, Sergio Hernández-Gutiérrez, Shashwat Goel, Ameya Prabhu, Matthias Bethge · PDF
  52. Learning Agent Routing From Early Experience

    Yimin Wang, Jiahao Qiu, Xuan Qi, Xinzhe Juan, Jingzhe Shi, Zelin Zhao, Hongru WANG, Shilong Liu, Mengdi Wang · PDF
  53. Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks

    Cheng Yang, Xuemeng Yang, Licheng Wen, Daocheng Fu, Jianbiao Mei, Rong Wu, Pinlong Cai, Yufan Shen, Nianchen Deng, Jia Xu, Botian Shi, Yu Qiao, Haifeng Li · PDF
  54. Learning Physical Principles from Interaction: Self-Evolving Embodied Planning via Test-Time Memory

    Haoyang Li, Yang You, Hao Su, Leonidas Guibas · PDF
  55. Learning to Evolve: Scaling Open-Ended Discovery with Relative-Progress RL

    Xuan Li, Zhanke Zhou, Zongze Li, Jiangchao Yao, Bo Han · PDF
  56. Learning to Self-Evolve

    Xiaoyin Chen, Canwen Xu, Yite Wang, Boyi Liu, Zhewei Yao, Yuxiong He · PDF
  57. Learning Transferable Skills in Action RPGs via Directed Skill Graphs and Selective Adaptation

    Ali Najar · PDF
  58. Learning What to Learn: Curriculum Curation for Test-Time Agent Learning

    Qizheng Zhang, Sherry Ruan, Shubhangi Upasani, Fenglu Hong, Changxiu Ji, Changran Hu, Bo Li, Hanchen Li, Kunle Olukotun · PDF
  59. LHAW: Controllable Underspecification for Long-Horizon Tasks

    George Pu, Michael S. Lee, Udari Madhushani Sehwag, David J. Lee, Bryan Zhu, Yash Maurya, Mohit Raghavendra, Yuan Xue, Samuel Marc Denton · PDF
  60. Lifelong Contextual Safety Alignment at Test Time for Multi-Modal LLMs

    Ce Zhang, Jinxi He, Junyi He, Katia P. Sycara, Yaqi Xie · PDF
  61. Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation

    Peter Baile Chen, Yi Zhang, Dan Roth, Samuel Madden, Jacob Andreas, Mike Cafarella · PDF
  62. MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

    lu Yang, Zelai Xu, Minyang Xie, Jiaxuan Gao, zhao shok, Yu Wang, Yi Wu · PDF
  63. Mem²Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation

    Zihao Cheng, Zeming Liu, Yingyu Shan, Xinyi Wang, Xiangrong Zhu, Yunpu Ma, Hongru WANG, Yuhang Guo, Wei Lin, Yunhong Wang · PDF
  64. MemoryCD: Benchmarking Long-Context User Memory of LLM Agents for Lifelong Cross-Domain Personalization

    Weizhi Zhang, Xiaokai Wei, Wei-Chieh Huang, Zheng Hui, Chen Wang, Michelle Gong, Philip S. Yu · PDF
  65. MindZero: Learning Online Mental Reasoning With Zero Annotations

    Shunchi Zhang, Jin Lu, Chuanyang Jin, Yichao Zhou, Zhining Zhang, Tianmin Shu · PDF
  66. MobileMem: Evaluating Long-Horizon Memory for Language Agents in Real-World Mobile Environments

    Xinle Deng, Yida Xue, Yijun Chen, Mingjun Mao, Ruobin Zhong, Buqiang Xu, Jizhan Fang, Haoming Xu, Tingwei Wu, Yajing Xu, Shumin Deng, Haofen Wang, Huajun Chen, Ningyu Zhang · PDF
  67. Narrow Fine-Tuning Erodes Safety Alignment in Vision-Language Agents

    Idhant Gulati, Shivam Raval · PDF
  68. Navigating the Cost-Performance Pareto Frontier of Test-Time LLM Agent Adaptation

    Konrad Szafer, Xiaozhe Yao, Maximilian Böther, Gregor Bachmann, Tiago Pimentel, Ana Klimovic · PDF
  69. Not All Clients Are Equal: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

    Minhyuk Seo, Taeheon Kim, Hankook Lee, Jonghyun Choi, Tinne Tuytelaars · PDF
  70. Noticing the Watcher: LLM Agents Can Infer CoT Monitoring from Blocking Feedback

    Thomas Jiralerspong, Flemming Kondrup, Yoshua Bengio · PDF
  71. OceanGym: Evaluating Language-Grounded Embodied Agents in Underwater Environments

    Yida Xue, Mingjun Mao, Xiangyuan Ru, Yuqi Zhu, Baochang Ren, Shuofei Qiao, Mengru Wang, Shumin Deng, Xinyu An, Ningyu Zhang, Ying Chen, Huajun Chen · PDF
  72. On Group Relative Policy Optimization Collapse in Agent Search: The Lazy Likelihood-Displacement

    Wenlong Deng, Yushu Li, Boying Gong, Yi Ren, Christos Thrampoulidis, Xiaoxiao Li · PDF
  73. On Path to Multimodal Historical Reasoning: HistBench and HistAgent

    Jiahao Qiu, Fulian Xiao, Yimin Wang, Yuchen Mao, Yijia Chen, Xinzhe Juan, Siran Wang, Xuan Qi, Tongcheng Zhang, Zixin Yao, Jiacheng Guo, Yifu Lu, Charles Argon, Jundi Cui, Daixin Chen, Junran Zhou, Shuyao Zhou, Zhanpeng Zhou, Ling Yang, Shilong Liu, Hongru WANG, Kaixuan Huang, xun jiang, Xi Gao, Mengdi Wang · PDF
  74. One Model, Many Goals: Meta-Learning Preference-Conditioned Alignment for Lifelong LLM Agents

    Fatemeh Nourzad, Daouda Sow, Yingbin Liang, Ming Shi, Ming Zhang, Yunxuan Li, Eylem Ekici, Ness Shroff · PDF
  75. PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution

    Minghao Yan, Bo Peng, Benjamin Coleman, Ziqi Chen, Zhouhang Xie, Shuo Chen, Zhankui He, Noveen Sachdeva, Isabella Ye, Weili Wang, Chi Wang, Ed H. Chi, Fernando Pereira, Wang-Cheng Kang, Derek Zhiyuan Cheng, Beidou Wang · PDF
  76. Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

    Zhihan Liu, Lin Guan, Yixin Nie, Kai Zhang, Zoey Hao, Lin Chen, Asli Celikyilmaz, Zhaoran Wang, Na Zhang · PDF
  77. PEARL: Self-Evolving Assistant for Time Management

    Bingxuan Li, Jeonghwan Kim, Cheng Qian, Xiusi Chen, Eitan Anzenberg, Niran Kundapur, Heng Ji · PDF
  78. PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

    Ke Yang, Zixi Chen, Xuan He, Jize Jiang, Michel Galley, Chenglong Wang, Jianfeng Gao, Jiawei Han, ChengXiang Zhai · PDF
  79. PolicyBank: Evolving Policy Understanding For Evolving Agents

    Jihye Choi, Jinsung Yoon, Long T. Le, Somesh Jha, Tomas Pfister · PDF
  80. Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization

    Yihang Yao, Zhepeng Cen, Haohong Lin, Shiqi Liu, Zuxin Liu, Jiacheng Zhu, Zhang-Wei Hong, Laixi Shi, Ding Zhao · PDF
  81. ReCreate: Reasoning and Creating Domain Agents Driven by Experience

    Zhezheng Hao, Hong Wang, Jian Luo, Jianqing Zhang, Yuyan Zhou, Qiang Lin, Can Wang, Hande Dong, Jiawei Chen · PDF
  82. ReMix: Reinforcement Routing for Mixtures of LoRAs in LLM Finetuning

    Ruizhong Qiu, Hanqing Zeng, Yinglong Xia, Yiwen Meng, Ren Chen, Jiarui Feng, Dongqi Fu, Qifan Wang, Jiayi Liu, Jun Xiao, Xiangjun Fan, Benyu Zhang, Hong Li, Zhining Liu, Hyunsik Yoo, Zhichen Zeng, Tianxin Wei, Hanghang Tong · PDF
  83. Residual Off-Policy RL for Finetuning Behavior Cloning Policies

    Lars Lien Ankile, Zhenyu Jiang, Rocky Duan, Guanya Shi, Pieter Abbeel, Anusha Nagabandi · PDF
  84. ScenDroid: A Scenario-Level Benchmark for Long-Horizon, Time-Evolving GUI Agents

    Zhe Wu, Yongxin Kang, Dabin Sheng, Junliang Xing, Guokun Wu, Derek Yuen, Donglin Mo, Yuheng Jing, Kai Li, Weilin Luo, Kun Shao, Yuanchun Shi · PDF
  85. Self-Distillation Enables Continual Learning

    Idan Shenfeld, Mehul Damani, Jonas Hübotter, Pulkit Agrawal · PDF
  86. Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

    Siyan Zhao, Zhihui Xie, Mengchen Liu, Jing Huang, Guan Pang, Feiyu Chen, Aditya Grover · PDF
  87. Self-Questioning Language Models

    Lili Chen, Mihir Prabhudesai, Katerina Fragkiadaki, Hao Liu, Deepak Pathak · PDF
  88. SimpleMem: Efficient Lifelong Memory for LLM Agents

    Jiaqi Liu, Yaofeng Su, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao · PDF
  89. SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

    Peng Xia, Jianwen Chen, Hanyang Wang, Jiaqi Liu, Kaide Zeng, Yu Wang, Siwei Han, Yiyang Zhou, Xujiang Zhao, Haifeng Chen, Zeyu Zheng, Cihang Xie, Huaxiu Yao · PDF
  90. Streaming Memory Benchmark: Stage-level Diagnosis with Evidence Dependency Control

    Guanming Liu, Haoran Yin, LITIANCHEN, Sikuan Yan, Hongru WANG, Baian Chen, Xiaoteng Ma · PDF
  91. SWITCH: Benchmarking Interaction and Verification on Real-World Interfaces in Lifelong Embodied Agents

    Jieru Lin, Zhiwei Yu, Börje F. Karlsson · PDF
  92. The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios

    Daocheng Fu, Jianbiao Mei, Rong Wu, Xuemeng Yang, Jia Xu, Yufan Shen, Ding Wang, Pinlong Cai, Yong Liu, Licheng Wen, Botian Shi · PDF
  93. The Hidden Costs of Domain Fine-Tuning: PII-Bearing Data Degrades Safety and Increases Leakage

    Jayesh Choudhari, Piyush Singh · PDF
  94. TSR: Trajectory-Search Rollouts for Multi-Turn RL of LLM Agents

    Aladin Djuhera, Swanand Ravindra Kadhe, Farhan Ahmed, Heiko Ludwig, Holger Boche · PDF
  95. TTCS: Test-Time Curriculum Synthesis for Self-Evolving

    Chengyi Yang, Zhishang Xiang, Yunbo Tang, Zongpei Teng, Chengsong Huang, Yuhan Liu, Jinsong Su · PDF
  96. Understanding Knowledge Acquisition and Release in Language Models via Circuits

    Kiran Raja, Arav Maheria, Andrew Bae, Alan Sun · PDF
  97. Understanding Reasoning Collapse in Multi-Turn Agent Reinforcement Learning

    Zihan Wang, Chi Gui, Xing Jin, Qineng Wang, Licheng Liu, Kangrui Wang, Shiqi Chen, Linjie Li, Zhengyuan Yang, Pingyue Zhang, Yiping Lu, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li · PDF
  98. Universe Routing: Why Self-Evolving Agents Need Epistemic Control

    Zhaohui Geoffrey Wang · PDF
  99. Verifying the Verifiers: Failure Attribution for Agentic Benchmark Diagnostics and Training Data Curation

    Jesse Hu, Pratyush Shukla, Ke Huang · PDF
  100. VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

    Dongfu Jiang, Yi Lu, Zhuofeng Li, Zhiheng Lyu, Ping Nie, Haozhe Wang, Alex Su, Hui Chen, Kai Zou, Chao Du, Tianyu Pang, Wenhu Chen · PDF
  101. Weasel: Out-of-Domain Generalization for Web Agents via Importance-Diversity Data Selection

    Fatemeh Pesaran zadeh, Seyeon Choi, Xing Han Lù, Siva Reddy, Gunhee Kim · PDF
  102. When Drafts Evolve: Speculative Decoding Meets Online Learning

    Yu-Yang Qian, Hao-Cong Wu, Yichao Fu, Hao Zhang, Peng Zhao · PDF
  103. Which Memory Operation Drives Recovery? A Factorial Study of Retrieve, Write, and Manage Adaptation under Domain Shift

    Zhaoxiang Feng, Mingyang Yao, Charlie Sun, David Scott Lewis · PDF
  104. Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections

    XIANGLIN YANG, Yufei He, Shuo Ji, Bryan Hooi, Jin Song Dong · PDF