ICML 2024 · Past · Large language models · Reinforcement learning

Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs

AutoRL@ICML 2024

Submission deadline
Jun 1, 2024, 12:59 UTC
imported from OpenReview — check the website for extensions
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).

Accepted papers (26)

Fetched from OpenReview (v2) on 2026-06-10.

  1. Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning

    Théo Vincent, Fabian Wahren, Jan Peters, Boris Belousov, Carlo D'Eramo · PDF
  2. Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL

    Eduardo Pignatelli, Johan Ferret, Davide Paglieri, Samuel Coward, Tim Rocktäschel, Edward Grefenstette, Laura Toni · PDF
  3. BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

    Yu Heng Hung, Kai-Jie Lin, Yu-Heng Lin, Chien-Yi Wang, Ping-Chun Hsieh · PDF
  4. Can Learned Optimization Make Reinforcement Learning Less Difficult?

    Alexander D. Goldie, Chris Lu, Matthew Thomas Jackson, Shimon Whiteson, Jakob Nicolaus Foerster · PDF
  5. Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels

    Zhuorui Ye, Stephanie Milani, Fei Fang, Geoff Gordon · PDF
  6. Conditional Meta-Reinforcement Learning with State Representation

    Yuxuan Sun, Laura Toni, Yiannis Andreopoulos · PDF
  7. DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

    Yifei Zhou, Hao Bai, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar · PDF
  8. Discovering Preference Optimization Algorithms with and for Large Language Models

    Chris Lu, Samuel Holt, Claudio Fanconi, Alex James Chan, Jakob Nicolaus Foerster, Mihaela van der Schaar, Robert Tjarko Lange · PDF
  9. Distilling LLMs’ Decomposition Abilities into Compact Language Models

    Denis Tarasov, Kumar Shridhar · PDF
  10. DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

    Anthony Liang, Guy Tennenholtz, ChihWei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier · PDF
  11. GPT-HyperAgent: Scalable Uncertainty Estimation and Exploration for Foundation Model Decisions

    Yingru Li, Jiawei Xu, Zhi-Quan Luo · PDF
  12. Higher Order and Self-Referential Evolution for Population-based Methods

    Samuel Coward, Chris Lu, Alistair Letcher, Minqi Jiang, Jack Parker-Holder, Jakob Nicolaus Foerster · PDF
  13. Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models

    Cong Lu, Shengran Hu, Jeff Clune · PDF
  14. Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?

    Denis Tarasov, Kirill Brilliantov, Dmitrii Kharlapenko · PDF
  15. Learning In-Context Decision Making with Synthetic MDPs

    Akarsh Kumar, Chris Lu, Louis Kirsch, Phillip Isola · PDF
  16. Recursive Introspection: Teaching Foundation Model Agents How to Self-Improve

    Yuxiao Qu, Tianjun Zhang, Naman Garg, Aviral Kumar · PDF
  17. Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

    Shenao Zhang, Donghan Yu, Hiteshi Sharma, Ziyi Yang, Shuohang Wang, Hany Hassan Awadalla, Zhaoran Wang · PDF
  18. Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity

    Vahid Balazadeh, Keertana Chidambaram, Viet Nguyen, Rahul Krishnan, Vasilis Syrgkanis · PDF
  19. Skill-Enhanced Reinforcement Learning Acceleration from Demonstrations

    Hanping Zhang, Yuhong Guo · PDF
  20. Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency

    Yanxiao Zhao, Yangge Qian, Tianyi Wang, Jingyang Shan, Xiaolin Qin · PDF
  21. Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

    Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu · PDF
  22. STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

    Chuanhao Li, Runhan Yang, Tiankai Li, Milad Bafarassat, Kourosh Sharifi, Dirk Bergemann, Zhuoran Yang · PDF
  23. Trace is the New AutoDiff — Unlocking Efficient Optimization of Computational Workflows

    Ching-An Cheng, Allen Nie, Adith Swaminathan · PDF
  24. Unfamiliar Finetuning Examples Control How Language Models Hallucinate

    Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine · PDF
  25. Vision-Language Models Provide Promptable Representations for Reinforcement Learning

    William Chen, Oier Mees, Aviral Kumar, Sergey Levine · PDF
  26. XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

    Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Artem Sergeevich Agarkov, Viacheslav Sinii, Sergey Kolesnikov · PDF