ICML 2024 · Past · Large language models · Reinforcement learning

Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs

AutoRL@ICML 2024


Submission deadline: Jun 1, 2024, 12:59 UTC
imported from OpenReview — check the website for extensions
Submission portal: OpenReview
Notes: Topics were auto-suggested and may be imprecise — edits welcome.

Accepted papers (26)

Fetched from OpenReview (v2) on 2026-06-10.

Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
Théo Vincent, Fabian Wahren, Jan Peters, Boris Belousov, Carlo D'Eramo · PDF
Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL
Eduardo Pignatelli, Johan Ferret, Davide Paglieri, Samuel Coward, Tim Rocktäschel, Edward Grefenstette, Laura Toni · PDF
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Yu Heng Hung, Kai-Jie Lin, Yu-Heng Lin, Chien-Yi Wang, Ping-Chun Hsieh · PDF
Can Learned Optimization Make Reinforcement Learning Less Difficult?
Alexander D. Goldie, Chris Lu, Matthew Thomas Jackson, Shimon Whiteson, Jakob Nicolaus Foerster · PDF
Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels
Zhuorui Ye, Stephanie Milani, Fei Fang, Geoff Gordon · PDF
Conditional Meta-Reinforcement Learning with State Representation
Yuxuan Sun, Laura Toni, Yiannis Andreopoulos · PDF
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
Yifei Zhou, Hao Bai, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar · PDF
Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Lu, Samuel Holt, Claudio Fanconi, Alex James Chan, Jakob Nicolaus Foerster, Mihaela van der Schaar, Robert Tjarko Lange · PDF
Distilling LLMs’ Decomposition Abilities into Compact Language Models
Denis Tarasov, Kumar Shridhar · PDF
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
Anthony Liang, Guy Tennenholtz, ChihWei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier · PDF
GPT-HyperAgent: Scalable Uncertainty Estimation and Exploration for Foundation Model Decisions
Yingru Li, Jiawei Xu, Zhi-Quan Luo · PDF
Higher Order and Self-Referential Evolution for Population-based Methods
Samuel Coward, Chris Lu, Alistair Letcher, Minqi Jiang, Jack Parker-Holder, Jakob Nicolaus Foerster · PDF
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Cong Lu, Shengran Hu, Jeff Clune · PDF
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?
Denis Tarasov, Kirill Brilliantov, Dmitrii Kharlapenko · PDF
Learning In-Context Decision Making with Synthetic MDPs
Akarsh Kumar, Chris Lu, Louis Kirsch, Phillip Isola · PDF
Recursive Introspection: Teaching Foundation Model Agents How to Self-Improve
Yuxiao Qu, Tianjun Zhang, Naman Garg, Aviral Kumar · PDF
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Shenao Zhang, Donghan Yu, Hiteshi Sharma, Ziyi Yang, Shuohang Wang, Hany Hassan Awadalla, Zhaoran Wang · PDF
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
Vahid Balazadeh, Keertana Chidambaram, Viet Nguyen, Rahul Krishnan, Vasilis Syrgkanis · PDF
Skill-Enhanced Reinforcement Learning Acceleration from Demonstrations
Hanping Zhang, Yuhong Guo · PDF
Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
Yanxiao Zhao, Yangge Qian, Tianyi Wang, Jingyang Shan, Xiaolin Qin · PDF
Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu · PDF
STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making
Chuanhao Li, Runhan Yang, Tiankai Li, Milad Bafarassat, Kourosh Sharifi, Dirk Bergemann, Zhuoran Yang · PDF
Trace is the New AutoDiff — Unlocking Efficient Optimization of Computational Workflows
Ching-An Cheng, Allen Nie, Adith Swaminathan · PDF
Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine · PDF
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
William Chen, Oier Mees, Aviral Kumar, Sergey Levine · PDF
XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX
Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Artem Sergeevich Agarkov, Viacheslav Sinii, Sergey Kolesnikov · PDF
