ICML 2024 · Past · Large language models · Reinforcement learning
Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs
AutoRL@ICML 2024
- Submission deadline: Jun 1, 2024, 12:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (26)
Fetched from OpenReview (v2) on 2026-06-10.
- Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
- Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL
- BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
- Can Learned Optimization Make Reinforcement Learning Less Difficult?
- Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels
- Conditional Meta-Reinforcement Learning with State Representation
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
- Discovering Preference Optimization Algorithms with and for Large Language Models
- Distilling LLMs’ Decomposition Abilities into Compact Language Models
- DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
- GPT-HyperAgent: Scalable Uncertainty Estimation and Exploration for Foundation Model Decisions
- Higher Order and Self-Referential Evolution for Population-based Methods
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
- Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?
- Learning In-Context Decision Making with Synthetic MDPs
- Recursive Introspection: Teaching Foundation Model Agents How to Self-Improve
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
- Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
- Skill-Enhanced Reinforcement Learning Acceleration from Demonstrations
- Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency
- Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
- STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making
- Trace is the New AutoDiff — Unlocking Efficient Optimization of Computational Workflows
- Unfamiliar Finetuning Examples Control How Language Models Hallucinate
- Vision-Language Models Provide Promptable Representations for Reinforcement Learning
- XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX