ICML 2024 Workshop: Foundations of Reinforcement Learning and Control -- Connections and Perspectives
FoRLaC
- Submission deadline: May 30, 2024, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (72)
Fetched from OpenReview (v2) on 2026-06-10.
- $\alpha$-Fair Contextual Bandits
- A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
- A Policy Optimization Approach to the Solution of Unregularized Mean Field Games
- A Pontryagin Perspective on Reinforcement Learning
- A safe exploration approach to constrained Markov decision processes
- A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $\Theta(T^{2/3})$ and its Application to Best-of-Both-Worlds
- A Variational Formulation of Reinforcement Learning in Infinite-Horizon Markov Decision Processes
- Adaptive Experimental Design for Policy Learning: Contextual Best Arm Identification
- Bandits with Abstention under Expert Advice
- Bandits with Preference Feedback: A Stackelberg Game Perspective
- Bridging Distributional and Risk-Sensitive Reinforcement Learning: Balancing Statistical, Computational, and Risk Considerations
- Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage
- Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals
- Certifying robustness to adaptive data poisoning
- Chained Information-Theoretic Bounds and Tight Regret Rate for Linear Bandit Problems
- Combining Neural Networks and Symbolic Regression for Analytical Lyapunov Function Discovery
- Compatible Gradient Approximations for Actor-Critic Algorithms
- CPeSFA: Empowering SFs for Policy Learning and Transfer in Continuous Action Spaces
- DARE: The Deep Adaptive Regulator for Control of Uncertain Continuous-Time Systems
- DeePC-Hunt: Data-enabled Predictive Control Hyperparameter Tuning via Differentiable Optimization
- Defending Against Unknown Corrupted Agents: Reinforcement Learning of Adversarially Robust Nash Equilibria
- Distributional Monte-Carlo Planning with Thompson Sampling in Stochastic Environments
- Essentially Sharp Estimates on the Entropy Regularization Error in Discounted Markov Decision Processes
- Event-Based Federated Q-Learning
- Exploring Integrality Grip for Mixed-integer Programming by MCTS Planning
- Finite Sample Identification: From Frequency to Time Domain
- Finite-time convergence to an $\epsilon$-efficient Nash equilibrium in potential games
- Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution
- Hierarchical Reinforcement Learning and Model Predictive Control for Strategic Motion Planning in Autonomous Racing
- Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control
- Identifiable latent bandits: Combining observational data and exploration for personalized healthcare
- Improved Algorithms for Contextual Dynamic Pricing
- Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning
- Learning Nash Equilibria in Zero-Sum Markov Games: A Single-Timescale Algorithm Under Weak Reachability
- Learning to Explore with Lagrangians for Bandits under Unknown Constraints
- Learning When to Trust the Expert for Guided Exploration in RL
- Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy
- Model Based Diffusion for Trajectory Optimization
- Multiple-policy Evaluation via Density Estimation
- NEORL: Efficient Exploration for Nonepisodic RL
- Neural Dueling Bandits
- Non-ergodicity in reinforcement learning: robustness via ergodicity transformations
- Non-Linear $H_\infty$ Robustness Guarantees for Neural Network Policies
- On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
- On Robust Reinforcement Learning with Lipschitz-Bounded Policy Networks
- Online Optimization of Closed-Loop Control Systems
- Online Performance Optimization of Nonlinear Systems: A Gray-Box Approach
- Optimality of Stationary Policies in Risk-averse Total-reward MDPs with EVaR
- Optimistic Information Directed Sampling
- Partial Structure Discovery is Sufficient for No-regret Learning in Causal Bandits
- Pink Noise LQR: How does Colored Noise affect the Optimal Policy in RL?
- Power Mean Estimation in Stochastic Monte-Carlo Tree Search
- Preference Elicitation for Offline Reinforcement Learning
- Randomized Confidence Bounds for Stochastic Partial Monitoring
- Recommender System Design via Online Feedback Optimization
- Recurrent Natural Policy Gradient for POMDPs
- Reinforcement Learning of Adaptive Acquisition Policies for Inverse Problems
- Reinforcement Learning with Lookahead Information
- Reinforcement Learning with Quasi-Hyperbolic Discounting
- Robust Best-of-Both-Worlds Gap Estimators Based on Importance-Weighted Sampling
- Safe online nonstochastic control from data
- Safe Reinforcement Learning with Contrastive Risk Prediction
- SMX: Sequential Monte Carlo Planning for Expert Iteration
- Sum-Max Submodular Bandits
- The Minimax Regret of Sequential Probability Assignment, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood
- The Value of Reward Lookahead in Reinforcement Learning
- Tight Bounds for Online Convex Optimization with Adversarial Constraints
- Towards Empowerment Gain through Causal Structure Learning in Model-Based RL
- Truly No-Regret Learning in Constrained MDPs
- Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning
- Variance-Dependent Regret Bounds for Nonstationary Linear Bandits
- When is Mean-Field Reinforcement Learning Tractable and Relevant?