NeurIPS 2025 Workshop on Embodied World Models for Decision Making
NeurIPS 2025 Workshop EWM
- Submission deadline: Sep 3, 2025, 23:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich (topics are keyword-guessed).
Accepted papers (51)
Fetched from OpenReview (v2) on 2026-06-10.
- A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search
- Abstract Sim2Real through Approximate Information States
- Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making
- Adversarial Diffusion for Robust Reinforcement Learning
- Avi: A 3D Vision-Language Action Model Architecture generating Action from Volumetric Inference
- Beyond Experience: Fictive Learning as an Inherent Advantage of World Models
- Bridging the Sim-to-Real Gap in Humanoid Dynamics via Learned Nonlinear Operators
- Communicating Plans, Not Percepts: Scalable Multi-Agent Coordination with Embodied World Models
- Coupled Distributional Random Expert Distillation for World Model Online Imitation Learning
- CRISP: Contact-guided Real2Sim from Monocular Video with Planar Scene Primitives
- Decoupled Planning and Execution with LLM-Driven World Models for Efficient Reinforcement learning
- Divide and Merge: Motion and Semantic Learning in End-to-End Autonomous Driving
- EnerVerse-AC: Envisioning Embodied Environments with Action Condition
- Exploring exploration with foundation agents in interactive environments
- FalconWing: An Ultra-Light Fixed-Wing Platform for Indoor Aerial Applications
- FLAM: Scaling Latent Action World Models with Factorization
- Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds
- Generative World Models of Tasks: LLM-Driven Hierarchical Scaffolding for Embodied Agents
- Geosteering Through the Lens of Decision Transformers: Toward Embodied Sequence Decision-Making
- HDFlow: Hierarchical Diffusion-Flow Planning for Long-horizon Robotic Assembly
- How Foundational Skills Influence VLM-based Embodied Agents: A Native Perspective
- Improvisational Reasoning with Vision-Language Models for Grounded Procedural Planning
- In-Context Policy Iteration for Dynamic Manipulation
- Latent Weight Diffusion: Generating reactive policies instead of trajectories
- Learning to Focus: Prioritizing Informative Histories with Structured Attention Mechanisms in Partially Observable Reinforcement Learning
- LLM-Guided Probabilistic Program Induction for POMDP Model Estimation
- Mobile Manipulation with Active Inference for Long-Horizon Rearrangement Tasks
- NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows
- OpenGVL - Benchmarking Visual Temporal Progress for Data Curation
- Opinion: A Unified World Model is the cornerstone for integrating perception, reasoning, and decision-making in embodied AI
- Opinion: How Can Causal AI Benefit World Models?
- Opinion: Learning Intuitive Physics May Require More Than Visual Data
- Opinion: Small VLAs Self-Learn Consistency
- Opinion: Towards Unified Expressive Policy Optimization for Robust Robot Learning
- Plan Verification for LLM-Based Embodied Task Completion Agents
- PolicyGRID: Acting to Understand, Understanding to Act
- RDAR: Reward-Driven Agent Relevance Estimation for Autonomous Driving
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics
- ROPES: Robotic Pose Estimation via Score-based Causal Representation Learning
- ScenePhys — Controllable Physics Videos for World-Model Evaluation
- Sim-to-Real Contact-Rich Pivoting via Optimization-Guided RL with Vision and Touch
- SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards
- SPUR: Scaling Reward Learning from Human Demonstrations
- Stable Planning through Aligned Representations in Model-Based Reinforcement Learning
- Steering Diffusion Policies with Value-Guided Denoising
- The Physical Basis of Prediction: World Model Formation in Neural Organoids via an LLM-Generated Curriculum
- Towards Fine-tuning a Small Vision-Language Model for Aerial Navigation
- ViPRA: Video Prediction for Robot Actions
- Vision-Language Reasoning for Burn Depth Assessment with Structured Diagnostic Hypotheses
- VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models
- WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making