CVPR 2026 · Past · Large language models · Agents · Robotics

The 2nd Workshop on Foundation Models Meet Embodied Agents at CVPR 2026

FMEA @ CVPR 2026

Submission deadline
May 11, 2026, 23:59 UTC
Imported from OpenReview; check the website for extensions.
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).

Accepted papers (34)

Fetched from OpenReview (v2) on 2026-06-10.

  1. $Re^2$: Reflective Rule Induction and Rule-Guided Refinement for Embodied Planning

    Yang Chen, Hong-Jie You, Jie-Jing Shao, Xiao-Wen Yang, Ming Yang, Yu-Feng Li, Lan-Zhe Guo · PDF
  2. A Physics-Grounded Benchmark for Multi-Agent Dynamics in World Models

    Nuo Chen, Lulin Liu, Zihao Li, Ziyao Zeng, Zihao Zhu, Wenyan Cong, Junyuan Hong, Yunhao Yang, Zhengzhong Tu, Yan Wang, Boris Ivanovic, Marco Pavone, Zhangyang Wang, Yang Zhou, Zhiwen Fan · PDF
  3. ADeltaM: An Exploratory Counterfactual Delta-Memory Interface for Egocentric Agents

    Liyang Ruan, Jiahao Cao · PDF
  4. Automated Skill Optimization via Formal Verification for Embodied Agents

    Yunhao Yang, Neel P. Bhatt, Kevin Wang, Zhangyang Wang, Ufuk Topcu · PDF
  5. EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training

    Yiyang Du, Zhanqiu Guo, Xin Ye, Liu Ren, Chenyan Xiong · PDF
  6. EvoWorld: A World-Model-Centric Framework for Continuous Self-Evolution of Modular Embodied Skills

    Boshi Zhang, Sen Cui, BaoHua Yin, Youyi Kou, Junyu Wu, Zuo Pu, Tao Xue, Zhikang Chen, Shanshan Wei, Min Zhang, Miao Liu, Changshui Zhang, Zhang Tao · PDF
  7. FunFact: Building Probabilistic Functional 3D Scene Graphs via Factor-Graph Reasoning

    Zhengyu Fu, René Zurbrügg, Kaixian Qu, Marc Pollefeys, Marco Hutter, Hermann Blum, Zuria Bauer · PDF
  8. GeoWorld-VLM: Sequential 3D Generation via Evidential Memory

    Renjie Gu, Kaichen Zhou, Yan Luo, Mengyu Wang · PDF
  9. HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations

    Xiaomeng Xu, Jisang Park, Han Zhang, Eric Cousineau, Aditya Bhat, Jose Barreiros, Dian Wang, Jeannette Bohg, Shuran Song · PDF
  10. Inference-Time Planning with Action-Conditioned Video Models for Generalizable Robot Manipulation

    Zhiting Mei, Yanbo Xu, Tenny Yin, Ola Sho, Anirudha Majumdar · PDF
  11. InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions

    Sirui Xu, Samuel Schulter, Morteza Ziyadi, Xialin He, Xiaohan Fei, Yu-Xiong Wang, Liangyan Gui · PDF
  12. LARE: Low-Attention Region Encoding for Text–Image Retrieval

    Muhammad Kamran J Khan, Abdulmalik Alquwayfili, Faisal AlMeshal, Jumanah Almajnouni, Leena Alotaibi, Huda Abdulhadi Alamri, Raied Aljadaany, Faisal Alhajari, Mohammed Alkhrashi, Alreem Almuhrij, Abdullah Aldwyish · PDF
  13. Learning Situated Awareness in the Real World

    Chuhan Li, Rilyn R. Han, Joy Hsu, Yongyuan Liang, Rajiv Dhawan, Jiajun Wu, Ming-Hsuan Yang, Xin Eric Wang · PDF
  14. Making Your Action Policies Interpretable: Mixture of Action Queries

    Suhyung Choi, Youngseok Joo, Hyundo Lee, Kisung Shin, Kyuhwan Shim, Chungwoo Lee, Minjeong Gu, Jun Ki Lee, Byoung-Tak Zhang · PDF
  15. MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence

    Hilton Raj, Vishnuram AV · PDF
  16. MOSAIC: The Right Modules for Each Task in Embodied Agents

    Kevin Wang, Dweep Trivedi, Vincent Ha, Albert Jiang, Christian Ellis, Ufuk Topcu, Swarat Chaudhuri, Zhangyang Wang · PDF
  17. Multimodal Causal Subtask Modeling for Scalable VLA Pipelines in Long-Horizon Manipulation

    Yina Jian · PDF
  18. PEFT Methods for Embodied VLM Agents: A Systematic Study and MoE-DoRA

    Vishnuram AV, Hilton Raj · PDF
  19. PhysMem: Scaling Test-Time Memory for Embodied Physical Reasoning

    Haoyang Li, Yang You, Hao Su, Leonidas Guibas · PDF
  20. PInVerify: An Offline Embodied Benchmark for Active Instance Verification

    Yuhang Jiang · PDF
  21. PLanAR: Planning-Language-Grounded Agentic Reasoning for Robot Manipulation

    Pengyuan Guo, Zhonghao Mai, Zhengtong Xu, Kaidi Zhang, Quan Khanh Luu, Heng Zhang, Zichen Miao, Arash Ajoudani, Zachary Kingston, Qiang Qiu, Yu She · PDF
  22. RDD: Retrieval-Based Demonstration Decomposer for Planner Alignment in Long-Horizon Tasks

    Mingxuan Yan, Yuping Wang, Zechun Liu, Jiachen Li · PDF
  23. RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

    Yinpei Dai, Hongze Fu, Jayjun Lee, Yuejiang Liu, Haoran Zhang, Jianing Yang, Chelsea Finn, Nima Fazeli, Joyce Chai · PDF
  24. RoboPlayground: Democratizing Robotic Evaluation through Structured Physical Domains

    Yi Ru Wang, Carter Ung, Evan Gubarev, Christopher Tan, Siddhartha Srinivasa, Dieter Fox · PDF
  25. Scene2Demo: Self-Evolving Embodied Data Generation via Object-Action Graph

    Xiang Liu, Sen Cui, Guocai Yao, Zhong Cao, Jingheng Ma, Min Zhang, Miao Liu, Changshui Zhang · PDF
  26. Self-Improving Loops for Visual Robotic Planning

    Calvin Luo, Zilai Zeng, Mingxi Jia, Yilun Du, Chen Sun · PDF
  27. Semantic Horizons: Information-Theoretic Limits of Foundation Model-Guided Embodied Planning

    Siddharth Karuturi, Kaustubh S. Bukkapatnam · PDF
  28. Task-Relevant Depth Quality Metrics for Suction Grasping

    Shivansh Inamdar · PDF
  29. Theory of Space: Benchmarking Active Spatial Belief Construction and Revision in Foundation Models for Embodied Agents

    Pingyue Zhang, Zihan Huang, Yue Wang, Jieyu Zhang, Letian Xue, Zihan Wang, Qineng Wang, Keshigeyan Chandrasegaran, Ruohan Zhang, Yejin Choi, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Manling Li · PDF
  30. TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments

    Zhiyu Huang, Yun Zhang, Johnson Liu, Rui Song, Chen Tang, Jiaqi Ma · PDF
  31. VL-Nav: A Neuro-Symbolic Approach for Reasoning-based Vision-Language Navigation

    Yi Du, Taimeng Fu, Zhipeng Zhao, Shaoshu Su, Zitong Zhan, Qiwei Du, Zhuoqun Chen, Bowen Li, Chen Wang · PDF
  32. VLS: Steering Pretrained Robot Policies via Vision–Language Models

    Shuo Liu, Ishneet Sukhvinder Singh, Yiqing Xu, Jiafei Duan, Ranjay Krishna · PDF
  33. WFM-Eval: Interpretable Error Diagnostics for Video World Models in Robotics

    Sahil Khose, Mengqi Zhang, Prithvijit Chattopadhyay, Judy Hoffman · PDF
  34. When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

    Shoubin Yu, Yue Zhang, Zun Wang, Jaehong Yoon, Huaxiu Yao, Mingyu Ding, Mohit Bansal · PDF