CVPR 2025 · Past · Large language models · Agents · Robotics

Workshop on Foundation Models Meet Embodied Agents at CVPR 2025

FMEA @ CVPR 2025

Submission deadline: May 26, 2025, 19:00 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich (topics are keyword-guessed).

Accepted papers (19)

Fetched from OpenReview (v2) on 2026-06-10.

  1. 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model

    Wenbo Hu, Yining Hong, Yanjun Wang, Leison Gao, Zibu Wei, Xingcheng Yao, Nanyun Peng, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang · PDF
  2. AetherVision-Bench: An Open-Vocabulary RGB-Infrared Benchmark for Multi-Angle Segmentation across Aerial and Ground Perspectives

    Aniruddh Sikdar, Aditya Gandhamal, Suresh Sundaram · PDF
  3. Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning

    Bosung Kim, Prithviraj Ammanabrolu · PDF
  4. Embodied AI with Knowledge Graphs: Material-Aware Obstacle Handling for Autonomous Agents

    Ayush Bheemaiah, Seungyong Yang · PDF
  5. Episodic Memory Banks for Lifelong Robot Learning: A Case Study Focusing on Household Navigation and Manipulation

    Zichao Li · PDF
  6. Human-like Navigation in a World Built for Humans

    Bhargav Chandaka, Gloria X. Wang, Haozhe Chen, Henry Che, Albert J. Zhai, Shenlong Wang · PDF
  7. Interactive Post-Training for Vision-Language-Action Models

    Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Kraehenbuehl · PDF
  8. Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation

    Lingfeng Zhang, Yuecheng Liu, Zhanguang Zhang, Matin Aghaei, Yaochen Hu, Hongjian Gu, Mohammad Ali Alomrani, David Gamaliel Arcos Bravo, Raika Karimi, Atia Hamidizadeh, Haoping Xu, Guowei Huang, Zhanpeng Zhang, Tongtong Cao, Weichao Qiu, Xingyue Quan, Jianye HAO, Yuzheng Zhuang, Yingxue Zhang · PDF
  9. Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving

    Haohong Lin, Yunzhi Zhang, Wenhao Ding, Jiajun Wu, Ding Zhao · PDF
  10. One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration

    Jinbang Huang, Yixin Xiao, Zhanguang Zhang, Mark Coates, Jianye HAO, Yingxue Zhang · PDF
  11. Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

    Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie · PDF
  12. Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations

    Shivansh Patel, Shraddhaa Mohan, Hanlin Mai, Unnat Jain, Svetlana Lazebnik, Yunzhu Li · PDF
  13. SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation

    Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan · PDF
  14. Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction

    Baiting Luo, Abhishek Dubey, Ayan Mukhopadhyay · PDF
  15. SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models

    Arnab Debnath, Gregory J. Stein, Jana Kosecka · PDF
  16. Slot-Level Robotic Placement via Visual Imitation from Single Human Video

    Dandan Shan, Kaichun Mo, Wei Yang, Yu-Wei Chao, David Fouhey, Dieter Fox, Arsalan Mousavian · PDF
  17. TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation

    Navid Rajabi, Jana Kosecka · PDF
  18. Visual Planning: Let's Think Only with Images

    Yi Xu, Chengzu Li, Han Zhou, Xingchen Wan, Caiqi Zhang, Anna Korhonen, Ivan Vulić · PDF
  19. ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos

    Junyao Shi, Zhuolun Zhao, Tianyou Wang, Ian Pedroza, Amy Luo, Jie Wang, Yecheng Jason Ma, Dinesh Jayaraman · PDF