NeurIPS 2024 Workshop on Open-World Agents

Submission deadline: Sep 21, 2024, 00:01 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (97)

Fetched from OpenReview (v2) on 2026-06-10.

  1. 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

    Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David Fouhey, Joyce Chai · PDF
  2. A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

    Thomas Schmied, Thomas Adler, Vihang Prakash Patil, Maximilian Beck, Korbinian Pöppel, Johannes Brandstetter, Günter Klambauer, Razvan Pascanu, Sepp Hochreiter · PDF
  3. A Simplified A Priori Theory Of Meaning, –Nature based AI ‘first principles’–

    Marcus Abundis · PDF
  4. Advancing Agentic Systems: Dynamic Task Decomposition, Tool Integration and Evaluation using Novel Metrics and Dataset

    Shankar Kumar Jeyakumar, Alaa Alameer Ahmad, Adrian Garret Gabriel · PDF
  5. Agent S: An Open Agentic Framework that Uses Computers Like a Human

    Saaket Agashe, Jiuzhou Han, Shuyu Gan, Jiachen Yang, Ang Li, Xin Eric Wang · PDF
  6. Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems

    Tamer Abuelsaad, Deepak Akkil, Prasenjit Dey, Ashish Jagmohan, Aditya Vempaty, Ravi Kokku · PDF
  7. Agentic Anomaly Detection for Shipping

    Alexander Timms, Abigail Langbridge, Fearghal O'Donncha · PDF
  8. Agents Thinking Fast and Slow: A Talker-Reasoner Architecture

    Konstantina Christakopoulou, Shibl Mourad, Maja Mataric · PDF
  9. AgentStudio: A Toolkit for Building General Virtual Agents

    Longtao Zheng, Zhiyuan Huang, Zhenghai Xue, Xinrun Wang, Bo An, Shuicheng YAN · PDF
  10. An Efficient Open World Benchmark for Multi-Agent Reinforcement Learning

    Eric Ye, Natasha Jaques · PDF
  11. Are Expressive Models Truly Necessary for Offline RL?

    Guan Wang, Haoyi Niu, Jianxiong Li, Li Jiang, Jianming HU, Xianyuan Zhan · PDF
  12. Articulated Animal AI: An Environment for Animal-like Cognition in a Limbed Agent

    Jeremy Lucas, Isabeau Prémont-Schwarz · PDF
  13. Automated Design of Agentic Systems

    Shengran Hu, Cong Lu, Jeff Clune · PDF
  14. Automating Thought of Search: A Journey Towards Soundness and Completeness

    Daniel Yiming Cao, Michael Katz, Harsha Kokel, Kavitha Srinivas, Shirin Sohrabi · PDF
  15. Can VLMs Play Action Role-Playing Games? Take Black Myth Wukong as a Study Case

    Peng Chen, Pi Bu, Jun Song, Yuan Gao, Bo Zheng · PDF
  16. CARD: Cross-modal Agent Framework for Generative and Editable Residential Design

    Pengyu Zeng, Maowei Jiang, Zihang Wang, Jizhizi Li, Jun Yin, Shuai Lu · PDF
  17. Chain-of-Imagination for Reliable Instruction Following in Decision Making

    Enshen Zhou, Yiran Qin, Zhenfei Yin, Yuzhou Huang, Ruimao Zhang, Lu Sheng, Yu Qiao, Jing Shao · PDF
  18. Cognitive Planning for Object Goal Navigation using Generative AI Models

    Arjun P S, Andrew Melnik, Gora Chand Nandi · PDF
  19. Collective Wisdom in Language Models: Harnessing LLM-Swarm for Agile Project Management

    Tahmid Hussain, Tashin Ahmed, Md Shahedul Haque, Mohammad Rifat Ahmmad Rashid · PDF
  20. CRAB: Cross-platform agent benchmark for multi-modal embodied language model agents

    Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian, Philip Torr, Bernard Ghanem, Guohao Li · PDF
  21. Cradle: Empowering Foundation Agents towards General Computer Control

    Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Gang Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, YuJie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, Bo An, Shuicheng YAN, Zongqing Lu · PDF
  22. DARD: A Multi-Agent Approach for Task-Oriented Dialog Systems

    Aman Gupta, Anirudh Ravichandran, Ziji Zhang, Swair Shah, Anurag Beniwal, Narayanan Sadagopan · PDF
  23. DepsRAG: Towards Agentic Reasoning and Planning for Software Dependency Management

    Mohannad Alhanahnah, Yazan Boshmaf · PDF
  24. Dissecting Adversarial Robustness of Multimodal LM Agents

    Chen Henry Wu, Rishi Rajesh Shah, Jing Yu Koh, Russ Salakhutdinov, Daniel Fried, Aditi Raghunathan · PDF
  25. Do LLM Personas Dream of Bull Markets? Comparing Human and AI Investment Strategies Through the Lens of the Five-Factor Model

    Harris Borman, Anna Leontjeva, Luiz Pizzato, Max Kun Jiang, Dan Jermyn · PDF
  26. Efficient Reinforcement Learning via Large Language Model-based Search

    Siddhant Bhambri, Amrita Bhattacharjee, huan liu, Subbarao Kambhampati · PDF
  27. Enhancing Data Efficiency in Reinforcement Learning: A Novel Imagination Mechanism Based on Mesh Information Propagation

    Zihang Wang, Maowei Jiang, Pengyu Zeng, Ruiqi Li, Quangao Liu, Peter Búš · PDF
  28. EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms

    Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Dongsheng Li, Deqing Yang · PDF
  29. FEABench: Evaluating Language Models on Real World Physics Reasoning Ability

    Nayantara Mudur, Hao Cui, Subhashini Venugopalan, Paul Raccuglia, Michael Brenner, Peter Christian Norgaard · PDF
  30. Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think

    Massimo Caccia, Megh Thakkar, Léo Boisvert, Thibault Le Sellier de Chezelles, Alexandre Piché, Nicolas Chapados, Alexandre Drouin, Maxime Gasse, Alexandre Lacoste · PDF
  31. First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs

    Ben Norman, Jeff Clune · PDF
  32. FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL

    Woosung Koh, Wonbeen Oh, Siyeol Kim, Suhin Shin, Hyeongjin Kim, Jaein Jang, Junghyun Lee, Se-Young Yun · PDF
  33. FPGA-Gym: An FPGA-Accelerated Reinforcement Learning Environment Simulation Framework

    Jiayi Li, Hongxiao Zhao, Wenshuo Yue, Yihan Fu, Daijing Shi, Anjunyi Fan, Qinghao Wang, Yaodong Yang, Bonan Yan · PDF
  34. From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents

    Nalin Tiwary, Vardhan Dongre, Sanil Arun Chawla, Ashwin Lamani, Dilek Hakkani Tur · PDF
  35. Generalized Open-World Semi-Supervised Object Detection

    Garvita Allabadi, Ana Lucic, Siddarth Aananth, Tiffany Yang, Yu-Xiong Wang, Vikram S. Adve · PDF
  36. GTA: A Benchmark for General Tool Agents

    Jize Wang, Ma Zerun, Yining Li, Songyang Zhang, Cailian Chen, Kai Chen, Xinyi Le · PDF
  37. HSCL-RL: Mitigating Hallucinations in Multimodal Large Language Models

    Zichen Song, Sitan Huang · PDF
  38. Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

    Logan Cross, Violet Xiang, Agam Bhatia, Daniel LK Yamins, Nick Haber · PDF
  39. IDEA: Enhancing the Rule Learning Ability of Language Agent through Induction, Deduction, and Abduction

    Kaiyu He, Mian Zhang, Shuo yan, Peilin Wu, Zhiyu Chen · PDF
  40. IDS-Agent: An LLM Agent for Explainable Intrusion Detection in IoT Networks

    Yanjie Li, Zhen Xiang, Nathaniel D. Bastian, Dawn Song, Bo Li · PDF
  41. Improving Decision-Making in Open-World Agents with Conformal Prediction and Monty Hall

    Harit Vishwakarma, Alan Mishler, Thomas Cook, Niccolo Dalmasso, Natraj Raman, Sumitra Ganesh · PDF
  42. In-Context Imitation Learning via Next-Token Prediction

    Letian Fu, Huang Huang, Gaurav Datta, Lawrence Yunliang Chen, William Chung-Ho Panitch, Fangchen Liu, Hui Li, Ken Goldberg · PDF
  43. Infer Human’s Intentions Before Following Natural Language Instructions

    Yanming Wan, Yue Wu, Yiping Wang, Jiayuan Mao, Natasha Jaques · PDF
  44. Infogent: An Agent-based Framework for Web Information Aggregation

    Revanth Gangi Reddy, Sagnik Mukherjee, Jeonghwan Kim, Zhenhailong Wang, Dilek Hakkani Tur, Heng Ji · PDF
  45. Integrating Visual and Linguistic Instructions for Context-Aware Navigation Agents

    Suhwan Choi, Yongjun Cho, Minchan Kim, Jaeyoon Jung, Myunchul Joe, Park Yu Been, Minseo Kim, Sungwoong Kim, Sungjae Lee, WHISEONG PARK, Jiwan Chung, Youngjae Yu · PDF
  46. Interactive Navigation of Quadruped Robots in Challenging Environments using Large Language Models

    Kangjie Zhou, Yao Mu, Pengying Wu, Han Gao, Chang Liu · PDF
  47. Inverse Attention Agent in Multi-Agent System

    Qian Long, Ruoyan Li, Minglu Zhao, Tao Gao, Demetri Terzopoulos · PDF
  48. Language Models and Symbolic Planners can Infer Action Semantics through Environment Feedback

    Wang Bill Zhu, Ishika Singh, Robin Jia, Jesse Thomason · PDF
  49. Learning Region-Word Alignment with Attentive Masking for Open-Vocabulary Object Detection

    Masoumeh Zareapoor, Pourya Shamsolmoali, Yue Lu · PDF
  50. Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning

    Alicia Li, Nishanth Kumar, Tomás Lozano-Pérez, Leslie Pack Kaelbling · PDF
  51. Lightweight Neural App Control

    Filippos Christianos, Georgios Papoudakis, Thomas Coste, Jianye HAO, Jun Wang, Kun Shao · PDF
  52. LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs

    Volker Strobel, Marco Dorigo, Mario Fritz · PDF
  53. LLM4Drive: A Survey of Large Language Models for Autonomous Driving

    Zhenjie Yang, Xiaosong Jia, Hongyang Li, Junchi Yan · PDF
  54. LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench

    Karthik Valmeekam, Kaya Stechly, Subbarao Kambhampati · PDF
  55. MASAI: Modular Architecture for Software-engineering AI Agents

    Nalin Wadhwa, Atharv Sonwane, Daman Arora, Abhav Mehrotra, Saiteja Utpala, Ramakrishna B Bairi, Aditya Kanade, Nagarajan Natarajan · PDF
  56. MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning

    Somnath Sendhil Kumar, Yash Vinesh Gadhia, Tanuja Ganu, Akshay Nambi · PDF
  57. MobileFlow: A Multimodal LLM For Mobile GUI Agent

    Songqin Nong, Jiali Zhu, Rui Wu, Jiongchao Jin, Shuo Shan, Xiutian Huang, Wenhao Xu · PDF
  58. Multimodal Auto Validation For Self-Refinement in Web Agents

    Ruhana Azam, Tamer Abuelsaad, Aditya Vempaty, Ashish Jagmohan · PDF
  59. OASIS: Open Agents Social Interaction Simulations on One Million Agents

    Ziyi Yang, Zaibin Zhang, Zirui Zheng, Yuxian Jiang, Ziyue Gan, Zhiyu Wang, Zijian Ling, Konisberg, Martz Ma, Bowen Dong, Prateek Gupta, Shuyue Hu, Zhenfei Yin, Guohao Li, Xu Jia, Lijun Wang, Bernard Ghanem, Huchuan Lu, Wanli Ouyang, Yu Qiao, Philip Torr, Jing Shao · PDF
  60. One-shot World Models Using a Transformer Trained on a Synthetic Prior

    Fabio Ferreira, Moreno Schlageter, Raghu Rajan, André Biedenkapp, Frank Hutter · PDF
  61. Planning as Inpainting: A Generative Framework for Realistic Embodied Path Planning

    Cheng-Fu Yang, Haoyang Xu, Te-Lin Wu, Xiaofeng Gao, Kai-Wei Chang, Feng Gao · PDF
  62. Policy optimization to align the validity, coherence and efficiency of reasoning agents in multi-turn dialogues

    Jeremy Curuksu · PDF
  63. Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena

    Jiangjie Chen, Siyu Yuan, Rong Ye, Bodhisattwa Prasad Majumder, Kyle Richardson · PDF
  64. Quality-Diversity Self-Play: Open-Ended Strategy Innovation via Foundation Models

    Aaron Dharna, Cong Lu, Jeff Clune · PDF
  65. RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents

    Tomoyuki Kagaya, Thong Jing Yuan, Yuxuan Lou, Jayashree Karlekar, Sugiri Pranata, Akira Kinose, Koki Oguri, Felix Wick, Yang You · PDF
  66. RAR-Agent: Retrieval Augmented Reflection Learning from Scratch for Reasoning

    Shipeng Xie, HAICHAO ZHU, Da Chen · PDF
  67. RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning and Verification in Long-Horizon Generation

    Zihao Wang, Anji Liu, Haowei Lin, Jiaqi Li, Xiaojian Ma, Yitao Liang · PDF
  68. RefactorBench: Evaluating Stateful Reasoning In Language Agents Through Code

    Dhruv Gautam, Spandan Garg, Jinu Jang, Neel Sundaresan, Roshanak Zilouchian Moghaddam · PDF
  69. REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments

    Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, Insup Lee · PDF
  70. RH20T-P: A Primitive-Level Robotic Manipulation Dataset Towards Composable Generalization Agents in Real-world Scenarios

    Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng · PDF
  71. Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy

    Zhenyu Guan, Xiangyu Kong, Fangwei Zhong, Yizhou Wang · PDF
  72. Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning

    Jianxiong Li, Zhihao Wang, Jinliang Zheng, Xiaoai Zhou, Guanming Wang, Guanglu Song, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Junzhi Yu, Xianyuan Zhan · PDF
  73. Robust Offline Learning via Adversarial World Models

    Uljad Berdica, Kelvin Li, Michael Beukman, Alexander David Goldie, Perla Maiolino, Jakob Nicolaus Foerster · PDF
  74. ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

    Shaofei Cai, Zihao Wang, Kewei Lian, Zhancun Mu, Xiaojian Ma, Anji Liu, Yitao Liang · PDF
  75. Scaling Population-Based Reinforcement Learning with GPU Accelerated Simulation

    Asad Ali Shahid · PDF
  76. SEAL: Suite for Evaluating API-use of LLMs

    Woojeong Kim, Ashish Jagmohan, Aditya Vempaty · PDF
  77. SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals

    Ruihan Yang, Jiangjie Chen, Yikai Zhang, Siyu Yuan, Aili Chen, Kyle Richardson, Yanghua Xiao, Deqing Yang · PDF
  78. Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards

    Lukas Brunke, Yanni Zhang, Ralf Römer, Jack Naimer, Nikola Staykov, SiQi Zhou, Angela P. Schoellig · PDF
  79. ShowUI: One Vision-Language-Action Model for Generalist GUI Agent

    Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou · PDF
  80. Simulating User Agents for Embodied Conversational AI

    Daniel Philipov, Vardhan Dongre, gokhan tur, Dilek Hakkani Tur · PDF
  81. Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning

    Bryan Lincoln Marques de Oliveira, Bruno Brandão, Murilo Lopes da Luz, Luana Guedes Barros Martins, Telma Woerle de Lima Soares, Luckeciano Carvalho Melo · PDF
  82. SPA-BENCH: A Comprehensive Benchmark for Smartphone Agent Evaluation

    Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye HAO, Jun Wang, Kun Shao · PDF
  83. StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows

    Yiran Wu, Tianwei Yue, Shaokun Zhang, Chi Wang, Qingyun Wu · PDF
  84. The Impact of Element Ordering on LM Agent Performance

    Wayne Chi, Ameet Talwalkar, Chris Donahue · PDF
  85. Thermal and Energy Management with Fan Control Through Offline Meta-Reinforcement Learning

    Shao-Yu Yen, Yen Ru Lai, Fu-Chieh Chang, Pei-Yuan Wu · PDF
  86. Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analysis

    Sagar Srinivas Sakhinana, Vijay sri vaikunth, Venkataramana Runkana · PDF
  87. Towards Autonomous Agents: Adaptive-planning, Reasoning, and Acting in Language Models

    Abhishek Dutta, Yen-Che Hsiao · PDF
  88. Towards Humanoid: Value-Driven Agent Modeling Based on Large Language Models

    Xuzheng Chen, Zhangshiyin, Guojie Song · PDF
  89. Towards Principled Representation Learning from Videos for Reinforcement Learning

    Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford · PDF
  90. Towards Robust Estimation of Human Intention Hierarchy in Robot Teleoperation

    Nikki Lijing Kuang, Songpo Li, Soshi Iba · PDF
  91. Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning

    Baraah A. M. Sidahmed, Tatjana Chavdarova · PDF
  92. VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks

    Lawrence Keunho Jang, Yinheng Li, Charles Ding, Justin Lin, Paul Pu Liang, Dan Zhao, Rogerio Bonatti, Kazuhito Koishida · PDF
  93. What Do You Mean by "Open World"?

    Bowen Xu · PDF
  94. Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

    Rogerio Bonatti, Dan Zhao, Dillon Dupont, Sara Abdali, Yinheng Li, Yadong Lu, Justin Wagle, Kazuhito Koishida, Arthur Bucker, Lawrence Keunho Jang, Zheng Hui · PDF
  95. Words as Beacons: Guiding RL Agents with High-Level Language Prompts

    Unai Ruiz-Gonzalez, Alain Andres, Pedro G. Bascoy, Javier Del Ser · PDF
  96. xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing

    Haoyi Niu, Qimao Chen, Tenglong Liu, Jianxiong Li, Guyue Zhou, Yi ZHANG, Jianming HU, Xianyuan Zhan · PDF
  97. Zero-shot Whole-Body Humanoid Control via Behavioral Foundation Models

    Andrea Tirinzoni, Ahmed Touati, Jesse Farebrother, Mateusz Guzek, Anssi Kanervisto, Yingchen Xu, Alessandro Lazaric, Matteo Pirotta · PDF