NeurIPS 2024 Workshop on Open-World Agents
- Submission deadline: Sep 21, 2024, 00:01 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).
Accepted papers (97)
Fetched from OpenReview (v2) on 2026-06-10.
- 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
- A Simplified A Priori Theory Of Meaning, –Nature based AI ‘first principles’–
- Advancing Agentic Systems: Dynamic Task Decomposition, Tool Integration and Evaluation using Novel Metrics and Dataset
- Agent S: An Open Agentic Framework that Uses Computers Like a Human
- Agent-E: From Autonomous Web Navigation to Foundational Design Principles in Agentic Systems
- Agentic Anomaly Detection for Shipping
- Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
- AgentStudio: A Toolkit for Building General Virtual Agents
- An Efficient Open World Benchmark for Multi-Agent Reinforcement Learning
- Are Expressive Models Truly Necessary for Offline RL?
- Articulated Animal AI: An Environment for Animal-like Cognition in a Limbed Agent
- Automated Design of Agentic Systems
- Automating Thought of Search: A Journey Towards Soundness and Completeness
- Can VLMs Play Action Role-Playing Games? Take Black Myth Wukong as a Study Case
- CARD: Cross-modal Agent Framework for Generative and Editable Residential Design
- Chain-of-Imagination for Reliable Instruction Following in Decision Making
- Cognitive Planning for Object Goal Navigation using Generative AI Models
- Collective Wisdom in Language Models: Harnessing LLM-Swarm for Agile Project Management
- CRAB: Cross-platform agent benchmark for multi-modal embodied language model agents
- Cradle: Empowering Foundation Agents towards General Computer Control
- DARD: A Multi-Agent Approach for Task-Oriented Dialog Systems
- DepsRAG: Towards Agentic Reasoning and Planning for Software Dependency Management
- Dissecting Adversarial Robustness of Multimodal LM Agents
- Do LLM Personas Dream of Bull Markets? Comparing Human and AI Investment Strategies Through the Lens of the Five-Factor Model
- Efficient Reinforcement Learning via Large Language Model-based Search
- ENHANCING DATA EFFICIENCY IN REINFORCEMENT LEARNING: A NOVEL IMAGINATION MECHANISM BASED ON MESH INFORMATION PROPAGATION
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms
- FEABench: Evaluating Language Models on Real World Physics Reasoning Ability
- Fine-Tuning Web Agents: It Works, But It's Trickier Than You Think
- First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs
- FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL
- FPGA-Gym: An FPGA-Accelerated Reinforcement Learning Environment Simulation Framework
- From Context to Action: Analysis of the Impact of State Representation and Context on the Generalization of Multi-Turn Web Navigation Agents
- Generalized Open-World Semi-Supervised Object Detection
- GTA: A Benchmark for General Tool Agents
- HSCL-RL: Mitigating Hallucinations in Multimodal Large Language Models
- Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models
- IDEA: Enhancing the Rule Learning Ability of Language Agent through Induction, Deduction, and Abduction
- IDS-Agent: An LLM Agent for Explainable Intrusion Detection in IoT Networks
- Improving Decision-Making in Open-World Agents with Conformal Prediction and Monty Hall
- In-Context Imitation Learning via Next-Token Prediction
- Infer Human’s Intentions Before Following Natural Language Instructions
- Infogent: An Agent-based Framework for Web Information Aggregation
- Integrating Visual and Linguistic Instructions for Context-Aware Navigation Agents
- Interactive Navigation of Quadruped Robots in Challenging Environments using Large Language Models
- Inverse Attention Agent in Multi-Agent System
- Language Models and Symbolic Planners can Infer Action Semantics through Environment Feedback
- Learning Region-Word Alignment with Attentive Masking for Open-Vocabulary Object Detection
- Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning
- Lightweight Neural App Control
- LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs
- LLM4Drive: A Survey of Large Language Models for Autonomous Driving
- LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench
- MASAI: Modular Architecture for Software-engineering AI Agents
- MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning
- MobileFlow: A Multimodal LLM For Mobile GUI Agent
- Multimodal Auto Validation For Self-Refinement in Web Agents
- OASIS: Open Agents Social Interaction Simulations on One Million Agents
- One-shot World Models Using a Transformer Trained on a Synthetic Prior
- Planning as Inpainting: A Generative Framework for Realistic Embodied Path Planning
- Policy optimization to align the validity, coherence and efficiency of reasoning agents in multi-turn dialogues
- Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena
- Quality-Diversity Self-Play: Open-Ended Strategy Innovation via Foundation Models
- RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents
- RAR-Agent: Retrieval Augmented Reflection Learning from Scratch for Reasoning
- RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning and Verification in Long-Horizon Generation
- RefactorBench: Evaluating Stateful Reasoning In Language Agents Through Code
- REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
- RH20T-P: A Primitive-Level Robotic Manipulation Dataset Towards Composable Generalization Agents in Real-world Scenarios
- Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
- Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning
- Robust Offline Learning via Adversarial World Models
- ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting
- Scaling Population-Based Reinforcement Learning with GPU Accelerated Simulation
- SEAL: Suite for Evaluating API-use of LLMs
- SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals
- Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards
- ShowUI: One Vision-Language-Action Model for Generalist GUI Agent
- Simulating User Agents for Embodied Conversational AI
- Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
- SPA-BENCH: A COMPREHENSIVE BENCHMARK FOR SMARTPHONE AGENT EVALUATION
- StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
- The Impact of Element Ordering on LM Agent Performance
- Thermal and Energy Management with Fan Control Through Offline Meta-Reinforcement Learning
- Towards Automated Patent Workflows: AI-Orchestrated Multi-Agent Framework for Intellectual Property Management and Analysis
- Towards Autonomous Agents: Adaptive-planning, Reasoning, and Acting in Language Models
- Towards Humanoid: Value-Driven Agent Modeling Based on Large Language Models
- Towards Principled Representation Learning from Videos for Reinforcement Learning
- Towards Robust Estimation of Human Intention Hierarchy in Robot Teleoperation
- Variational Inequality Perspective and Optimizers for Multi-Agent Reinforcement Learning
- VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
- What Do You Mean by "Open World"?
- Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
- Words as Beacons: Guiding RL Agents with High-Level Language Prompts
- xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing
- Zero-shot Whole-Body Humanoid Control via Behavioral Foundation Models