ICLR 2026 · Past · Math & reasoning · Efficiency

The First Workshop on Efficient Spatial Reasoning

ES-Reasoning @ ICLR 2026

Submission deadline: Feb 13, 2026, 11:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich (topics are keyword-guessed).

Accepted papers (48)

Fetched from OpenReview (v2) on 2026-06-10.

  1. An Analysis of Reasoning Length Scaling and Positional Effects in Vision Language Models for Spatial Reasoning

    Hakan Muluk · PDF
  2. Anytime Safe PAC Efficient Reasoning

    Chengyao Yu, Hao Zeng, Youxin Zhu, Jianguo Huang, Huajun Zeng, Bingyi Jing · PDF
  3. Bio-Inspired Spatial Reasoning Transformer: Grid Cells, Place Cells, and Attractor Dynamics for Text-Based Spatial Understanding

    Hyunjun Kim · PDF
  4. CivicEmbed: Feature-specific embeddings for efficient geographic reasoning and retrieval

    Josephine Wang, Julien Coquet, Jeffrey Huang · PDF
  5. Demystifying Action Space Design for Robotic Manipulation Policies

    Yuchun Feng, Jinliang Zheng, Zhihao Wang, Dongxiu Liu, Jianxiong Li, Jiangmiao Pang, Tai Wang, Xianyuan Zhan · PDF
  6. DREAM-R: Multimodal Speculative Reasoning with RL-Based Refined Drafting, Precise Verification, and Fully Parallel Execution

    Yunhai Hu, Zining Liu, Xiangyang Yin, Tianhua Xia, BO BAO, Eric Sather, Vithursan Thangarasa, Sai Qian Zhang · PDF
  7. EarthSpatialBench: Benchmarking Spatial Reasoning Capabilities of Multimodal LLMs on Earth Imagery

    Zelin Xu, Yupu Zhang, Saugat Adhikari, Saiful Islam, Tingsong Xiao, Zibo Liu, Shigang Chen, Da Yan, Zhe Jiang · PDF
  8. Efficient Dense Features With BRIXEL

    Alexander Lappe, Martin A. Giese · PDF
  9. Embedding Morphology into Transformers for Cross-Robot Policy Learning

    Kei Suzuki, Jing Liu, Ye Wang, Chiori Hori, Matthew Brand, Diego Romeres, Toshiaki Koike-Akino · PDF
  10. ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

    Qineng Wang, Wenlong Huang, Yu Zhou, Hang Yin, Tianwei Bao, Jianwen Lyu, Weiyu Liu, Ruohan Zhang, Jiajun Wu, Li Fei-Fei, Manling Li · PDF
  11. Enhancing Aerial Vision-Language Navigation with Map Grounding and History Awareness

    Hakob Tamazyan, Narek Nurijanyan, Boris Martirosyan, Hrant Khachatrian · PDF
  12. Evaluating VLMs' Spatial Reasoning Over Robot Motion: A Step Towards Robot Planning with Motion Preferences

    Wenxi Wu, Jingjing Zhang, Martim Brandao · PDF
  13. Explicit 3D Spatial Reasoning via Program Generation

    Zhanpeng Luo, Ce Zhang, Silong Yong, Cunxi Dai, Qianwei Wang, Haoxi Ran, Guanya Shi, Katia P. Sycara, Yaqi Xie · PDF
  14. FlashDrive: Flash Vision-Language-Action Inference for Autonomous Driving

    Zekai Li, Yihao Liang, Hongfei Zhang, Jian Chen, Zhijian Liu · PDF
  15. From Steering to Pedalling: Do Autonomous Driving VLMs Generalize to Cyclist-Assistive Spatial Perception and Planning?

    Krishna Kanth Nakka, Vedasri Nakka · PDF
  16. FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning

    Haozheng Luo, Zhuolin Jiang, Md Zahid Hasan, Yan Chen, Soumalya Sarkar · PDF
  17. Geometry-aware 4D Video Generation for Robot Manipulation

    Zeyi Liu, Shuang Li, Eric Cousineau, Siyuan Feng, Benjamin Burchfiel, Shuran Song · PDF
  18. GRAID: Enhancing Spatial Reasoning of VLMs through High-Fidelity Data Generation

    Karim Elmaaroufi, Liheng Lai, Justin Svegliato, Yutong Bai, Sanjit A. Seshia, Matei Zaharia · PDF
  19. HiResNets: Native Full-HD Video Recognition with Foveal Residual Streams

    Shivani Mall, Swarnim Jain, Joao F. Henriques · PDF
  20. Improving GUI Grounding with Explicit Position-to-Coordinate Mapping

    Suyuchen Wang, Tianyu Zhang, Ahmed Masry, Christopher Pal, Spandana Gella, Bang Liu, Perouz Taslakian · PDF
  21. LEO-VL: Efficient Scene Representation for Scalable 3D Vision-Language Learning

    Jiangyong Huang, Xiaojian Ma, Xiongkun Linghu, Junchao He, Qing Li, Song-Chun Zhu, Yixin Chen, Baoxiong Jia, Siyuan Huang · PDF
  22. LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning

    Miho Koda, Yu Zheng, Ruixian Ma, Mingyang Sun, Devesh Pansare, Fabio Duarte, Paolo Santi · PDF
  23. MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse

    Zhenyu Pan, Han Liu · PDF
  24. Multimodal Language Models Cannot Spot Spatial Inconsistencies

    Om Khangaonkar, Hadi J. Rad, Hamed Pirsiavash · PDF
  25. Omni-View: Unlocking How Generation Facilitates Understanding in Unified 3D Model based on Multiview images

    JiaKui Hu, Shanshan Zhao, Qing-Guo Chen, Xuerui Qiu, Jialun Liu, Zhao Xu, Weihua Luo, Kaifu Zhang, Yanye Lu · PDF
  26. On the Provable Performance Guarantee of Efficient Reasoning Models

    Hao Zeng, Jianguo Huang, Bingyi Jing, Hongxin Wei, Bo An · PDF
  27. Orion: A Fully Deterministic and Interpretable Pipeline for Video Scene Graph Generation with Explicit Causal Influence Scoring

    Riddhiman Rana, Aryav Semwal, Yogesh Atluru, Shivank Garg, Cristian Meo, Kevin Zhu · PDF
  28. PhyRPR: Training-Free Physics-Constrained Video Generation

    Yibo Zhao, Hengjia Li, Xiaofei He, Boxi Wu · PDF
  29. PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

    Sinin Zhang, Yunfei Xie, Yuxuan Cheng, Haoyu Zhang, Tong Zhang · PDF
  30. PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation

    Wenlong Huang, Yu-Wei Chao, Arsalan Mousavian, Ming-Yu Liu, Dieter Fox, Kaichun Mo, Li Fei-Fei · PDF
  31. Probing Perceptual Constancy in Large Vision-Language Models

    Haoran Sun, Bingyang Wang, Suyang Yu, Yijiang Li, Qingying Gao, Haiyun Lyu, Lianyu Huang, Zelong Hong, Jiahui Ge, Qianli Ma, Hang He, Yifan Zhou, Lingzi Guo, Lantao Mei, Maijunxian Wang, Dezhi Luo, Hokin Deng · PDF
  32. Probing Visual Planning in Image Editing Models

    Zhimu Zhou, Yanpeng Zhao, Qiuyu Liao, Bo Zhao, Xiaojian Ma · PDF
  33. Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision–Language–Action Models via Latent Iterative Reasoning

    Yalcin Tur, Jalal Naghiyev, Haoquan Fang, Wei-Chuan Tsai, Jiafei Duan, Dieter Fox, Ranjay Krishna · PDF
  34. REMAP: Evaluating Geometric Dual Representations in Multi-view Spatial Reasoning

    Selina Cheng, Anne Wu, Eunice Yiu, Yoav Artzi · PDF
  35. ReSpace: Text-Driven Autoregressive 3D Indoor Scene Synthesis and Editing

    Martin JJ. Bucher, Iro Armeni · PDF
  36. RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

    Le Wang, Zonghao Ying, Xiao yang, Quanchen Zou, Zhenfei Yin, Tianlin Li, Jian Yang, Yaodong Yang, Lu Sheng, Aishan Liu, Xianglong Liu · PDF
  37. SCOPE: Spatially-Constrained Parametric Editing for Text-Guided CAD Models

    Md Zahid Hasan, Soumalya Sarkar · PDF
  38. Seeing Once is Enough? Online Geometry-Aware Token Pruning for 3D Question Answering

    Ruei-Chi Lai, Bolivar Enrique Solarte, Chin-Hsuan Wu, Yi-Hsuan Tsai, Min Sun · PDF
  39. Solving Spatial Supersensing Without Spatial Supersensing

    Vishaal Udandarao, Shyamgopal Karthik, Surabhi S Nath, Andreas Hochlehnert, Matthias Bethge, Ameya Prabhu · PDF
  40. Spatial Competence Benchmark

    Jash Vira, Ashley Harris · PDF
  41. SpatialTree: How Spatial Abilities Branch Out in MLLMs

    Yuxi Xiao, Longfei Li, Shen Yan, Xinhang Liu, Sida Peng, Yunchao Wei, Xiaowei Zhou, Bingyi Kang · PDF
  42. Structural Graph Probing of Vision–Language Models

    Haoyu He, Yue Zhuo, Yu Zheng, Qi R. Wang · PDF
  43. SVQA-R1: Reinforcing Spatial Reasoning in MLLMs via View-Consistent Reward Optimization

    Peiyao Wang, Haibin Ling · PDF
  44. The Dual Mechanisms of Spatial Reasoning in Vision–Language Models

    Kelly Cui, Nikhil Prakash, Ayush Raina, David Bau, Antonio Torralba, Tamar Rott Shaham · PDF
  45. TIDES: Test-time Inference Drift Exploitation via Scaling

    Haoran Dai, Haozheng Luo, Haotian Zhang, Meng lin, Yan Chen, Binghui Wang · PDF
  46. VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

    Zirui Wang, Junyi Zhang, Jiaxin Ge, Long Lian, Letian Fu, Lisa Dunlap, Ken Goldberg, XuDong Wang, Ion Stoica, David M. Chan, Sewon Min, Joseph E. Gonzalez · PDF
  47. VisualThinker: First ever R1-Zero's Aha Moment on just a 2B non-SFT Model

    Hengguang Zhou, Xirui Li, Ruochen Wang, Minhao Cheng, Tianyi Zhou, Cho-Jui Hsieh · PDF
  48. ViTaB-A: Evaluating Multimodal Large Language Models on Visual Table Attribution

    Yahia Alqurnawi, Preetom Biswas, Anmol Rao, Tejas Anvekar, Chitta Baral, Vivek Gupta · PDF