
The 2nd Workshop on Test-time Scaling for Computer Vision

2nd ViSCALE @ CVPR 2026

Submission deadline
TBA
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich; topics are keyword-guessed.

Accepted papers (18)

Fetched from OpenReview (v2) on 2026-06-10.

  1. [EXTENDED ABSTRACT] Vero: An Open RL Recipe for General Visual Reasoning

    Gabriel Herbert Sarch, Linrong Cai, Qunzhong Wang, Haoyang Wu, Danqi Chen, Zhuang Liu · PDF
  2. ATHENA: Adaptive Test-Time Steering for Improving Count Fidelity in Diffusion Models

    Mohammad Shahab Sepehri, Asal Mehradfar, Berk Tinaz, Salman Avestimehr, Mahdi Soltanolkotabi · PDF
  3. Attention Budget Scheduling: Token-Level Test-Time Scaling for Vision Transformers

    Mahule Roy, Subhas Roy · PDF
  4. EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents

    Zhili Cheng, Ran Li, Jinyi Hu, Yuge Tu, Shiqi Dai, Shengding Hu, Yang Shi, Lei Shi, Maosong Sun · PDF
  5. [EXTENDED ABSTRACT] World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models

    Shouwei Ruan, Bin Wang, Zhenyu Wu, Qihui Zhu, Yuxiang Zhang, Hang Su, Yubin Wang · PDF
  6. [EXTENDED ABSTRACT] Learning to Think Fast and Slow for Visual Language Models

    Chenyu Lin, Cheng Chi, Jinlin Wu, Sharon Li, Kaiyang Zhou · PDF
  7. [EXTENDED ABSTRACT] Scaling Test-Time Compute via Semantic Critique and Spectral Alignment for Visual Media Generation

    Jia Xian Huang · PDF
  8. IMA & TMA: Efficient Test-Time Adaptation for VLMs via Linear Transformation in Embedding Space

    Rishik Vamshi Rohith Vempati, Eswar Venkata Sai Kadava, Konda Reddy Mopuri · PDF
  9. Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

    Yining Hong, Huang Huang, Manling Li, Li Fei-Fei, Jiajun Wu, Yejin Choi · PDF
  10. MetaWorld: Skill Transfer and Composition in a Hierarchical World Model for Grounding High-Level Instructions

    Yutong Shen, Hangxu Liu, Kailin Pei, Yinqi Liu, Ruizhe Xia, Tongtong Feng · PDF
  11. Mind over Space: Can Multimodal Large Language Models Mentally Navigate?

    Qihui Zhu, Shouwei Ruan, Xiao Yang, Hao Jiang, Yao Huang, Shiji Zhao, Hanwei Fan, Hang Su, Xingxing Wei · PDF
  12. Predictive Spectral Calibration for Source-Free Test-Time Regression

    Tuan Kiet Nguyen Viet, Thanh Trung Huynh, Hieu Pham · PDF
  13. ProFuse: Efficient Open-Vocabulary 3D Gaussian Splatting with Early-Saturating Semantic Uplifting

    Yen-Jen Chiou · PDF
  14. Rethinking Dense Optical Flow without Test-Time Scaling

    Praroop Chanda, Suryansh Kumar · PDF
  15. SA-TTS: Stress-Aware Test-Time Scaling for Vision Models

    Youla Yang · PDF
  16. ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

    Jiawei Gu, Yunzhuo Hao, Huichen Will Wang, Linjie Li, Michael Qizhe Shieh, Yejin Choi, Ranjay Krishna, Yu Cheng · PDF
  17. TreeReasoner: Reinforcing Tool-Augmented Tree-of-Videos Reasoning

    Hongcheng Gao, Jingyi Tang, Zihao Huang, Liang Li, Li Su, Qingming Huang · PDF
  18. Understanding the Limits of Vision Test-Time Scaling: Path Redundancy, Instance Difficulty, and Adaptive Compute

    Youla Yang · PDF