CVPR 2026 Past Other

Third Workshop on Visual Concepts

VisCon 2026

Submission deadline
Apr 2, 2026, 07:59 UTC
Imported from OpenReview; check the workshop website for deadline extensions.
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).

Accepted papers (27)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Taxonomy-Aware Evaluation for Open-Vocabulary Wildlife Detection

    Wenqi Xue, Pengxi Zhang, William Wang, Yijia Cai, Jieyu Zhang · PDF
  2. CGEBench: Benchmarking Concept Generalization of Promptable Image Segmentation Models

    Alexander von Recum, Christoph Schnabl · PDF
  3. ConceptOT: Fine-Grained Vision-Language Alignment via Low-Rank Unbalanced Optimal Transport

    Pawan Kumar · PDF
  4. CTRL-STEER: Closed-Loop Neuron Activation Control in Vision-Language-Action Models

    Abhijith Babu, Ramneet Kaur, Nathaniel D. Bastian, Olivera Kotevska, Susmit Jha, Yanzhao Wu, Sumit Kumar Jha, Anirban Roy · PDF
  5. Dissecting Representation Structure in Vision Transformers: A Rigorous Architectural Study

    Kim-Cuc Nguyen, Ngai-Man Cheung · PDF
  6. Do VLMs Reason About Faces? Probing the Perception-Reasoning Gap in Identity Judgment

    Mahsa Khoshnoodi, Sarah Adel Bargal · PDF
  7. Entropy-based Patchification Creates Semantic Tokens

    Suhao Yu, Jingjia Peng, Yao Tang, Jiatao Gu · PDF
  8. Forecasting Animal Motion in the Wild

    Neerja Thakkar, Shiry Ginosar, Jacob C Walker, Jitendra Malik, Joao Carreira, Carl Doersch · PDF
  9. From Comparison to Composition: Towards Understanding Machine Cognition of Unseen Categories

    Minghao Fu, Sheng Zhang, Guangyi Chen, Zijian Li, Fan Feng, Yifan Shen, Shaoan Xie, Heng Huang, Kun Zhang · PDF
  10. Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles

    Zacharie Bugaud · PDF
  11. Improved Vision-Language Alignment via Text-Conditioned Image Embeddings using Sparse Autoencoders

    Sweta Mahajan, Sukrut Rao, Jiahao Xie, Alexander Koller, Bernt Schiele · PDF
  12. INSID3: Training-Free In-Context Segmentation with DINOv3

    Claudia Cuttano, Gabriele Trivigno, Christoph Reich, Daniel Cremers, Carlo Masone, Stefan Roth · PDF
  13. LandCIS: Hierarchical Semantic Anchoring for Concept-Centric Continual Segmentation

    Yuyin Ma, Yijian Wu, Wang Xinyu, Yijun Lu, Zhen Tian, Ming YAN, Yunni Xia · PDF
  14. Learning Sparse Visual Representations via Spatial-Semantic Factorization

    Theodore Zhengde Zhao, Sid Kiblawi, Jianwei Yang, Naoto Usuyama, Reuben Tan, Noel C Codella, Tristan Naumann, Hoifung Poon, Mu Wei · PDF
  15. MCSBench: Probing Multimodal Conceptual Structure of Multimodal LLMs

    Sheng Zhang, Minghao Fu, Kelin Yu, Tong Zheng, Guangyi Chen, Hong Jiao, Salman Khan, Zhiqiang Shen, Heng Huang · PDF
  16. Most of This Video Is Boring

    Anya Singh, Jiahang He, Varun Nair, Jai Relan, Vidyut Baradwaj, Cabrel Happi · PDF
  17. Multi-hop Relational Contrastive Learning: Extending Spatial Contrastive Pre-training Beyond Pairwise Relations

    Sheikh Tanvir Ahmed, Md. Tanvir Raihan · PDF
  18. Seeing Only What Exists: Visibility-Aware Contrastive Learning for Concept-Level Hallucination in Vision–Language Models

    Hikaru Shijo, Yutaka Yoshihama, Yasunori Ishii, Takayoshi Yamashita · PDF
  19. Self-Consistency for LLM-Based Motion Trajectory Generation and Verification

    Jiaju Ma, R. Kenny Jones, Jiajun Wu, Maneesh Agrawala · PDF
  20. Semantic Concept Conditioning for State Space Image Super-Resolution

    Andrii Ahitoliev, Bohdan Milian, Oleh Shtohryn, Anna-Alina Bondarets, Alina Labaz, Taras Rumezhak, Volodymyr Karpiv · PDF
  21. SPOT: Structured Prompting with Object-centric Tokens for open-world scene graphs

    Mengqi Zhang, Sahil Khose, Fiona Ryan, Judy Hoffman · PDF
  22. Test-Time Visual Concept Anchoring via Entropic Optimal Transport

    Pawan Kumar · PDF
  23. Toward Compact and Structured Visual Representations in VLMs: SSM-Based Vision Encoders as an Alternative to Transformers

    Shang-Jui Ray Kuo, Paola Cascante-Bonilla · PDF
  24. Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models

    Hayeon Kim, Ji Ha Jang, Junghun James Kim, Se Young Chun · PDF
  25. VCode: A Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

    Kevin Qinghong Lin, Yuhao Zheng, Hangyu Ran, Dongxing Mao, Linjie Li, Philip Torr, Alex Jinpeng Wang · PDF
  26. VisAnalog: A Diagnostic Suite for Visual Concept Transfer on Natural Images

    Zhaonan Li, Kyle R. Chickering, Bangzheng Li, Jacob Dineen, Xiao Ye, Zhikun Xu, Shijie Lu, Yuxi Huang, Ming Shen, Bach Nguyen, Jaya Adithya Pavuluri, Mau Son Nguyen, Sanika Chavan, Ngoc Minh Thu Le, Muhao Chen, Ben Zhou · PDF
  27. WristCompass: Kinematic Coupling as a Learnable Visual Concept for Ego-Camera Orientation

    Varun Nair, Vidyut Baradwaj, Jiahang He, Anya Singh, Jai Relan, Cabrel Happi · PDF