CVPR 2026 Video LLMs Workshop
VidLLMs 2026
- Submission deadline
- TBA
- Submission portal
- OpenReview
- Notes
- Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).
Accepted papers (18)
Fetched from OpenReview (v2) on 2026-06-10.
- 4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
- CausalScene: Typed Causal Scene Graphs for Counterfactual Physical Reasoning with a Path to Video LLMs
- CoSeLECT: Adaptive Frame Selection for Video-Language Understanding
- Evaluating Video Question Answering Multimodal Large Language Models
- FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding
- Grounding Video Reasoning in Physical Signals
- Hidden Clones: Exposing and Fixing Family Bias in Vision-Language Model Ensembles
- MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks
- Mind the Gap: Dataset and Fine-grained Evaluation for Inline Audio Descriptions
- One Identity, Many Roles: Multimodal Entity Coreference for Enhanced Video Situation Recognition
- StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
- StreamReady: Learning *What* to Answer and *When* in Long Streaming Videos
- Test-Time Horizon Scaling in Video LLMs via Adaptive Temporal Memory Compression
- TimeBlind: A Spatio-Temporal Compositionality Benchmark for Video LLMs
- VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models
- VideoCritic: Diagnosing and Localizing Reasoning Errors in Video-Language Models
- VideoNet: A Large-Scale Dataset for Domain-Specific Action Recognition
- VisCoP: Visual Probing for Video Domain Adaptation of Vision Language Models