CVPR 2026 · Past · Computer vision · Multimodal · Education
Computer Vision × Education: Building a Cross-Community Agenda for Multimodal Vision in Classrooms
CV4Edu
- Submission deadline
- TBA
- Submission portal
- OpenReview
- Notes
- Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).
Accepted papers (18)
Fetched from OpenReview (v2) on 2026-06-10.
- [UNI]101: An Educational Dataset for Introductory Computer Vision
- AI-Assisted Competency Assessment from Egocentric Video in Simulation-Based Nursing Education
- ConfusionBench: An Expert-Validated Benchmark for Confusion Recognition and Localization in Educational Videos
- Context Matters: Peer-Aware Student Behavioral Engagement Measurement via VLM Action Parsing and LLM Sequence Classification
- Cross-modal Affinity-aligned Multimodal Learning Analytics for Predicting Student Collaboration Satisfaction in Game-Based Learning
- Delta-Gated Incremental Multi-Forward-Pass Modeling for Robust Multimodal Classroom Video Understanding
- Diagnosis of Human–Object Interaction Detectors for Real-World Educational Applications
- Do Emotion Recognition Models Generalize to Classrooms? Robustness and Fairness Analysis
- Evaluating Web-trained Facial Expression Recognition in Naturalistic Collaborative Learning
- From Emotion Recognition to Mind-Wandering Detection: A Comparative Analysis of Video-Based Emotion Foundation Models
- InterventionLens: A Multi-Agent Framework for Detecting ASD Intervention Strategies in Parent-Child Shared Reading
- MES-Bench: A Benchmark for Multimodal Elaborative Simplification and Comprehensibility Evaluation in Language Learning
- Negative Evidence in the Classroom: Learning From What Vision Cannot Reliably See
- ReSoFed: Reliability-Guided Model Souping for Robust Federated Learning in Heterogeneous Classroom Environments
- Scaffolding Human Learning by Shaping Visual Environment
- Sequence-Based Identification of First-Person Camera Wearers in Third-Person Views
- Speech-Synchronized Whiteboard Generation via VLM-Driven Structured Drawing Representations
- VLMath: A Multimodal Vision-Language System for Pedagogically Aligned Math Tutoring