COLM 2025 Past Other
The First Workshop on the Interplay of Model Behavior and Model Internals
INTERPLAY
- Submission deadline: Jul 11, 2025, 07:55 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-11; please verify and enrich (topics are keyword-guessed).
Accepted papers (22)
Fetched from OpenReview (v2) on 2026-06-11.
- Analyzing Representational Shifts in Multimodal Models: A Study of Feature Dynamics in Gemma and PaliGemma
- Angular Steering: Behavior Control via Rotation in Activation Space
- Attributing Response to Context: A Jensen–Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation
- BERTology in the Modern World
- Causal Interventions Reveal Shared Structure Across English Filler–Gap Constructions
- Comparing Prompt and Representation Engineering for Personality Control in Language Models: A Case Study
- Death by a Thousand Directions: Exploring the Geometry of Harmfulness in LLMs through Subconcept Probing
- Emotions Where Art Thou: Understanding and Characterizing the Emotional Latent Space of Large Language Models
- Evaluating Contrast Localizer for Identifying Causal Units in Social & Mathematical Tasks in Language Models
- From Indirect Object Identification to Syllogisms: Exploring Binary Mechanisms in Transformer Circuits
- How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
- Interpreting the Latent Structure of Operator Precedence in Language Models
- LLM Microscope: What Model Internals Reveal About Answer Correctness and Context Utilization
- Localizing Persona Representations in LLMs
- On the Geometry of Semantics in Next-token Prediction
- One-shot Optimized Steering Vectors Mediate Safety-relevant Behaviors in LLMs
- Predicting Success of Model Editing via Intrinsic Features
- Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
- Safety Subspaces are Not Distinct: A Fine-Tuning Case Study
- Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs
- Understanding In-context Learning of Addition via Activation Subspaces
- Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact