COLM 2025 | Past | Large language models | Fairness & ethics
Workshop on Socially Responsible Language Modelling Research
COLM 2025 Workshop SoLaR
- Submission deadline: Jun 28, 2025, 11:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-11; please verify and enrich (topics are keyword-guessed).
Accepted papers (25)
Fetched from OpenReview (v2) on 2026-06-11.
- A Generative Approach to LLM Harmfulness Mitigation with Red Flag Tokens
- A Study of Large Language Models for Extraction of Themes from Homeless Shelter Case Notes
- Accidental Vulnerability: Factors in Fine-Tuning that Shift Model Safeguards
- CONECUT: Scalable Removal of Preference Redundancy
- CourtReasoner: Can LLM Agents Reason Like Judges?
- Detecting Biased Language in Icelandic: A Named Entity Recognition Approach for Socially Responsible Text Analysis
- IMPersona: Evaluating Individual Level LM Impersonation
- Investigating Model Editing for Unlearning in Large Language Models
- Large Language Models in the Task of Automatic Validation of Text Classifier Predictions
- LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language
- LLMs on Trial: Evaluating Judicial Fairness for Large Language Models
- MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment
- MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering
- Multi-Turn Jailbreaks Are Simpler Than They Seem
- Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
- Poor Alignment and Steerability of Large Language Models: Evidence Using 30,000 College Admissions Essays
- Practical Evaluation of Machine Learning Efficiency Requires Model Life Cycle Assessment
- Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases
- Prompt Attacks Reveal Superficial Knowledge Removal in Unlearning Methods
- Red Teaming Vision Language Models Under Change
- Sarc7: Evaluating Sarcasm Detection and Generation with Seven Types and Emotion-Informed Techniques
- The Alignment Game: The Inevitable Conflict of Values in Generative Models
- Towards Attuned AI: Integrating Care Ethics in Large Language Model Development and Alignment
- TRUTH: Teaching LLMs to Rerank for Truth in Misinformation Detection
- When Do Language Models Endorse Limitations on Universal Human Rights Principles?