ICML 2026 Workshop on Weight-Space Symmetries: from Foundations to Practical Applications
ICML 2026 Workshop WSS
- Submission deadline: May 8, 2026, 23:59 AoE (UTC−12). Imported from OpenReview; check the website for extensions.
- Submission portal: OpenReview
- Notes: Auto-imported from the OpenReview venue record on 2026-06-10. Please verify and enrich (topics are keyword-guessed).
Accepted papers (51)
Fetched from OpenReview (v2) on 2026-06-10.
- A Geometric View of Model Merging: Quotient Fréchet Averages from Toy Models to LoRA
- Access Sets Matter: Budgeting Expert Reads for Scalable Weight-Space Model Merging
- Analyzing Stream Collapse in Hyper-Connections: From Diagnosis to Mitigation
- Are we Merging the Right Models? Impact of Expert Training Duration on Model Merging for LLMs
- Attention Weight Decomposition for Vision Model Compression
- Auditing Neural Thickets with Low-Rank Routes
- Beyond Pairwise: Diagnosing Higher-Order Merge Failures via Hodge Decomposition
- Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability
- Block-Level Weight-Space Structure Persists Under Post-Training: An Empirical Study Across LLM Families
- Breaking Random-Init Symmetry: Theory-Informed Initialization for ReLU Networks
- Debugging ReBasin: What Limits Symmetry-Based Model Merging?
- Diagonalizing the Softmax: Hadamard Initialization for Tractable Cross-Entropy Dynamics
- Different Layers, Different Manifolds: Module-Wise Weight-Space Geometry in Transformer Optimization
- DotResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging
- Endpoint Symmetry for Edge Updates: Weight-Space Redundancy in GNNs on Undirected Graphs
- Flow Equivariant Transformers
- Generic Fibers and Functional Dimension of Multi-Head Attention
- Hierarchical Mixture-of-Experts with Two-Stage Optimization
- How Deep Are Deep GPs, Really? A Sharp Threshold and a Non-Gaussian Limit for Compositional GPs
- How the Optimizer Shapes Learned Solutions in Equivariant Neural Networks
- Iterative Magnitude Pruning Reduces Weight-Space Coupling
- Low-Rank Networks Recover Weight and Functional Symmetry Better
- LS-Merge: Merging Language Models in Latent Space
- Meta-Merging by Checkpoint Nowcasting
- Model Merging by Output-Space Projection
- Model Merging via Averaged Representational Similarity
- MoRE: Mixture of Reused Experts
- No Global Gauge in Neural Weight Space: Branched Quotient Geometry and Atlas-Optimal Learning
- Objective-Specific Privileged Bases via Full-Prefix Matryoshka Learning
- On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors
- Parameter symmetries determine representational geometry in overparameterized nonlinear networks
- Pre-Normalization Momentum Governs Optimizer-Induced Rank Bias
- Quantifying Symmetries: How Optimisers Impact the Functional Dimension
- Rethinking the Role of Tensor Decompositions in Post-Training LLM Compression
- Rotation Symmetry in Vision Quantization: The Objective Function is the Bottleneck
- Scale-Equivariant Alignment: Closing the Residual Barrier After Permutation Matching
- Scale-Invariant Empirical-Bayes Laplace Approximation for ReLU Networks
- Sharpness-Aware Minimization Directly on the Boolean Hypercube
- Shortcuts in the Tail: Debiasing via Post-Hoc Spectral Compression of Fine-Tuning Updates
- SIB: Reparameterization of LLMs for Better Learning-Forgetting under SFT
- Symmetry Acquisition in Predictive Coding Networks
- Symmetry-Induced Non-Identifiability in Neural Circuit Inference
- T-REX: Tied Recurrence Extraction
- Task-Restricted Symmetries in Recurrent Weight Space
- The GL(r) Gauge Symmetry of LoRA: Principal Bundle Structure, Loss Landscape Geometry, and Implications for Adapter Merging
- The Role of Symmetry in Optimizing Overparameterized Networks
- Toward a Type-Theoretic Framework for Linear Mode Connectivity: Univalence and Path-Finding in Weight Spaces
- WARP: Weight-Space Analysis for Recovering Training Data Portfolios
- Weight Space Representation Learning via Neural Field Adaptation
- What Survives of Path Norms? Path-Lifting as an Intermediate Representation for ReLU Networks
- WK, WV is (Linearly) All You Need: On the Necessity of the QKV Weight Triplet in Self-Attention Transformers