ICML 2026 Workshop on Weight-Space Symmetries: from Foundations to Practical Applications

ICML 2026 Workshop WSS

Submission deadline: May 8, 2026, 23:59 AoE (UTC−12) (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (51)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Geometric View of Model Merging: Quotient Fréchet Averages from Toy Models to LoRA

    Marvin F. da Silva, Mohammed Adnan, Felix Dangel, Sageev Oore
  2. Access Sets Matter: Budgeting Expert Reads for Scalable Weight-Space Model Merging

    Yuanyi Wang, Yanggan Gu, Su Lu, Yifan Yang, Zhaoyi Yan, Congkai Xie, Jianmin Wu, Hongxia Yang
  3. Analyzing Stream Collapse in Hyper-Connections: From Diagnosis to Mitigation

    Ekaterina Alimaskina, Gleb Molodtsov, Aleksandr Beznosikov
  4. Are we Merging the Right Models? Impact of Expert Training Duration on Model Merging for LLMs

    Nikita Kozodoi, Zainab Afolabi, Jack Butler
  5. Attention Weight Decomposition for Vision Model Compression

    Hyunwoo Yu, Yubin Cho, Kyeongbo Kong, Suk-Ju Kang
  6. Auditing Neural Thickets with Low-Rank Routes

    Miroslav Lžičař
  7. Beyond Pairwise: Diagnosing Higher-Order Merge Failures via Hodge Decomposition

    Dongzhe Zheng, Christine Allen-Blanchette
  8. Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability

    Vincent Bürgin, Daniel Herbst, Ya-Wei Eileen Lin, Stefanie Jegelka
  9. Block-Level Weight-Space Structure Persists Under Post-Training: An Empirical Study Across LLM Families

    Zhaohui Geoffrey Wang
  10. Breaking Random-Init Symmetry: Theory-Informed Initialization for ReLU Networks

    Anass Al ammiri
  11. Debugging ReBasin: What Limits Symmetry-Based Model Merging?

    Elliot Stein, Christine Evers, Jonathon Hare
  12. Diagonalizing the Softmax: Hadamard Initialization for Tractable Cross-Entropy Dynamics

    Connall Garrod, Jonathan P. Keating, Christos Thrampoulidis
  13. Different Layers, Different Manifolds: Module-Wise Weight-Space Geometry in Transformer Optimization

    Kirato Yoshihara
  14. DotResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging

    Neha Verma, Kenton Murray, Kevin Duh
  15. Endpoint Symmetry for Edge Updates: Weight-Space Redundancy in GNNs on Undirected Graphs

    Charlotte Cambier van Nooten, Stijn van den Beemt, Yuliya Shapovalova, Tom Heskes
  16. Flow Equivariant Transformers

    Ibrahim Khaliliya, T. Anderson Keller
  17. Generic Fibers and Functional Dimension of Multi-Head Attention

    Nathan W. Henry
  18. Hierarchical Mixture-of-Experts with Two-Stage Optimization

    Gleb Molodtsov, Alexander Miasnikov, Aleksandr Beznosikov
  19. How Deep Are Deep GPs, Really? A Sharp Threshold and a Non-Gaussian Limit for Compositional GPs

    Mark Kozdoba, Shie Mannor
  20. How the Optimizer Shapes Learned Solutions in Equivariant Neural Networks

    Teodor-Mihai Stupariu, Andrei Manolache
  21. Iterative Magnitude Pruning Reduces Weight-Space Coupling

    Lucas Perez, Mariana Ordones Oliveira Soares, Jackson de Faria, Fabricio Murai, Renato M. Assunção
  22. Low-Rank Networks Recover Weight and Functional Symmetry Better

    Janis Aiad
  23. LS-Merge: Merging Language Models in Latent Space

    Bedionita Soro, Aoxuan Silvia Zhang, Bruno Andreis, Jaehyeong Jo, Song Chong, Sung Ju Hwang
  24. Meta-Merging by Checkpoint Nowcasting

    Albert Manuel Orozco Camacho, Boris Knyazev, Eugene Belilovsky, Guy Wolf
  25. Model Merging by Output-Space Projection

    Bethan Evans, Benjamin Etheridge, S Roberts, Jared Tanner
  26. Model Merging via Averaged Representational Similarity

    Christopher Wang, Vighnesh Subramaniam, Dan Gutfreund, Boris Katz, Phillip Isola, Brian Cheung
  27. MoRE: Mixture of Reused Experts

    Eric S. Qiu, Utku Umur ACIKALIN, Justin Lovelace, Christian Belardi, Arjun B. Mulchandani, Carla P Gomes, Kilian Q Weinberger
  28. No Global Gauge in Neural Weight Space: Branched Quotient Geometry and Atlas-Optimal Learning

    Manoj Saravanan, Rohit Kumar Salla
  29. Objective-Specific Privileged Bases via Full-Prefix Matryoshka Learning

    Arghamitra Talukder, Philippe Chlenski, Itsik Pe'er
  30. On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors

    Julius Kobialka, Emanuel Sommer, Chris Kolb, Juntae Kwon, Daniel Dold, David Rügamer
  31. Parameter symmetries determine representational geometry in overparameterized nonlinear networks

    Marvin Theiss, Lukas Braun, Andrew M Saxe, Erin Grant
  32. Pre-Normalization Momentum Governs Optimizer-Induced Rank Bias

    Raghav Kaushik Ravi, Srivarshinee Sridhar
  33. Quantifying Symmetries: How Optimisers Impact the Functional Dimension

    Johanna Marie Gegenfurtner, Naima Elosegui Borras, Georgios Arvanitidis
  34. Rethinking the Role of Tensor Decompositions in Post-Training LLM Compression

    Artur Zagitov, Alexander Miasnikov, Maxim Krutikov, Artem Tsedenov, Vladimir Aletov, Gleb Molodtsov, Nail Bashirov, Aleksandr Beznosikov
  35. Rotation Symmetry in Vision Quantization: The Objective Function is the Bottleneck

    Jaewoo Park, Jihae Lee, Yunjeong Yong
  36. Scale-Equivariant Alignment: Closing the Residual Barrier After Permutation Matching

    Kaustubh S. Bukkapatnam, Siddharth Karuturi
  37. Scale-Invariant Empirical-Bayes Laplace Approximation for ReLU Networks

    Shivam Pal, Piyush Rai
  38. Sharpness-Aware Minimization Directly on the Boolean Hypercube

    Ba-Hien TRAN
  39. Shortcuts in the Tail: Debiasing via Post-Hoc Spectral Compression of Fine-Tuning Updates

    Edward Sun, Dmitrii Troitskii
  40. SIB: Reparameterization of LLMs for Better Learning-Forgetting under SFT

    Albert Catalan-Tatjer, Jonas Geiping
  41. Symmetry Acquisition in Predictive Coding Networks

    Adam Shaw, Jiayu Li, Michael Sperling, Michael Kim, Alvin Jin
  42. Symmetry-Induced Non-Identifiability in Neural Circuit Inference

    Seungwon Yu, Jaeho Yang, Kijung Yoon
  43. T-REX: Tied Recurrence Extraction

    Mozes Jacobs, T. Anderson Keller, Thomas Fel, Bingbin Liu, Richard Hakim, Yilun Du, Demba E. Ba
  44. Task-Restricted Symmetries in Recurrent Weight Space

    Simon Dräger
  45. The GL(r) Gauge Symmetry of LoRA: Principal Bundle Structure, Loss Landscape Geometry, and Implications for Adapter Merging

    Siddharth Karuturi, Kaustubh S. Bukkapatnam, Laksh Patel, Tanush Ajay Shastry
  46. The Role of Symmetry in Optimizing Overparameterized Networks

    Kusha Sareen, Mohammad Pedramfar, Sékou-Oumar Kaba, Mehran Shakerinava, Siamak Ravanbakhsh
  47. Toward a Type-Theoretic Framework for Linear Mode Connectivity: Univalence and Path-Finding in Weight Spaces

    Şuayp Talha Kocabay, Kerem Yalçın, Talha Rüzgar Akkuş, Erik Hillbom
  48. WARP: Weight-Space Analysis for Recovering Training Data Portfolios

    Tzu-Heng Huang, Aditya Goyal, John Cooper, Frederic Sala
  49. Weight Space Representation Learning via Neural Field Adaptation

    Zhuoqian Yang, Mathieu Salzmann, Sabine Süsstrunk
  50. What Survives of Path Norms? Path-Lifting as an Intermediate Representation for ReLU Networks

    Antoine Gonon, Rémi Gribonval
  51. WK, WV is (Linearly) All You Need: On the Necessity of the QKV Weight Triplet in Self-Attention Transformers

    Marko Karbevski, Antonij Mijoski