NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning

SciForDL

Submission deadline
Sep 18, 2024, 12:59 UTC
imported from OpenReview — check the website for extensions
Submission portal
OpenReview
Notes
Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).

Accepted papers (73)

Fetched from OpenReview (v2) on 2026-06-10.

  1. A Continuous-Time Analysis of Adaptive Optimization and Normalization

    Rhys Gould, Hidenori Tanaka · PDF
  2. A Method on Searching Better Activation Functions

    Haoyuan Sun, Zihao Wu, Bo Xia, Pu Chang, Zibin Dong, Yifu Yuan, Yongzhe Chang, Xueqian Wang · PDF
  3. Alice in Wonderland: Simple Tasks Reveal Severe Generalization and Basic Reasoning Deficits in State-Of-the-Art Large Language Models

    Marianna Nezhurina, Lucia Cipolina-Kun, Mehdi Cherti, Jenia Jitsev · PDF
  4. Amplified Early Stopping Bias: Overestimated Performance with Deep Learning

    Nona Rajabi, Antonio H. Ribeiro, Miguel Vasco, Danica Kragic · PDF
  5. Are Capsule Networks Texture or Shape Biased?

    Riccardo Renzulli, Dominik Vranay, Marco Grangetto · PDF
  6. BatchTopK Sparse Autoencoders

    Bart Bussmann, Patrick Leask, Neel Nanda · PDF
  7. Causation Does Not Imply Correlation: A Study of Circuit Mechanisms and Model Behaviors

    Jenny Kaufmann, Victoria R Li, Martin Wattenberg, David Alvarez-Melis, Naomi Saphra · PDF
  8. Characterizing stable regions in the residual stream of LLMs

    Jett Janiak, Jacek Karwowski, Chatrik Singh Mangat, Giorgi Giglemiani, Nora Petrova, Stefan Heimersheim · PDF
  9. Comparing Apples and Oranges: is Stitching Similarity a Load of Spheres?

    Damian Smith, Antonia Marcu · PDF
  10. Denoising for Manifold Extrapolation

    Zeyu Yun, Galen Chuang, Derek Dong, Yubei Chen · PDF
  11. Distributional Scaling Laws for Emergent Capabilities

    Rosie Zhao, Naomi Saphra, Sham M. Kakade · PDF
  12. Effectiveness of Sparse Autoencoder for understanding and removing gender bias in LLMs

    Praveen Hegde · PDF
  13. Eliminating Position Bias of Language Models: A Mechanistic Approach

    Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji · PDF
  14. Emergence of Hierarchical Emotion Representations in Large Language Models

    Bo Zhao, Maya Okawa, Eric J Bigelow, Rose Yu, Tomer Ullman, Hidenori Tanaka · PDF
  15. Emergent properties with repeated examples

    Francois Charton, Julia Kempe · PDF
  16. EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition

    Youssef Doulfoukar, Laurent Mertens, Joost Vennekens · PDF
  17. Evaluating Loss Landscapes from a Topology Perspective

    Tiankai Xie, Caleb Geniesse, Jiaqing Chen, Yaoqing Yang, Dmitriy Morozov, Michael W. Mahoney, Ross Maciejewski, Gunther H. Weber · PDF
  18. Explicit Regularisation, Sharpness and Calibration

    Israel Mason-Williams, Fredrik Ekholm, Ferenc Huszár · PDF
  19. Exploiting Interpretable Capabilities with Concept-Enhanced Diffusion and Prototype Networks

    Alba Carballo-Castro, Sonia Laguna, Moritz Vandenhirtz, Julia E Vogt · PDF
  20. Exploring model depth and data complexity through the lens of cellular automata

    Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov · PDF
  21. Generalization vs Specialization under Concept Shift

    Alex Nguyen, David J. Schwab, Vudtiwat Ngampruetikorn · PDF
  22. Hiding in a Plain Sight: Out-of-Distribution Data in the Logit Space Embeddings

    Vangjush Kostandin Komini, Sarunas Girdzijauskas · PDF
  23. How Learning Rates Shape Neural Network Focus: Insights from Example Ranking

    Ekaterina Lobacheva, Keller Jordan, Aristide Baratin, Nicolas Le Roux · PDF
  24. How rare events shape the learning curves of hierarchical data

    Hyunmo Kang, Francesco Cagnetta, Matthieu Wyart · PDF
  25. Illusions as features: the generative side of recognition

    Tahereh Toosi, Kenneth D. Miller · PDF
  26. Impact of Label Noise on Learning Complex Features

    Rahul Vashisht, P Krishna Kumar, Harsha Vardhan Govind, Harish Guruprasad Ramaswamy · PDF
  27. Improving Deep Learning Speed and Performance through Synaptic Neural Balance

    Antonios Alexos, Ian Domingo, Pierre Baldi · PDF
  28. Input Space Mode Connectivity in Deep Neural Networks

    Jakub Vrabel, Ori Shem-Ur, Yaron Oz, David Krueger · PDF
  29. Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations

    Kola Ayonrinde, Michael T Pearce, Lee Sharkey · PDF
  30. Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs

    Daniel J Lee, Stefan Heimersheim · PDF
  31. Is Expressivity Essential for the Predictive Performance of Graph Neural Networks?

    Fabian Jogl, Pascal Welke, Thomas Gärtner · PDF
  32. Is network fragmentation a useful complexity measure?

    Coenraad Mouton, Randle Rabe, Daniël Gerbrand Haasbroek, Marthinus Wilhelmus Theunissen, Hermanus Lambertus Potgieter, Marelie Hattingh Davel · PDF
  33. Is Saliency Really Captured By Gradient?

    Nehal Yasin, Jonathon Hare, Antonia Marcu · PDF
  34. Knowledge Distillation for Teaching Symmetry Invariances

    Patrick Odagiu, Nicole Nobili, Fabian Dionys Schrag, Yves Bicker, Yuhui Ding · PDF
  35. Knowledge Distillation: The Functional Perspective

    Israel Mason-Williams, Gabryel Mason-Williams, Mark Sandler · PDF
  36. Language model scaling laws and zero-sum learning

    Andrei Mircea, Ekaterina Lobacheva, Supriyo Chakraborty, Nima Chitsazan, Irina Rish · PDF
  37. Learnability in the Context of Neural Tangent Kernels

    Progyan Das, Dwip Dalal · PDF
  38. Learned Random Label Predictions as a Neural Network Complexity Metric

    Marlon Becker, Benjamin Risse · PDF
  39. Learning Stochastic Rainbow Networks

    Vivian White, Muawiz Sajjad Chaudhary, Guy Wolf, Guillaume Lajoie, Kameron Decker Harris · PDF
  40. Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

    Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong · PDF
  41. Memorization to Generalization: The Emergence of Diffusion Models from Associative Memory

    Bao Pham, Gabriel Raya, Matteo Negri, Mohammed J Zaki, Luca Ambrogioni, Dmitry Krotov · PDF
  42. Model Recycling: Model component reuse to promote in-context learning

    Lindsay M. Smith, Chase Goddard, Vudtiwat Ngampruetikorn, David J. Schwab · PDF
  43. On the Collapse Errors Induced by the Deterministic Sampler for Diffusion Models

    Yi Zhang, Difan Zou · PDF
  44. Pre-processing and Compression: Understanding Hidden Representation Refinement Across Imaging Domains via Intrinsic Dimension

    Nicholas Konz, Maciej A Mazurowski · PDF
  45. Probing the Decision Boundaries of In-context Learning in Large Language Models

    Siyan Zhao, Tung Nguyen, Aditya Grover · PDF
  46. Rethinking Knowledge Transfer in Learning Using Privileged Information

    Danil Provodin, Bram van den Akker, Christina Katsimerou, Maurits Clemens Kaptein, Mykola Pechenizkiy · PDF
  47. Revealing the Learning Process in Reinforcement Learning Agents Through Attention-Oriented Metrics

    Charlotte Beylier, Simon M. Hofmann, Nico Scherf · PDF
  48. Robust Learning in Bayesian Parallel Branching Graph Neural Networks: The Narrow Width Limit

    Zechen Zhang, Haim Sompolinsky · PDF
  49. softmax is not enough (for sharp out-of-distribution)

    Petar Veličković, Christos Perivolaropoulos, Federico Barbero, Razvan Pascanu · PDF
  50. SolidMark: How to Evaluate Memorization in Image Generative Models

    Nicky Kriplani, Minh Pham, Malikka Rajshahi, Chinmay Hegde, Niv Cohen · PDF
  51. Sometimes I am a Tree: Data Drives Fragile Hierarchical Generalization

    Tian Qin, Naomi Saphra, David Alvarez-Melis · PDF
  52. Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure

    Christine Ye, Charles O'Neill, John F Wu, Kartheik G. Iyer · PDF
  53. Specialization-generalization transition in exemplar-based in-context learning

    Chase Goddard, Lindsay M. Smith, Vudtiwat Ngampruetikorn, David J. Schwab · PDF
  54. Standard adversarial attacks only fool the final layer

    Stanislav Fort · PDF
  55. Stitching Sparse Autoencoders of Different Sizes

    Patrick Leask, Bart Bussmann, Joseph Isaac Bloom, Curt Tigges, Noura Al Moubayed, Neel Nanda · PDF
  56. Structure Development in List Sorting Transformers

    Einar Urdshals, Jasmina Nasufi · PDF
  57. Structured Identity Mapping Learning As a Model for Compositional Generalization in Generative Models

    Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka · PDF
  58. Testing knowledge distillation theories with dataset size

    Giulia Lanzillotta, Felix Sarnthein, Gil Kur, Thomas Hofmann, Bobby He · PDF
  59. The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

    Ezra Edelman, Nikolaos Tsilivis, Surbhi Goel, Benjamin L. Edelman, Eran Malach · PDF
  60. The Master Key Filters Hypothesis: Deep Filters Are General

    Zahra Babaiee, Peyman Kiasari, Daniela Rus, Radu Grosu · PDF
  61. The Pitfalls of Memorization: When Memorization Hinders Generalization

    Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob, David Lopez-Paz, Pascal Vincent · PDF
  62. The Unreasonable Ineffectiveness of the Deeper Layers

    Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Dan Roberts · PDF
  63. Token-token correlations predict the scaling of the test loss with the number of input tokens

    Francesco Cagnetta, Matthieu Wyart · PDF
  64. Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps

    Fuxiao Liu · PDF
  65. Training Dynamics of Convolutional Neural Networks for Learning the Derivative Operator

    Erik Y. Wang, Yongji Wang, Ching-Yao Lai · PDF
  66. Training Neural Networks for Modularity aids Interpretability

    Satvik Golechha, Dylan Cope, Nandi Schoots · PDF
  67. Transformers can reinforcement learn to approximate Gittins Index

    Vladimir Petrov, Nikhil Vyas, Lucas Janson · PDF
  68. Twin Studies of Factors in OOD Generalization

    Victoria R Li, Jenny Kaufmann, David Alvarez-Melis, Naomi Saphra · PDF
  69. Understanding the Limitations of B-Spline KANs: Convergence Dynamics and Computational Efficiency

    Avik Pal, Dipankar Das · PDF
  70. Understanding the Transient Nature of In-Context Learning: The Window of Generalization

    Core Francisco Park, Ekdeep Singh Lubana, Hidenori Tanaka · PDF
  71. Understanding Visual Concepts Across Models

    Brandon Trabucco, Max A Gurinas, Kyle Doherty, Russ Salakhutdinov · PDF
  72. Unraveling the Latent Hierarchical Structure of Language and Images via Diffusion Models

    Antonio Sclocchi, Noam Itzhak Levi, Alessandro Favero, Matthieu Wyart · PDF
  73. We Need Far Fewer Unique Filters Than We Thought

    Zahra Babaiee, Peyman Kiasari, Daniela Rus, Radu Grosu · PDF