NeurIPS 2024 Past AI for science
NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning
SciForDL
- Submission deadline
- Sep 18, 2024, 12:59 UTC (imported from OpenReview; check the website for extensions)
- Submission portal
- OpenReview
- Notes
- Auto-imported from the OpenReview venue record on 2026-06-10 — please verify and enrich (topics are keyword-guessed).
Accepted papers (73)
Fetched from OpenReview (v2) on 2026-06-10.
- A Continuous-Time Analysis of Adaptive Optimization and Normalization
- A Method on Searching Better Activation Functions
- Alice in Wonderland: Simple Tasks Reveal Severe Generalization and Basic Reasoning Deficits in State-Of-the-Art Large Language Models
- Amplified Early Stopping Bias: Overestimated Performance with Deep Learning
- Are Capsule Networks Texture or Shape Biased?
- BatchTopK Sparse Autoencoders
- Causation Does Not Imply Correlation: A Study of Circuit Mechanisms and Model Behaviors
- Characterizing stable regions in the residual stream of LLMs
- Comparing Apples and Oranges: is Stitching Similarity a Load of Spheres?
- Denoising for Manifold Extrapolation
- Distributional Scaling Laws for Emergent Capabilities
- Effectiveness of Sparse Autoencoder for understanding and removing gender bias in LLMs
- Eliminating Position Bias of Language Models: A Mechanistic Approach
- Emergence of Hierarchical Emotion Representations in Large Language Models
- Emergent properties with repeated examples
- EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition
- Evaluating Loss Landscapes from a Topology Perspective
- Explicit Regularisation, Sharpness and Calibration
- Exploiting Interpretable Capabilities with Concept-Enhanced Diffusion and Prototype Networks
- Exploring model depth and data complexity through the lens of cellular automata
- Generalization vs Specialization under Concept Shift
- Hiding in a Plain Sight: Out-of-Distribution Data in the Logit Space Embeddings
- How Learning Rates Shape Neural Network Focus: Insights from Example Ranking
- How rare events shape the learning curves of hierarchical data
- Illusions as features: the generative side of recognition
- Impact of Label Noise on Learning Complex Features
- Improving Deep Learning Speed and Performance through Synaptic Neural Balance
- Input Space Mode Connectivity in Deep Neural Networks
- Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations
- Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
- Is Expressivity Essential for the Predictive Performance of Graph Neural Networks?
- Is network fragmentation a useful complexity measure?
- Is Saliency Really Captured By Gradient?
- Knowledge Distillation for Teaching Symmetry Invariances
- Knowledge Distillation: The Functional Perspective
- Language model scaling laws and zero-sum learning
- Learnability in the Context of Neural Tangent Kernels
- Learned Random Label Predictions as a Neural Network Complexity Metric
- Learning Stochastic Rainbow Networks
- Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
- Memorization to Generalization: The Emergence of Diffusion Models from Associative Memory
- Model Recycling: Model component reuse to promote in-context learning
- On the Collapse Errors Induced by the Deterministic Sampler for Diffusion Models
- Pre-processing and Compression: Understanding Hidden Representation Refinement Across Imaging Domains via Intrinsic Dimension
- Probing the Decision Boundaries of In-context Learning in Large Language Models
- Rethinking Knowledge Transfer in Learning Using Privileged Information
- Revealing the Learning Process in Reinforcement Learning Agents Through Attention-Oriented Metrics
- Robust Learning in Bayesian Parallel Branching Graph Neural Networks: The Narrow Width Limit
- softmax is not enough (for sharp out-of-distribution)
- SolidMark: How to Evaluate Memorization in Image Generative Models
- Sometimes I am a Tree: Data Drives Fragile Hierarchical Generalization
- Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure
- Specialization-generalization transition in exemplar-based in-context learning
- Standard adversarial attacks only fool the final layer
- Stitching Sparse Autoencoders of Different Sizes
- Structure Development in List Sorting Transformers
- Structured Identity Mapping Learning As a Model for Compositional Generalization in Generative Models
- Testing knowledge distillation theories with dataset size
- The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
- The Master Key Filters Hypothesis: Deep Filters Are General
- The Pitfalls of Memorization: When Memorization Hinders Generalization
- The Unreasonable Ineffectiveness of the Deeper Layers
- Token-token correlations predict the scaling of the test loss with the number of input tokens
- Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps
- Training Dynamics of Convolutional Neural Networks for Learning the Derivative Operator
- Training Neural Networks for Modularity aids Interpretability
- Transformers can reinforcement learn to approximate Gittins Index
- Twin Studies of Factors in OOD Generalization
- Understanding the Limitations of B-Spline KANs: Convergence Dynamics and Computational Efficiency
- Understanding the Transient Nature of In-Context Learning: The Window of Generalization
- Understanding Visual Concepts Across Models
- Unraveling the Latent Hierarchical Structure of Language and Images via Diffusion Models
- We Need Far Fewer Unique Filters Than We Thought