Workshop on Machine Learning and Compression, NeurIPS 2024

Submission deadline: Oct 1, 2024, 11:59 UTC
(Imported from OpenReview; check the official website for extensions.)
Submission portal: OpenReview

Accepted papers (95)

Fetched from OpenReview (v2) on 2026-06-10.

A Theory for Compressibility of Graph Transformers for Transductive Learning
Hamed Shirzad, Honghao Lin, Ameya Velingker, Balaji Venkatachalam, David Woodruff, Danica J. Sutherland · PDF
A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song · PDF
Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace
Sahar Rajabi, Sirisha Rambhatla · PDF
Adapting Language Models via Token Translation
Zhili Feng, Tanya Marwah, Lester Mackey, David Alvarez-Melis, Nicolo Fusi · PDF
Adaptive Quantization and Pruning of Deep Neural Networks via Layer Importance Estimation
Tushar Shinde · PDF
AdaQuantLM: LLM Quantization with Adaptive Bit-Widths
Shuangyi Chen, Ashish J Khisti · PDF
An image to tailor: I-Frame Domain Adaptation in Neural Video Compression
Alberto Presta, Gabriele Spadaro, Attilio Fiandrotti, Marco Grangetto · PDF
An Information Theory of Compute-Optimal Size Scaling, Emergence, and Plateaus in Language Models
Anuj K. Nayak, Lav R. Varshney · PDF
Benchmarking neural lossless compression algorithms on multi-purpose astronomical image data
Tuan Truong, Rithwik Sudharsan, Yibo Yang, Peter Xiangyuan Ma, Ruihan Yang, Stephan Mandt, Joshua S. Bloom · PDF
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
Xingyu Zheng, Xianglong Liu, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Michele Magno · PDF
Breaking Smoothness: The Struggles of Neural Compressors with Discontinuous Mappings
Ezgi Ozyilkan, Jona Ballé, Sourbh Bhadane, Aaron B. Wagner, Elza Erkip · PDF
Bridging the Gap between Diffusion Models and Universal Quantization for Image Compression
Lucas Relic, Roberto Azevedo, Yang Zhang, Markus Gross, Christopher Schroers · PDF
CDQuant: Greedy Coordinate Descent for Accurate LLM Quantization
Pranav Ajit Nair, Arun Suggala · PDF
Communication Compression for Tensor Parallel LLM Inference
Jan Hansen-Palmus, Michael Truong Le, Oliver Hausdörfer, Alok Verma · PDF
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging
Ismail Erbas, Vikas Pandey, Aporva Amarnath, Naigang Wang, Karthik Swaminathan, Stefan T. Radev, Xavier Intes · PDF
Conditional Hallucinations for Image Compression
Till Aczel, Roger Wattenhofer · PDF
Copula-based Estimation of Continuous Sources for a Class of Constrained Rate-Distortion Functions
Giuseppe Serra, Photios A. Stavrou, Marios Kountouris · PDF
Deep Clustering with Associative Memories
Bishwajit Saha, Dmitry Krotov, Mohammed J Zaki, Parikshit Ram · PDF
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein · PDF
Differentiable Attention
Yancheng Wang, Dongfang Sun, Yingzhen Yang · PDF
Diffusion Models With Learned Adaptive Noise
Subham Sekhar Sahoo, Aaron Gokaslan, Christopher De Sa, Volodymyr Kuleshov · PDF
Distillation of Discrete Diffusion through Dimensional Correlations
Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji · PDF
Does Representation Matter? Exploring Intermediate Layers in Large Language Models
Oscar Skean, Md Rifat Arefin, Ravid Shwartz-Ziv · PDF
EAMQ: Environment-based Adaptive Model Quantization on Federated Reinforcement Learning
YU CHENYUE · PDF
Efficient and Robust Spike Ensemble Coding of Signals
Anik Chattopadhyay, Arunava Banerjee · PDF
Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling
Xihaier Luo, Samuel Lurvey, Yi Huang, Yihui Ren, Jin Huang, Byung-Jun Yoon · PDF
Efficient Model Compression Techniques with FishLeg
Jamie McGowan, Wei Sheng Lai, Weibin Chen, Henry Aldridge, Jools Clarke, Jezabel R Garcia, Rui Xia, Yilei Liang, Guillaume Hennequin, Alberto Bernacchia · PDF
Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling
Esha Singh, Shane Bergsma, Nolan Simran Dey, Joel Hestness, Gavia Gray · PDF
EXAQ: Exponent Aware Quantization For LLMs Acceleration
Moran Shkolnik, Maxim Fishman, Brian Chmiel, Hilla Ben-Yaacov, Ron Banner, Kfir Yehuda Levy · PDF
Exploiting Temporal Priors for Efficient Real-time Compression and Feedback of Wireless Channels
Akshay Malhotra, Mohamed Salah Ibrahim, Keya Patani · PDF
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models
Yang Zhang, Yawei Li, Xinpeng Wang, Qianli Shen, Barbara Plank, Bernd Bischl, Mina Rezaei, Kenji Kawaguchi · PDF
Flexible image decoding in learned image compression
Hossein Motamednia, Azadeh Mansouri, Fariba Saadati Monem, Ahmad Mahmoudi-Aznaveh · PDF
Formalizing Limits of Knowledge Distillation Using Partial Information Decomposition
Pasan Dissanayake, Faisal Hamman, Barproda Halder, Ilia Sucholutsky, Qiuyi Zhang, Sanghamitra Dutta · PDF
Fused-Layer CNNs for Memory-Efficient Inference on Microcontrollers
Mark Deutel, Frank Hannig, Christopher Mutschler, Jürgen Teich · PDF
FV-NeRV: Neural Compression for Free Viewpoint Videos
Takuya Fujihashi, Sorachi Kato, Toshiaki Koike-Akino · PDF
Getting free Bits Back from Rotational Symmetries in LLMs
Jiajun He, Gergely Flamich, José Miguel Hernández-Lobato · PDF
Graph Transformation Augmentation for Contrastive Learning of Graph-Level Representation: An Initial Exploration
Tianchao Li, Yulong Pei · PDF
Grow to Compress? Efficient Training of Robust Networks on the Edge
Vignesh Sundaresha, Naresh Shanbhag · PDF
How Many Does It Take to Prune a Network: Comparing One-Shot vs. Iterative Pruning Regimes
Tomasz Wojnar, Mikołaj Janusz, Luca Benini, Yawei Li, Kamil Adamczewski · PDF
Improving Knowledge Distillation with Teacher's Explanation
Sayantan Chowdhury, Ben Liang, Ali Tizghadam, Ilijc Albanese · PDF
Information-theoretic Generalization Analysis for Vector-Quantized VAEs
Futoshi Futami, Masahiro Fujisawa · PDF
Integration of Large Vision Models in Driver Monitoring Systems: Compressing and Distilling for Real-Time Automotive Applications
Georgios Markos Chatziloizos, Andrea Ancora, Andrew I. Comport, Barat Christian · PDF
Interactions Across Blocks in Post-Training Quantization of Large Language Models
Khasmamad Shabanovi, Lukas Wiest, Vladimir Golkov, Daniel Cremers, Thomas Pfeil · PDF
Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations
Kola Ayonrinde, Michael T Pearce, Lee Sharkey · PDF
Large Language Model Compression with Neural Architecture Search
Rhea Sanjay Sukthanker, Benedikt Staffler, Frank Hutter, Aaron Klein · PDF
Latent Probabilistic Dataset Distillation with Theoretical Guarantees
Progyan Das, Shrutimoy Das, Anirban Dasgupta · PDF
Layer-Importance guided Adaptive Quantization for Efficient Speech Emotion Recognition
Tushar Shinde, RITIKA JAIN, Avinash Kumar Sharma · PDF
Layer-wise Quantization for Distributed Variational Inequalities
Anh Duc Nguyen, Ilia Markov, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher · PDF
Learnable Fourier-based Activations for Implicit Signal Representations
Parsa Mojarad Adi, Ali Mehrabian · PDF
Learning to Compress: Local Rank and Information Compression in Deep Neural Networks
Niket Nikul Patel, Ravid Shwartz-Ziv · PDF
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization
Rui Xie, Tianchen Zhao, Zhihang Yuan, Rui Wan, Wenxi Gao, Zhenhua Zhu, Xuefei Ning, Yu Wang · PDF
LLM Vocabulary Compression for Low-Compute Environments
Sreeram Vennam, Anish R Joishy, Ponnurangam Kumaraguru · PDF
LORC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
Rongzhi Zhang, Kuan Wang, Liyuan Liu, Shuohang Wang, Hao Cheng, Chao Zhang, yelong shen · PDF
Losslessly Compressible Neural Network Parameters
Matthew Farrugia-Roberts · PDF
LSH-E Tells You What To Discard: An Adaptive Locality-Sensitive Strategy for KV Cache Compression
Tahseen Rabbani, Minghui Liu, Tony O'Halloran, Ananth Sankaralingam, Mary-Anne Hartley, Furong Huang · PDF
M2M-TAG: Training-Free Many-to-Many Token Aggregation for Vision Transformer Acceleration
Fanhu Zeng, Deli Yu · PDF
Majority Kernels: An Approach to Leverage Big Model Dynamics for Efficient Small Model Training
Hanna Mazzawi, Pranjal Awasthi, Javier Gonzalvo, Srikumar Ramalingam · PDF
MAPLE: Memory-Aware Predict and Load for Efficient LLM Inference
Zhenyu Liu, Zhemin Zhang, Zirui Zhang, Yanyuan Qin, Jiayi Luo, Zhenyu Gu, Liu Liu · PDF
MCUCoder: Adaptive Bitrate Learned Video Compression for IoT Devices
Ali Hojjat, Janek Haberer, Olaf Landsiedel · PDF
Mind the Gap Between Synthetic and Real: Probing Transfer Capabilities of Stable Diffusion Images
Leonhard Hennicke, Christian Medeiros Adriano, Holger Giese, Jan Mathias Koehler, Lukas Schott · PDF
Neural Compression for Multispectral Satellite Images
Woojin Cho, Steve Andreas Immanuel, Junhyuk Heo, Darongsae Kwon · PDF
Neural Normalized Compression Distance and the Disconnect Between Compression and Classification
John Hurwitz, Charles K. Nicholas, Edward Raff · PDF
Non-interactive Remote Coordination
Yassine Hamdi, Xueyan Niu, Bo Bai, Deniz Gunduz · PDF
On the Relationship Between Model Training Dynamics and Early Pruning Periods
Elvis Nunez, Stefano Soatto · PDF
P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks
Malyaban Bal, Abhronil Sengupta · PDF
Partially Frozen Random Networks Contain Compact Strong Lottery Tickets
Hikari Otsuka, Daiki Chijiwa, Ángel López García-Arias, Yasuyuki Okoshi, Kazushi Kawamura, Thiem Van Chu, Daichi Fujiki, Susumu Takeuchi, Masato Motomura · PDF
Perception Loss Function Adaptive to Rate for Learned Video Compression
Sadaf Salehkalaibar, Buu Phan, João Atz Dick, Ashish J Khisti, Jun Chen, Wei Yu · PDF
PerCo (SD): Open Perceptual Compression
Nikolai Körber · PDF
Polar Codes for Channel Simulation
Sharang M. Sriramu, Rochelle Barsz, Elizabeth Polito, Aaron B. Wagner · PDF
Prechastic Coding: An Alternative Approach to Neural Network Description Lengths
Paris Dominic Louis Flood, Pietro Lio · PDF
QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models
Zhumazhan Balapanov, Vanessa Matvei, Olivia Holmberg, Edward Magongo, Kevin Zhu, Jonathan Pei · PDF
Randomly Pivoted V-optimal Design: Fast Data Selection under Low Intrinsic Dimension
Yijun Dong, Xiang Pan, Hoang Phan, Qi Lei · PDF
Sample Compression Hypernetworks: From Generalization Bounds to Meta-Learning
Benjamin Leblanc, Mathieu Bazinet, Nathaniel D'Amours, Alexandre Drouin, Pascal Germain · PDF
Sample compression unleashed : New generalization bounds for real valued losses
Mathieu Bazinet, Valentina Zantedeschi, Pascal Germain · PDF
SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding
Zhenglin Wang, Jialong Wu, Yilong Lai, Congzhi Zhang, Deyu Zhou · PDF
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
Vithursan Thangarasa, Ganesh Venkatesh, Nish Sinnadurai, Sean Lie · PDF
Shrinking the Size of Deep Extreme Multi-Label Classification
Marco Bornstein, Tahseen Rabbani, Brian Joseph Gravelle, Furong Huang · PDF
Simple LLM Compression Recovery Using Dynamic Prompting with Theoretical Analysis
Duc N.M Hoang, Minsik Cho, Thomas Merth, Mohammad Rastegari, Zhangyang Wang · PDF
SNeRV: Scalable Neural Representations for Video Coding
Yiying Wei, Hadi Amirpour, Christian Timmerer · PDF
SpikingVTG: Saliency Feedback Gating Enabled Spiking Video Temporal Grounding
Malyaban Bal, Brian Matejek, Susmit Jha, Adam D. Cobb · PDF
Sustainable AI: Efficient Pruning of Large Language Models in Resource-Limited Environments
Ashhadul Islam, SAMIR BELHAOUARI, Amine Bermak · PDF
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing, Kou Misaki, Han Bao, Sho Yokoi, Takuya Akiba · PDF
The Rate-Distortion-Perception Trade-Off with Algorithmic Realism
Yassine Hamdi, Aaron B. Wagner, Deniz Gunduz · PDF
The Trichromatic Strong Lottery Ticket Hypothesis: Neural Compression With Three Primary Supermasks
Ángel López García-Arias, Yasuyuki Okoshi, Hikari Otsuka, Daiki Chijiwa, Yasuhiro Fujiwara, Susumu Takeuchi, Masato Motomura · PDF
Towards Scalable Compression with Universally Quantized Diffusion Models
Yibo Yang, Justus Will, Stephan Mandt · PDF
Training Block-wise Sparse Models Using Kronecker Product Decomposition
Ding Zhu, Zhiqun Zuo, Mohammad Mahdi Khalili · PDF
Training-Free Visual Token Compression via Delayed Spatial Merging
Jung Hwan Heo, Seyedarmin Azizi, Arash Fayyazi, Massoud Pedram · PDF
Transformers Learn to Compress Variable-order Markov Chains in-Context
Ruida Zhou, Chao Tian, Suhas Diggavi · PDF
Unified Lookup Tables: Privacy-Preserving Foundation Models
Nikita Janakarajan, Irina Espejo Morales, Marvin Alberts, Andrea Giovannini, Matteo Manica, Antonio Foncubierta-Rodríguez · PDF
Unifying Subsampling Pattern Variations for Compressed Sensing MRI with Neural Operators
Armeet Singh Jatyani, Jiayun Wang, Zihui Wu, Miguel Liu-Schiaffini, Bahareh Tolooshams, Anima Anandkumar · PDF
Vector Quantization with Sorting Transformation
Hongzhi Wang, Tanveer Syeda-mahmood · PDF
VRVQ: Variable Bitrate Residual Vector Quantization for Audio Compression
Yunkee Chae, Woosung Choi, Yuhta Takida, Junghyun Koo, Yukara Ikemiya, Zhi Zhong, Kin Wai Cheuk, Marco A. Martínez-Ramírez, Kyogu Lee, Wei-Hsiang Liao, Yuki Mitsufuji · PDF
Wasserstein Distortion with Intrinsic $\sigma$-Maps
Yang Qiu, Ziyuan Lin, Aaron B. Wagner · PDF
Weight-Sharing Method for Upsampling Layer from Feature Embedding Recursive Block
Jinwoo Hyun, YunKyong Hyon, Mira Lee, Sunju Lee, Taeyoung Ha, Young Rock Kim · PDF
What Makes for Good Image Captions?
Delong Chen, Samuel Cahyawijaya, Etsuko Ishii, Ho Shu Chan, Yejin Bang, Pascale Fung · PDF
