NeurIPS 2024 · Past · Other

The Third Workshop on New Frontiers in Adversarial Machine Learning

AdvML-Frontiers 2024

Submission deadline: Aug 31, 2024, 13:05 UTC
imported from OpenReview — check the website for extensions
Submission portal: OpenReview

Accepted papers (37)

Fetched from OpenReview (v2) on 2026-06-10.

Achieving Domain-Independent Certified Robustness via Knowledge Continuity
Alan Sun, Chiyu Ma, Kenneth Ge, Soroush Vosoughi · PDF
AdjointDEIS: Efficient Gradients for Diffusion Models
Zander W. Blasingame, Chen Liu · PDF
Advancing NLP Security by Leveraging LLMs as Adversarial Engines
Sudarshan Srinivasan, Maria Mahbub, Amir Sadovnik · PDF
Adversarial Bounding Boxes Generation (ABBG) Attack against Visual Object Trackers
Fatemeh Nourilenjan Nokabadi, Jean-Francois Lalonde, Christian Gagné · PDF
Adversarial Databases Improve Success in Retrieval-based Large Language Models
Sean Wu, Michael Koo, Li Yo Kao, Andy Black, Lesley Blum, Fabien Scalzo, Ira Kurtz · PDF
Adversarial Training based Domain Adaptation for Cross-Subject Emotion Recognition
Sungpil Woo, Muhammad Zubair, Sunhwan Lim, Daeyoung Kim · PDF
Adversarial Watermarking for Face Recognition
Yuguang Yao, Anil K. Jain, Sijia Liu · PDF
An Adversarial Learning Approach to Irregular Time-Series Forecasting
Heejeong Nam, Jihyun Kim, Jimin Yeom · PDF
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang · PDF
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations
Lukas Struppek, Dominik Hintersdorf, Felix Friedrich, Manuel Brack, Patrick Schramowski, Kristian Kersting · PDF
dSTAR: Straggler Tolerant and Byzantine Resilient Distributed SGD
Jiahe Yan, Pratik Chaudhari, Leonard Kleinrock · PDF
Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness
Stanislav Fort, Balaji Lakshminarayanan · PDF
Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw, Megan Hong-Thanh Chung, Maria Mahbub, Amir Sadovnik · PDF
Imitation Guided Automated Red Teaming
Sajad Mousavi, Desik Rengarajan, Ashwin Ramesh Babu, Vineet Gundecha, Antonio Guillen, Ricardo Luna Gutierrez, Avisek Naug, Sahand Ghorbanpour, Soumyendu Sarkar · PDF
In Search of the $\textit{Successful}$ Interpolation: On the Role of $\textit{Sharpness}$ in CLIP Generalization
Alireza Abdollahpourrostam · PDF
In-distribution adversarial attacks on object recognition models using gradient-free search
Spandan Madan, Tomotake Sasaki, Tzu-Mao Li, Hanspeter Pfister, Xavier Boix · PDF
Jailbreak Defense in a Narrow Domain: Failures of existing methods and Improving Transcript-Based Classifiers
Tony Tong Wang, John Hughes, Henry Sleight, Rylan Schaeffer, Rajashree Agrawal, Fazl Barez, Mrinank Sharma, Jesse Mu, Nir N Shavit, Ethan Perez · PDF
Learning From Convolution-based Unlearnable Datasets
Dohyun Kim, Pedro Sandoval-Segura · PDF
Learning to Forget using Hypernetworks
Jose Miguel Lara Rangel, Usman Anwar, Stefan Schoepf, Jack Foster, David Krueger · PDF
LLM-PIRATE: A benchmark for indirect prompt injection attacks in Large Language Models
Anil Ramakrishna, Jimit Majmudar, Rahul Gupta, Devamanyu Hazarika · PDF
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong · PDF
Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment
Allison Huang, Carlos Mougan, Yulu Pi · PDF
Provable Robustness of (Graph) Neural Networks Against Data Poisoning and Backdoor Attacks
Lukas Gosch, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar, Stephan Günnemann · PDF
RenderAttack: Hundreds of Adversarial Attacks Through Differentiable Texture Generation
Dron Hazra, Alex Bie, Mantas Mazeika, Xuwang Yin, Andy Zou, Dan Hendrycks, Maximilian Kaufmann · PDF
Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan, Wenjie Jacky Mo, Xiang Ren, Robin Jia · PDF
Rethinking Randomized Smoothing from the Perspective of Scalability
Sukrit Jindal, Devansh Bhardwaj, Anupriya Kumari · PDF
Robustness of Practical Perceptual Hashing Algorithms to Hash-Evasion and Hash-Inversion Attacks
Jordan Madden, Moxanki Bhavsar, Lhamo Dorje, Xiaohua Li · PDF
SkipOOD: Efficient Out-of-Distribution Input Detection using Skipping Mechanism
Mirazul Haque, Natraj Raman, Petr Babkin, Armineh Nourbakhsh, Xiaomo Liu · PDF
Smoothing-Based Adversarial Defense Methods for Inverse Problems
Yang Sun, Jonathan Scarlett · PDF
Sparse patches adversarial attacks via extrapolating point-wise information
Yaniv Nemcovsky, Avi Mendelson, Chaim Baskin · PDF
Sparse Transfer Learning Accelerates and Enhances Certified Robustness: A Comprehensive Study
Zhangheng Li, Tianlong Chen, Linyi Li, Bo Li, Zhangyang Wang · PDF
The Ultimate Cookbook for Invisible Poison: Crafting Subtle Clean-Label Text Backdoors with Style Attributes
Wencong You, Daniel Lowd · PDF
Track 1: Robust Offline Learning via Adversarial World Models
Uljad Berdica, Kelvin Li, Michael Beukman, Alexander David Goldie, Perla Maiolino, Jakob Nicolaus Foerster · PDF
TrackPGD: Efficient Adversarial Attack using Object Binary Masks against Robust Transformer Trackers
Fatemeh Nourilenjan Nokabadi, Yann Batiste Pequignot, Jean-Francois Lalonde, Christian Gagné · PDF
Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities
Hatef Otroshi Shahreza, Sébastien Marcel · PDF
vTune: Verifiable Fine-Tuning Through Backdooring
Eva Zhang, Akilesh Potti, Micah Goldblum · PDF
When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?
Rylan Schaeffer, Dan Valentine, Luke Bailey, James Chua, Cristobal Eyzaguirre, Zane Durante, Joe Benton, Brando Miranda, Henry Sleight, Tony Tong Wang, John Hughes, Rajashree Agrawal, Mrinank Sharma, Scott Emmons, Sanmi Koyejo, Ethan Perez · PDF
