
Will Synthetic Data Finally Solve the Data Access Problem?

ICLR 2025 Workshop SynthData

Submission deadline: Feb 6, 2025, 23:59 UTC (imported from OpenReview; check the website for extensions)
Submission portal: OpenReview
Notes: Auto-imported from the OpenReview venue record on 2026-06-10; please verify and enrich (topics are keyword-guessed).

Accepted papers (42)

Fetched from OpenReview (v2) on 2026-06-10.

  1. [Tiny] Parameterized Synthetic Text Generation with SimpleStories

    Lennart Finke, Thomas Dooms, Mat Allen, Juan Diego Rodriguez, Noa Nabeshima, Dan Braun · PDF
  2. [Tiny] Synthetic-based retrieval of patient medical data

    Rinat Mullahmetov, Ilya Pershin · PDF
  3. [Tiny] Understanding the Impact of Data Domain Extraction on Synthetic Data Privacy

    Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Sofiane Mahiou, Emiliano De Cristofaro · PDF
  4. Accelerating Differentially Private Federated Learning via Adaptive Extrapolation

    Shokichi Takakura, Seng Pei Liew, Satoshi Hasegawa · PDF
  5. An Optimal Criterion for Steering Data Distributions to Achieve Exact Fairness

    Mohit Sharma, Amit Deshpande, Chiranjib Bhattacharyya, Rajiv Ratn Shah · PDF
  6. Augmented Conditioning Is Enough For Effective Training Image Generation

    Jiahui Chen, Amy Zhang, Adriana Romero-Soriano · PDF
  7. Benchmarking Differentially Private Tabular Data Synthesis Algorithms

    Kai Chen, Xiaochen Li, Chen Gong, Ryan McKenna, Tianhao Wang · PDF
  8. Breaking Focus: Contextual Distraction Curse in Large Language Models

    Yanbo Wang, Zixiang Xu, Yue Huang, Chujie Gao, Siyuan Wu, Jiayi Ye, Xiuying Chen, Pin-Yu Chen, Xiangliang Zhang · PDF
  9. Can LLMs Replace Economic Choice Prediction Labs? The Case of Language-based Persuasion Games

    Eilam Shapira, Omer Madmon, Roi Reichart, Moshe Tennenholtz · PDF
  10. Can Transformers Learn Full Bayesian Inference In Context?

    Arik Reuter, Tim G. J. Rudner, Vincent Fortuin, David Rügamer · PDF
  11. Compositional World Knowledge leads to High Utility Synthetic data

    Sachit Gaudi, Gautam Sreekumar, Vishnu Boddeti · PDF
  12. Deconstructing Bias: A Multifaceted Framework for Diagnosing Cultural and Compositional Inequities in Text-to-Image Generative Models

    Muna Numan Said, Aarib Zaidi, Rabia Usman, Sonia Okon, Praneeth Medepalli, Kevin Zhu, Vasu Sharma, Sean O'Brien · PDF
  13. Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection

    Ksheeraja Raghavan, Samiran Gode, Ankit Shah, Surabhi Raghavan, Wolfram Burgard, Bhiksha Raj, Rita Singh · PDF
  14. DIET-PATE: Knowledge Transfer in PATE without Public Data

    Michel Meintz, Adam Dziedzic, Franziska Boenisch · PDF
  15. Differentially Private Synthetic Data via APIs 3: Using Simulators Instead of Foundation Model

    Zinan Lin, Tadas Baltrusaitis, Sergey Yekhanin · PDF
  16. Efficient Randomized Experiments Using Foundation Models

    Piersilvio De Bartolomeis, Javier Abad, Guanbo Wang, Konstantin Donhauser, Raymond M Duch, Fanny Yang, Issa Dahabreh · PDF
  17. Empowering LLMs in Decision Games through Algorithmic Data Synthesis

    Haolin Wang, Xueyan Li, Yazhe Niu, Shuai Hu, Hongsheng Li · PDF
  18. Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

    Jiarui Zhang, Ollie Liu, Tianyu Yu, Jinyi Hu, Willie Neiswanger · PDF
  19. Evaluating Inter-Column Logical Relationships in Synthetic Tabular Data Generation

    Yunbo Long, Liming Xu, Alexandra Brintrup · PDF
  20. Grounding QA Generation in Knowledge Graphs and Literature: A Scalable LLM Framework for Scientific Discovery

    Marc Boubnovski Martell, Kaspar Märtens, Lawrence Phillips, Daniel Keitley, Maria Dermit, Julien Fauqueur · PDF
  21. How Well Does Your Tabular Generator Learn the Structure of Tabular Data?

    Xiangjian Jiang, Nikola Simidjievski, Mateja Jamnik · PDF
  22. Human-like compositional learning of visually-grounded concepts using synthetic data

    Zijun Lin, M Ganesh Kumar, Cheston Tan · PDF
  23. Improved Density Ratio Estimation for Evaluating Synthetic Data Quality

    Lukas Gruber, Markus Holzleitner, Sepp Hochreiter, Werner Zellinger · PDF
  24. Is API Access to LLMs Useful for Generating Private Synthetic Tabular Data?

    Marika Swanberg, Ryan McKenna, Edo Roth, Albert Cheu, Peter Kairouz · PDF
  25. LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

    Mufei Li, Viraj Shitole, Eli Chien, Changhai Man, Zhaodong Wang, Srinivas, Ying Zhang, Tushar Krishna, Pan Li · PDF
  26. Leveraging Vertical Public-Private Split for Improved Synthetic Data Generation

    Samuel Maddock, Shripad Gade, Graham Cormode, Will Bullock · PDF
  27. Orchestrating Synthetic Data with Reasoning

    Tim R. Davidson, Benoit Seguin, Enrico Bacis, Cesar Ilharco, Hamza Harkous · PDF
  28. Out-of-Distribution Detection using Synthetic Data Generation

    Momin Abbas, Muneeza Azmat, Raya Horesh, Mikhail Yurochkin · PDF
  29. Private Federated Learning using Preference-Optimized Synthetic Data

    Charlie Hou, Mei-Yu Wang, Yige Zhu, Daniel Lazar, Giulia Fanti · PDF
  30. SoftSRV: Learn to generate targeted synthetic data.

    Giulia DeSalvo, Jean-François Kagy, Lazaros Karydas, Afshin Rostamizadeh, Sanjiv Kumar · PDF
  31. Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources

    Alisia Maria Lupidi, Carlos Gemmell, Nicola Cancedda, Jane Yu, Jason E Weston, Jakob Nicolaus Foerster, Roberta Raileanu, Maria Lomeli · PDF
  32. Stronger Models are NOT Always Stronger Teachers for Instruction Tuning

    Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Radha Poovendran · PDF
  33. SyntheRela: A Benchmark For Synthetic Relational Database Generation

    Martin Jurkovic, Valter Hudovernik, Erik Štrumbelj · PDF
  34. Synthetic Data for Blood Vessel Network Extraction

    Joël Mathys, Andreas Plesner, Jorel Elmiger, Roger Wattenhofer · PDF
  35. Synthetic Data Pruning in High Dimensions: A Random Matrix Perspective

    Aymane El Firdoussi, Mohamed El Amine Seddik, Soufiane Hayou, Reda Alami, Ahmed Alzubaidi, Hakim Hacid · PDF
  36. Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation

    Tianhao Li, Tianyu Zeng, Yujia Zheng, Zhang Chulong, Jingyu Lu, Haotian Huang, Chuangxin Chu, Fang-Fang Yin, Zhenyu Yang · PDF
  37. Text to 3D Object Generation for Scalable Room Assembly

    Sonia Laguna, Alberto Garcia-Garcia, Marie-Julie Rakotosaona, Stylianos Moschoglou, Leonhard Helminger, Sergio Orts-Escolano · PDF
  38. TIMER: Temporal Instruction Modeling and Evaluation for Longitudinal Clinical Records

    Hejie Cui, Alyssa Unell, Bowen Chen, Jason Alan Fries, Emily Alsentzer, Sanmi Koyejo, Nigam Shah · PDF
  39. Towards Internet-Scale Training For Agents

    Brandon Trabucco, Gunnar A Sigurdsson, Robinson Piramuthu, Ruslan Salakhutdinov · PDF
  40. Training-Free Safe Denoisers For Safe Use of Diffusion Models

    Mingyu Kim, Dongjun Kim, Amman Yusuf, Stefano Ermon, Mijung Park · PDF
  41. TRIG-Bench: A Benchmark for Text-Rich Image Grounding

    Ming Li, Ruiyi Zhang, Jian Chen, Tianyi Zhou · PDF
  42. V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data

    Rotem Shalev Arkushin, Aharon Azulay, Tavi Halperin, Eitan Richardson, Amit Haim Bermano, Ohad Fried · PDF