Awesome Dataset Distillation

Awesome Dataset Distillation provides the most comprehensive and detailed information on the dataset distillation field.

This project is curated and maintained by Guang Li, Bo Zhao, and Tongzhou Wang.

Background & Vision

Dataset distillation is the task of synthesizing a small dataset such that models trained on it achieve high performance on the original large dataset. A dataset distillation algorithm takes as input a large real dataset to be distilled (training set) and outputs a small synthetic distilled dataset, which is evaluated by training models on it and testing them on a separate real dataset (validation/test set). A good small distilled dataset is not only useful for dataset understanding, but also has various applications (e.g., continual learning, privacy, neural architecture search). This task was first introduced in the paper Dataset Distillation [Tongzhou Wang et al., '18], along with a proposed algorithm using backpropagation through optimization steps. The task was then first extended to real-world datasets in the paper Medical Dataset Distillation [Guang Li et al., '19], which also explored the privacy-preservation possibilities of dataset distillation. In the paper Dataset Condensation [Bo Zhao et al., '20], gradient matching was first introduced, which greatly advanced the development of the dataset distillation field.
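As a rough illustration of the gradient-matching idea, the sketch below learns a few synthetic images per class by encouraging the gradients they induce on a freshly initialized network to match those induced by real data. This is a minimal PyTorch simplification, not the authors' code; the function name, `model_fn`, `ipc`, the CIFAR-style image shape, and the cosine distance are all illustrative assumptions.

```python
# Minimal sketch of gradient matching for dataset distillation,
# in the spirit of Dataset Condensation [Bo Zhao et al., '20].
# All names and hyperparameters here are illustrative.
import torch
import torch.nn.functional as F

def distill_by_gradient_matching(real_loader, model_fn, num_classes,
                                 ipc=1, steps=1000, lr_img=0.1):
    """Learn `ipc` synthetic images per class (assumed 3x32x32, CIFAR-like)."""
    # Synthetic images are free parameters optimized directly.
    syn_x = torch.randn(num_classes * ipc, 3, 32, 32, requires_grad=True)
    syn_y = torch.arange(num_classes).repeat_interleave(ipc)
    opt_img = torch.optim.SGD([syn_x], lr=lr_img, momentum=0.5)

    for _ in range(steps):
        net = model_fn()  # fresh random initialization each outer step
        params = [p for p in net.parameters() if p.requires_grad]

        # Gradients of the task loss on real vs. synthetic data.
        real_x, real_y = next(iter(real_loader))
        g_real = torch.autograd.grad(
            F.cross_entropy(net(real_x), real_y), params)
        g_syn = torch.autograd.grad(
            F.cross_entropy(net(syn_x), syn_y), params, create_graph=True)

        # Per-layer cosine-style gradient-matching distance.
        loss = sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
                   for a, b in zip(g_syn, g_real))
        opt_img.zero_grad()
        loss.backward()  # backprop through g_syn into the synthetic images
        opt_img.step()
    return syn_x.detach(), syn_y
```

In the actual method, gradients are matched class by class and synthetic-image updates are interleaved with network updates, but the loop above captures the core bilevel structure.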


In recent years (2022-now), dataset distillation has gained increasing attention in the research community, across many institutes and labs, and more papers are now published each year. These studies have steadily improved dataset distillation and explored its many variants and applications.

Latest Updates

Content

Media Coverage

Beginning of Awesome Dataset Distillation

Most Popular AI Research Aug 2022

A Project to Help You Understand Dataset Distillation

The Condensed Is the Essence: A Unified Perspective on Dataset Distillation


Citing Awesome Dataset Distillation

If you find this project useful for your research, please use the following BibTeX entry:

```bibtex
@misc{li2022awesome,
  author={Li, Guang and Zhao, Bo and Wang, Tongzhou},
  title={Awesome Dataset Distillation},
  howpublished={\url{https://github.com/Guang000/Awesome-Dataset-Distillation}},
  year={2022}
}
```


Contribution Guide

If you want to contribute to this project, click here for more details.