CelebA
CelebA (CelebFaces Attributes Dataset) is an iconic computer vision dataset centered on human faces. It is widely used in facial recognition, image generation, and facial attribute analysis, thanks to the richness of its annotations.
Over 200,000 face images in JPEG format, annotations in TXT files
Free for academic use under the specific conditions of the CelebA license
Description
The CelebA dataset includes:
- 202,599 JPEG images of celebrity faces
- 40 annotated attributes per image
- 5 landmarks per face for facial alignment
- Binary segmentation masks in the CelebAMask-HQ version
CelebA is recognized for the diversity of faces represented, in terms of traits, ages and accessories, making it a resource of choice for training robust and generalizable models.
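To make the annotation format concrete, here is a minimal sketch of reading the attribute annotations in Python. It assumes the layout of the attribute TXT file distributed with CelebA: a first line with the image count, a second line with the attribute names, then one row per image holding the file name followed by 1 or -1 for each attribute. The function name `parse_attributes` and the sample rows are illustrative only; the sample uses three of the 40 attribute names and made-up values.

```python
def parse_attributes(text):
    """Parse CelebA-style attribute annotations into {filename: {attribute: bool}}.

    Assumed layout: line 1 = number of images, line 2 = attribute names,
    remaining lines = file name followed by 1 / -1 per attribute.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    count = int(lines[0])
    names = lines[1].split()
    records = {}
    for row in lines[2:2 + count]:
        filename, *values = row.split()
        if len(values) != len(names):
            raise ValueError(
                f"{filename}: expected {len(names)} values, got {len(values)}"
            )
        # 1 means the attribute is present, -1 means it is absent.
        records[filename] = {
            name: int(value) == 1 for name, value in zip(names, values)
        }
    return records


# Tiny illustrative sample: 3 of the 40 attributes, values made up.
SAMPLE = """\
2
Eyeglasses Male Smiling
000001.jpg -1 -1  1
000002.jpg  1  1 -1
"""

attrs = parse_attributes(SAMPLE)
print(attrs["000001.jpg"]["Smiling"])  # True
```

Converting the 1 / -1 encoding to booleans up front keeps later filtering simple, for example selecting every image where `Eyeglasses` is true.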
What is this dataset for?
CelebA is commonly used for:
- Training facial recognition models
- Analysis and classification of facial attributes
- Training GANs (Generative Adversarial Networks) to generate synthetic face images
- Evaluating face detection and attribute-editing models (adding a smile, removing glasses, etc.)
Can it be enriched or improved?
Yes, CelebA can be improved in a number of ways:
- By adding new attributes specific to certain populations or cultural expressions
- By combining with other face datasets to improve demographic diversity
- By refining segmentation masks for more precise processing tasks
- By integrating CelebA into multimodal pipelines (voice + image, text + image) for wider applications
🔗 Source: CelebA Dataset
Frequently Asked Questions
Can I use CelebA to test face generation models?
Yes, CelebA is well suited to that. It is used as a reference for training and testing GANs, thanks to the quality and variety of its faces.
How can the biases present in this dataset be managed?
CelebA has been criticized for an unbalanced representation of certain ethnic origins and genders. To limit bias, it is recommended to supplement it with other, more representative datasets or to adjust the sample weights during training.
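One common way to adjust the weights is inverse-frequency weighting, sketched below in Python. This is an illustration under stated assumptions, not a method prescribed by the dataset authors: the function name `inverse_frequency_weights` is hypothetical, and the label list stands in for one attribute column of the annotations. The formula (number of samples divided by number of groups times the group count) is the same heuristic scikit-learn uses for its "balanced" class weights.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per distinct label, larger for rarer labels.

    weight(label) = n_samples / (n_groups * count(label)), so the weights
    average to 1 over the dataset.
    """
    counts = Counter(labels)
    n_samples, n_groups = len(labels), len(counts)
    return {
        label: n_samples / (n_groups * count) for label, count in counts.items()
    }


# Hypothetical attribute column: the attribute is present in 3 of 4 images.
labels = [True, True, True, False]

weights = inverse_frequency_weights(labels)
# Per-sample weights that a training loop could use to scale each loss term.
sample_weights = [weights[label] for label in labels]
print(weights)
```

The under-represented group receives the larger weight, so each group contributes equally to the total loss. This compensates for imbalance in the labels that are annotated; it does not address groups that are missing from the annotations altogether, which is where supplementing with other datasets comes in.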
Is there a version with segmentation masks?
Yes, the CelebAMask-HQ version includes high-quality segmentation annotations for training models on fine-grained facial segmentation tasks.