CelebA

CelebA (CelebFaces Attributes Dataset) is a widely used computer vision dataset of human faces. Its rich annotations make it a standard resource for facial recognition, image generation, and facial attribute analysis.

Size

Over 200,000 face images in JPEG format, annotations in TXT files

License

Free for academic use, subject to the conditions of the CelebA license

Description


The CelebA dataset includes:

  • 202,599 JPEG images of celebrity faces
  • 40 annotated attributes per image
  • 5 landmarks per face for facial alignment
  • Binary segmentation masks in the CelebAMask-HQ version
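
The attribute annotations listed above ship as plain TXT files, so they can be read without any special tooling. The sketch below assumes the layout commonly seen in the distributed attribute file (a first line with the image count, a second line with the attribute names, then one row per image with values of 1 or -1). The function names and the two-attribute sample are illustrative only and are not part of the dataset.

```python
from typing import Dict, List, Tuple


def parse_celeba_attributes(text: str) -> Tuple[List[str], Dict[str, Dict[str, bool]]]:
    """Parse attribute annotations from text in the assumed CelebA layout.

    Assumed layout (verify against your copy of the file):
      line 1: number of images
      line 2: attribute names separated by whitespace
      rest:   <filename> followed by one 1 / -1 flag per attribute
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a count line and a header line")

    declared_count = int(lines[0])
    names = lines[1].split()

    records: Dict[str, Dict[str, bool]] = {}
    for row in lines[2:]:
        fields = row.split()
        filename, values = fields[0], fields[1:]
        if len(values) != len(names):
            raise ValueError(f"{filename}: expected {len(names)} flags, got {len(values)}")
        # 1 means the attribute is present, -1 means it is absent.
        records[filename] = {name: int(v) == 1 for name, v in zip(names, values)}

    if len(records) != declared_count:
        raise ValueError("row count does not match the declared image count")
    return names, records


def images_with(records: Dict[str, Dict[str, bool]], attribute: str) -> List[str]:
    """Return the sorted filenames whose given attribute is present."""
    return sorted(name for name, flags in records.items() if flags[attribute])


# Toy sample with two attributes; the real file has 40 per image.
SAMPLE = """2
Eyeglasses Smiling
000001.jpg -1  1
000002.jpg  1 -1
"""

names, records = parse_celeba_attributes(SAMPLE)
print(images_with(records, "Smiling"))  # → ['000001.jpg']
```

Converting the 1 / -1 flags to booleans up front keeps later filtering code simple; for the real file, pass the file's contents read as a string.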

CelebA is valued for the variety of faces it contains, in terms of traits, ages, and accessories, which makes it a common choice for training models that must generalize across differences in appearance.

What is this dataset for?


CelebA is commonly used for:

  • Training facial recognition models
  • Analysis and classification of facial attributes
  • Training GANs (Generative Adversarial Networks) to generate synthetic face images
  • Evaluating models that detect or modify attributes (adding a smile, removing glasses, etc.)

Can it be enriched or improved?


Yes, CelebA can be improved in a number of ways:

  • By adding new attributes specific to certain populations or cultural expressions
  • By combining with other face datasets to improve demographic diversity
  • By refining segmentation masks for more precise processing tasks
  • By integrating CelebA into multimodal pipelines (voice + image, text + image) for wider applications

🔗 Source: CelebA Dataset

Frequently Asked Questions

Can I use CelebA to test face generation models?

Yes. CelebA is a common reference for training and testing GANs, owing to the quality and variety of its face images.

How can the biases present in this dataset be managed?

CelebA has been criticized for an unbalanced representation of certain ethnic origins and genders. To limit bias, supplement it with other, more representative datasets, or adjust the sample weights during training so that under-represented groups count for more.
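
As an illustration of the weight adjustment mentioned above, the sketch below computes inverse-frequency weights, one common heuristic in which each group's weight is the total sample count divided by the number of groups times that group's count. The group labels and the function name are hypothetical; CelebA itself does not provide demographic group labels in this form, so how groups are defined is left to the practitioner.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def inverse_frequency_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each group by n_total / (n_groups * group_count).

    Rare groups get weights above 1, frequent groups below 1, and the
    per-sample weights sum to the number of samples.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n_total = len(labels)
    n_groups = len(counts)
    return {group: n_total / (n_groups * count) for group, count in counts.items()}


# Hypothetical group labels for four training images (illustrative only).
labels: List[str] = ["group_a", "group_a", "group_a", "group_b"]

weights = inverse_frequency_weights(labels)
sample_weights = [weights[label] for label in labels]

print(weights["group_b"])  # → 2.0
```

The resulting per-sample weights can then be passed to a weighted loss or a weighted sampler in the training framework of your choice. Reweighting only compensates for imbalance among the groups you label; it does not add faces that the dataset lacks.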

Is there a version with segmentation masks?

Yes, the CelebAMask-HQ version includes high-quality segmentation annotations for training models on fine-grained facial segmentation tasks.
