LAION Art EN Improved Captions
LAION Art EN Improved Captions is a dataset of artistic images paired with improved English captions generated by a state-of-the-art model. It is designed to strengthen the semantic relationship between images and text in image generation tasks.
Description
LAION Art EN Improved Captions contains over 2.6 million English image-caption pairs, with captions generated and refined by an advanced model (Salesforce/blip2-flan-t5-XXL). The dataset makes it easier to fine-tune text-to-image models and to build prompt databases.
What is this dataset for?
- Fine-tuning text-to-image generators (e.g., Stable Diffusion)
- Building searchable prompt databases for image generation
- Improving the semantic alignment between images and their descriptions
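For the fine-tuning use case, captions usually have to be cleaned and written in the layout a training script expects. The sketch below is only an illustration under stated assumptions: the record keys `file_name` and `caption`, the sample records, and the `min_words` threshold are all hypothetical, and the output follows the `metadata.jsonl` convention (a `file_name` field plus a `text` caption column) used by Hugging Face image-folder datasets. Check the actual column names of the dataset before adapting it.

```python
import json


def build_metadata_lines(records, min_words=3):
    """Turn caption records into metadata.jsonl lines.

    Records with a missing file name or a very short caption are skipped,
    and repeated (file_name, caption) pairs are emitted only once.
    """
    seen = set()
    lines = []
    for rec in records:
        file_name = rec.get("file_name", "").strip()
        # Collapse runs of whitespace so near-identical captions compare equal.
        caption = " ".join(rec.get("caption", "").split())
        if not file_name or len(caption.split()) < min_words:
            continue
        key = (file_name, caption)
        if key in seen:
            continue
        seen.add(key)
        lines.append(json.dumps({"file_name": file_name, "text": caption}))
    return lines


# Hypothetical sample records; the real dataset's column names may differ.
sample = [
    {"file_name": "0001.jpg", "caption": "a  painting of a woman in a red dress"},
    {"file_name": "0002.jpg", "caption": "art"},
    {"file_name": "0001.jpg", "caption": "a painting of a woman in a red dress"},
]
metadata_lines = build_metadata_lines(sample)
```

Each returned string is one line of a `metadata.jsonl` file placed next to the downloaded images. The filtering rules are deliberately simple and would likely need tuning for a real training run.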
Can it be enriched or improved?
The dataset can be enriched by adding captions in other languages or by manually correcting descriptions for specific cases. Indexing the captions with a similarity-search library (e.g. Faiss) improves search over the prompt database.
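To make the idea of a searchable prompt database concrete, here is a minimal, dependency-free sketch of similarity search over captions. It is not how the dataset authors index their data: the bag-of-words `embed` function is a toy stand-in for a real text encoder, and the captions are invented examples. With real embedding vectors, a Faiss flat inner-product index (`faiss.IndexFlatIP`) performs the same kind of exhaustive scoring far more efficiently.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: L2-normalized bag-of-words counts (stand-in for a real encoder)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def search(query, captions, vectors, k=2):
    """Return the k captions with the highest cosine similarity to the query."""
    q = embed(query)
    scored = []
    for idx, vec in enumerate(vectors):
        # Vectors are unit length, so the dot product is the cosine similarity.
        score = sum(w * vec.get(tok, 0.0) for tok, w in q.items())
        scored.append((score, idx))
    # Highest score first; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(captions[i], round(s, 3)) for s, i in scored[:k]]


# Invented example captions, not taken from the dataset.
captions = [
    "an oil painting of a ship in a storm",
    "a watercolor portrait of a young woman",
    "a bronze sculpture of a horse",
]
vectors = [embed(c) for c in captions]
results = search("painting of a ship", captions, vectors, k=2)
```

The exhaustive loop is fine for a handful of captions. At the scale of millions of pairs, precomputing embeddings once and storing them in a dedicated index is the more practical design.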
🔎 In summary
🧠 Recommended for
- Text-to-image model developers
- Researchers in vision and multimodal NLP
- Prompt database creators
🔧 Compatible tools
- Hugging Face Datasets
- Faiss
- PyTorch
- TensorFlow
- Stable Diffusion
💡 Tip
Index the captions with Faiss to make prompt search over this dataset efficient.
Frequently Asked Questions
What is the size of the LAION Art EN Improved Captions dataset?
Approximately 2.68 million image-caption pairs in English, totaling 442 MB of data.
Can this dataset be used for commercial projects?
Yes, the CC-BY 4.0 license allows commercial use, provided attribution is given.
Is this dataset suitable for fine-tuning text-to-image models like Stable Diffusion?
Yes, it was designed precisely to improve the quality of text-to-image generators.




