Reddit Memes Dataset

A dataset of over 3,300 Reddit meme images, with image URLs, upvote and downvote counts, and other metadata. Collected for computer vision projects and popularity analysis.

Size

3,327 image files (image URLs + associated JSON metadata)

Licence

CC0: Public Domain

Description

The Reddit Memes Dataset contains 3,327 meme images from Reddit, along with metadata such as post ID, upvote and downvote counts, and other relevant fields. It is a good starting point for computer vision projects that analyze humorous and viral content.

What is this dataset for?

Training computer vision models for the classification of humorous images
Analyzing the popularity and engagement score of social media memes
Developing systems for recommending or moderating visual content
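
As a rough illustration of the popularity analysis use case, the sketch below derives a net score, an upvote ratio, and a binary "popular" label from vote counts. The field names (`id`, `url`, `ups`, `downs`), the sample records, and the median threshold are all assumptions made for this example, not the dataset's documented schema.

```python
import json
from statistics import median

# Hypothetical records: the field names "id", "url", "ups" and "downs"
# are assumptions, not the dataset's documented schema.
SAMPLE_JSON = """
[
  {"id": "a1", "url": "https://example.com/a1.jpg", "ups": 900, "downs": 100},
  {"id": "b2", "url": "https://example.com/b2.jpg", "ups": 40, "downs": 10},
  {"id": "c3", "url": "https://example.com/c3.jpg", "ups": 5, "downs": 5},
  {"id": "d4", "url": "https://example.com/d4.jpg", "ups": 0, "downs": 0}
]
"""


def upvote_ratio(record):
    """Return ups / (ups + downs), or None when the post has no votes."""
    total = record["ups"] + record["downs"]
    return record["ups"] / total if total else None


def label_popularity(records):
    """Attach a net score, an upvote ratio, and a binary 'popular' label.

    A post counts as popular when its net score is above the median score.
    """
    scores = [r["ups"] - r["downs"] for r in records]
    threshold = median(scores)
    return [
        {**r, "score": s, "ratio": upvote_ratio(r), "popular": s > threshold}
        for r, s in zip(records, scores)
    ]


records = json.loads(SAMPLE_JSON)
labeled = label_popularity(records)
for row in labeled:
    print(row["id"], row["score"], row["ratio"], row["popular"])
```

The resulting `popular` flag could serve as a target for an image classifier, although a median split is only one possible choice; quantile bins or a regression on the raw score may suit other projects better.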

Can it be enriched or improved?

Yes. You can add manual annotations describing each meme's content, such as humor category, meme type, or cultural context. You can also integrate text extracted from the images via OCR for multimodal analysis.
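
One possible way to organize such an enrichment is to keep annotations and OCR output in separate lookups keyed by post ID, then merge them into the metadata records. This is a minimal sketch: the `enrich` helper, the field names (`id`, `category`, `meme_type`, `text`), and the sample values are hypothetical, and the OCR text is assumed to have been extracted beforehand with whatever tool you prefer.

```python
# Hypothetical inputs: every field name and value below is illustrative.
records = [
    {"id": "a1", "ups": 900, "downs": 100},
    {"id": "b2", "ups": 40, "downs": 10},
]
# Manual annotations keyed by post ID (for example, exported from a labeling tool).
annotations = {"a1": {"category": "wordplay", "meme_type": "image macro"}}
# Text previously extracted from each image by an OCR tool of your choice.
ocr_text = {"a1": "ONE DOES NOT SIMPLY", "b2": ""}


def enrich(records, annotations, ocr_text):
    """Merge manual labels and OCR text into metadata records by post ID.

    Returns new dictionaries and leaves the input records untouched.
    Posts without an annotation get None for the label fields.
    """
    enriched = []
    for record in records:
        post_id = record["id"]
        extra = annotations.get(post_id, {})
        enriched.append({
            **record,
            "category": extra.get("category"),
            "meme_type": extra.get("meme_type"),
            "text": ocr_text.get(post_id, ""),
        })
    return enriched


enriched = enrich(records, annotations, ocr_text)
for row in enriched:
    print(row)
```

Keeping the annotations separate from the original metadata makes it easy to re-run the merge when labels are revised, and the combined `text` plus image fields give a multimodal model both inputs in one record.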

🔎 In summary

Criterion	Evaluation
🧩 Ease of use	⭐⭐⭐⭐✩ (Images accessible via URL, easy to integrate)
🧼 Need for cleaning	⭐⭐⭐⭐⭐ (Low: structured metadata)
🏷️ Annotation richness	⭐⭐✩✩✩ (Basic: engagement metadata only)
📜 Commercial license	✅ Yes (CC0 Public Domain)
👨‍💻 Beginner friendly	🌟 Yes, perfect for introductory computer vision projects
🔁 Fine-tuning ready	🎯 Suitable for image classification and scoring
🌍 Cultural diversity	⚠️ Primarily English-speaking Internet culture