MM-IMDb (Multimodal IMDb Dataset)
MM-IMDb (Multimodal IMDb) is a multimodal dataset that combines textual information (movie plot summaries), images (movie posters), and genre labels. It is designed for training and evaluating models that handle several modalities in parallel, for classification, recommendation, or generation tasks.
Over 25,000 movies, each with textual metadata, a poster image, and multi-label genre annotations
Free to use for academic research, under the MIT license
Description
For each movie, the dataset includes:
- A textual summary (IMDb synopsis)
- A poster image (JPEG)
- A list of genre labels, drawn from 23 possible genres (drama, action, comedy, etc.)
- Metadata: title, release date, runtime, etc.
The dataset is structured to be used in multimodal approaches (text + image), with standardized splits for training, validation, and testing.
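As a minimal loading sketch, here is how such standardized splits might be consumed. The file names and directory layout below are assumptions for illustration, not the dataset's actual structure; adapt the paths to the distribution you download.

```python
import json
from pathlib import Path

# Assumed layout (illustrative only, not the official file structure):
#   data/split.json          -> {"train": [ids], "dev": [ids], "test": [ids]}
#   data/metadata/<id>.json  -> {"plot": "...", "genres": ["Drama", ...]}
#   data/posters/<id>.jpeg   -> the movie poster
DATA_DIR = Path("data")

def load_split(name):
    """Yield (plot_text, poster_path, genres) for one standardized split."""
    split_ids = json.loads((DATA_DIR / "split.json").read_text())[name]
    for movie_id in split_ids:
        meta = json.loads((DATA_DIR / "metadata" / f"{movie_id}.json").read_text())
        yield meta["plot"], DATA_DIR / "posters" / f"{movie_id}.jpeg", meta["genres"]

train_examples = list(load_split("train"))
```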
What is this dataset for?
MM-IMDb can be used for:
- Training multimodal classification models (poster + synopsis → genres), as in the fusion sketch after this list
- Developing film recommendation systems
- Fusing text and image representations (multi-embedding)
- Analyzing the respective contributions of text and image to classification
- Validating architectures such as CLIP, ViLT, or multimodal BERT
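As a minimal sketch of the classification and fusion use cases above, here is a late-fusion multi-label classifier in PyTorch. The embedding dimensions and hidden size are arbitrary assumptions, and the text/image encoders that would produce the embeddings are left out:

```python
import torch
import torch.nn as nn

NUM_GENRES = 23  # MM-IMDb defines 23 genre labels

class LateFusionClassifier(nn.Module):
    """Concatenate precomputed text and image embeddings, then predict genres.

    The dimensions are assumptions: any text encoder and image encoder
    producing fixed-size vectors would work here.
    """
    def __init__(self, text_dim=768, image_dim=512, hidden_dim=512):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, NUM_GENRES),  # one logit per genre
        )

    def forward(self, text_emb, image_emb):
        return self.head(torch.cat([text_emb, image_emb], dim=-1))

model = LateFusionClassifier()
text_emb, image_emb = torch.randn(4, 768), torch.randn(4, 512)  # dummy batch
logits = model(text_emb, image_emb)
# Multi-label targets: each genre is an independent binary decision.
targets = torch.randint(0, 2, (4, NUM_GENRES)).float()
loss = nn.BCEWithLogitsLoss()(logits, targets)
```

Using a sigmoid-based loss rather than softmax matters here: a movie can belong to several genres at once, so the 23 outputs must be scored independently.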
Can it be enriched or improved?
Yes:
- Add information about the cast, awards, or reviews
- Supplement the posters with scene captures (frames)
- Introduce audio features for tri-modal analysis
- Improve labels via crowdsourcing or more recent re-labeling models
🔗 Source: MM-IMDb Dataset on GitHub
Frequently Asked Questions
Can the dataset be used to test CLIP or BLIP?
Yes, it is an excellent benchmark for testing vision-language models on classification or semantic-alignment tasks.
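For example, here is a minimal zero-shot sketch with Hugging Face transformers. The checkpoint name, prompt template, and poster path are illustrative choices, not part of MM-IMDb:

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

genres = ["drama", "action", "comedy"]  # subset of the 23 MM-IMDb genres
prompts = [f"a poster of a {g} movie" for g in genres]
image = Image.open("poster.jpeg")  # hypothetical poster path

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-text similarity scores

# Genres are multi-label, so rank or threshold similarities instead of softmax.
scores = logits.squeeze(0)
print(sorted(zip(genres, scores.tolist()), key=lambda x: -x[1]))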
Are the images of consistent quality?
The posters are automatically extracted from IMDb. Their quality varies, but they are generally clean and usable.
Is the dataset multilingual?
No. Synopses are in English only.