MM-IMDb (Multimodal IMDb Dataset)

MM-IMDb (Multimodal IMDb) is a multimodal dataset that combines text (movie summaries), images (movie posters), and genre labels. It is designed for training and evaluating models that process several modalities jointly, for classification, recommendation, or generation tasks.

Size

Over 25,000 movies, each with textual metadata, a poster image, and multi-label genre annotations

Licence

Free to use for academic research, under the MIT license

Description


For each movie, the dataset includes:

  • A textual summary (IMDb synopsis)
  • A poster image (JPEG)
  • A list of genres drawn from 23 possible labels (drama, action, comedy, etc.)
  • Metadata: title, date, duration, etc.

The dataset is structured to be used in multimodal approaches (text + image), with standardized splits for training, validation, and testing.
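To make the per-movie structure concrete, here is a minimal sketch of how one entry and its multi-label target might be represented in memory. The `MovieRecord` class, its field names, the file path, and the five-genre vocabulary are illustrative assumptions, not the dataset's actual schema or file layout.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative subset only; the real dataset defines 23 genre labels.
GENRES = ["Drama", "Comedy", "Action", "Romance", "Thriller"]


@dataclass
class MovieRecord:
    """Hypothetical in-memory view of one entry (not the official schema)."""
    title: str
    synopsis: str
    poster_path: str
    genres: List[str] = field(default_factory=list)


def multi_hot(genres, vocabulary=GENRES):
    """Encode genre names as a 0/1 vector aligned with `vocabulary`."""
    present = set(genres)
    return [1 if g in present else 0 for g in vocabulary]


record = MovieRecord(
    title="Example Movie",                      # made-up sample values
    synopsis="A detective falls in love while chasing a thief.",
    poster_path="posters/example.jpeg",         # assumed path, for illustration
    genres=["Romance", "Thriller"],
)
target = multi_hot(record.genres)
```

Because a movie can carry several genres at once, the target is a multi-hot vector rather than a single class index.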

What is this dataset for?


MM-IMDb can be used for:

  • Training multimodal classification models (poster + synopsis → genres)
  • Developing film recommendation systems
  • Fusing text and image representations (multi-embedding)
  • Analyzing the respective contributions of text and image to classification
  • Validating architectures such as CLIP, ViLT, or multimodal BERT
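The poster + synopsis → genres setup can be sketched as a simple concatenation fusion followed by one sigmoid output per genre. This is only one possible fusion strategy, shown under strong assumptions: the embeddings, weights, and two-genre vocabulary below are hand-picked toy values, whereas a real pipeline would obtain embeddings from trained text and image encoders and learn the weights.

```python
import math


def concat_fusion(text_emb, image_emb):
    """Fuse two modality embeddings by simple concatenation."""
    return list(text_emb) + list(image_emb)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_genres(fused, weights, biases, genres, threshold=0.5):
    """Apply one linear unit + sigmoid per genre and keep those above threshold."""
    scores = {}
    for genre, w, b in zip(genres, weights, biases):
        logit = sum(wi * xi for wi, xi in zip(w, fused)) + b
        scores[genre] = sigmoid(logit)
    predicted = [g for g in genres if scores[g] >= threshold]
    return predicted, scores


# Toy values, chosen by hand for illustration only.
text_emb = [1.0, 0.0]
image_emb = [0.0, 1.0]
genres = ["Drama", "Comedy"]
weights = [[2.0, 0.0, 0.0, 1.0], [-1.0, 0.0, 0.0, -2.0]]
biases = [0.0, 0.0]

fused = concat_fusion(text_emb, image_emb)
predicted, scores = predict_genres(fused, weights, biases, genres)

# Crude modality ablation: zero out the image half and compare scores.
text_only = concat_fusion(text_emb, [0.0] * len(image_emb))
_, text_only_scores = predict_genres(text_only, weights, biases, genres)
```

Comparing `scores` with `text_only_scores` gives a rough picture of how much the image contributes to each genre prediction; each genre is thresholded independently because the task is multi-label.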

Can it be enriched or improved?


Yes:

  • Add information about the cast, awards, or reviews
  • Supplement posters with scene captures (frames)
  • Introduce audio features for tri-modal analysis
  • Improve labels via crowdsourcing or more recent re-labeling models

🔗 Source: MM-IMDb Dataset on GitHub

Frequently Asked Questions

Can the dataset be used to test CLIP or BLIP?

Yes. It works well as a benchmark for testing vision-language models on genre classification or text-image semantic alignment.
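As a rough illustration of the semantic alignment idea, models in the CLIP family embed images and text in a shared space and compare them by cosine similarity. The sketch below ranks genre prompts against a poster embedding; the three-dimensional vectors are toy stand-ins, not real model outputs, and `rank_genres` is a hypothetical helper rather than part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_genres(poster_emb, genre_text_embs):
    """Sort genres by similarity between the poster and each genre prompt."""
    scored = [(g, cosine(poster_emb, emb)) for g, emb in genre_text_embs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings; a real test would use an image encoder for the poster
# and a text encoder for prompts such as "a poster of an action movie".
poster_emb = [0.9, 0.1, 0.0]
genre_text_embs = {
    "Action": [1.0, 0.0, 0.0],
    "Comedy": [0.0, 1.0, 0.0],
    "Horror": [0.0, 0.0, 1.0],
}
ranking = rank_genres(poster_emb, genre_text_embs)
```

Since MM-IMDb is multi-label, a zero-shot evaluation would typically keep every genre above a similarity threshold, or the top few, rather than only the single best match.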

Are the images of consistent quality?

The posters are automatically extracted from IMDb. Quality varies from one poster to another, but most are clean and usable.

Is the dataset multilingual?

No. Synopses are in English only.
