UCF101
UCF101 is a freely available reference dataset for video analysis. It contains 13,320 clips of human actions such as running, jumping, cooking, or playing sports, and it is one of the most widely used benchmarks for training and evaluating action recognition models.
13,320 videos classified into 101 categories of human actions, AVI format
Free for academic use, licensed under Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)
Description
The dataset contains:
- 13,320 short videos (about 7 seconds on average)
- 101 action classes (sports, daily actions, social interactions...)
- Videos from YouTube, with a realistic, unfiltered background
- 25 groups per class (clips in the same group may share features such as background or viewpoint), used to define standardized training/testing splits
- Video data in AVI format, 320×240 pixels at 25 fps
Each video shows a single main action, making the supervised classification task easier.
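As an illustration of how the group structure is used, the sketch below parses the class and group from a clip filename and keeps whole groups on one side of a split, so related clips do not leak between training and testing. It relies on the dataset's `v_<Class>_g<group>_c<clip>.avi` naming pattern. The helper names and the choice of held-out groups are hypothetical and do not reproduce the official split lists.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List, NamedTuple, Set, Tuple

# UCF101 clip names look like v_ApplyEyeMakeup_g01_c01.avi
CLIP_NAME = re.compile(r"^v_(?P<label>[A-Za-z0-9]+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi$")


class ClipInfo(NamedTuple):
    label: str
    group: int
    clip: int


def parse_clip_name(path: str) -> ClipInfo:
    """Extract class label, group id and clip id from a UCF101 file path."""
    name = PurePosixPath(path).name  # drop any leading directory
    match = CLIP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a UCF101-style clip name: {name!r}")
    return ClipInfo(match["label"], int(match["group"]), int(match["clip"]))


def split_by_group(paths: Iterable[str], test_groups: Set[int]) -> Tuple[List[str], List[str]]:
    """Send every clip of a held-out group to the test set (hypothetical helper)."""
    train, test = [], []
    for path in paths:
        target = test if parse_clip_name(path).group in test_groups else train
        target.append(path)
    return train, test
```

For benchmark comparisons, the official split lists distributed with the dataset should be preferred over an ad hoc choice of groups.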
What is this dataset for?
UCF101 is used for:
- Training human action recognition models (3D CNNs, RNNs, video transformers)
- Validation of embedded vision systems (robots, security cameras, etc.)
- Pre-training video models that are then used to detect events
- Research on spatiotemporal processing architectures (SlowFast, TimeSformer, VideoMAE)
- Behavior analysis in consumer or surveillance settings
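Most of the model families listed above take a fixed number of frames per clip, while a UCF101 video of about 7 seconds at 25 fps has roughly 175 frames. A common preprocessing step is therefore to subsample frame indices evenly across the video. The function below is a minimal sketch of that idea. The function name and the clip length of 16 are illustrative assumptions, not something prescribed by the dataset.

```python
from typing import List


def sample_frame_indices(num_frames: int, clip_len: int) -> List[int]:
    """Pick `clip_len` frame indices spread evenly over `num_frames` frames.

    The video is cut into `clip_len` equal segments and the frame nearest
    the centre of each segment is taken. Videos shorter than `clip_len`
    end up with repeated indices.
    """
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    segment = num_frames / clip_len
    return [min(num_frames - 1, int((i + 0.5) * segment)) for i in range(clip_len)]


# Example: a 175-frame video (about 7 s at 25 fps) reduced to 16 frames.
indices = sample_frame_indices(175, 16)
print(indices)
```

Taking segment centres rather than random offsets gives a deterministic clip, which is convenient for evaluation. Training pipelines often add random jitter inside each segment instead.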
Can it be enriched or improved?
Yes, in particular via:
- The addition of finer annotations (multiple actions per clip, exact start and end times)
- Conversion to HDF5 or TFRecord to speed up ingestion
- Training temporal segmentation or multi-label detection models
- Cross-referencing audio or text data for multimodal approaches
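Before converting the videos to an array-based container such as HDF5 or TFRecord, it helps to estimate how large the decoded data would be. The sketch below does the arithmetic from the figures given in the description (13,320 videos, about 7 seconds each, 25 fps, 320×240 pixels). It assumes uncompressed 8-bit RGB frames, which is only one possible storage choice, and the function name is hypothetical.

```python
NUM_VIDEOS = 13_320
AVG_SECONDS = 7          # "about 7 seconds on average", rounded
FPS = 25
WIDTH, HEIGHT = 320, 240


def estimate_raw_bytes(num_videos: int, avg_seconds: int, fps: int,
                       width: int, height: int,
                       channels: int = 3, bytes_per_value: int = 1) -> int:
    """Rough size of all decoded frames, assuming uncompressed pixel storage."""
    total_frames = num_videos * avg_seconds * fps
    bytes_per_frame = width * height * channels * bytes_per_value
    return total_frames * bytes_per_frame


total = estimate_raw_bytes(NUM_VIDEOS, AVG_SECONDS, FPS, WIDTH, HEIGHT)
print(f"{total / 1e9:.0f} GB of uncompressed RGB frames")  # prints "537 GB ..."
```

At roughly half a terabyte uncompressed, a converted copy is usually more practical if it stores encoded frames or only a subsampled set of frames per clip.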
🔗 Source: UCF101 Dataset (official)
Frequently Asked Questions
Does UCF101 contain sound?
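Only partly. According to the dataset paper, audio is preserved for 51 of the 101 classes, so the audio track cannot be relied on across the whole dataset. Combining with other datasets like Kinetics is recommended if you need an audio component.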
Is the dataset suitable for real-time detection?
Partially. The videos are short and trimmed to the action, which suits classification. Real-time detection works on long, untrimmed streams, so it calls for adaptations or a dataset of untrimmed videos such as ActivityNet.
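One possible adaptation is to run a clip classifier over a sliding window of the incoming stream. The sketch below shows only the buffering logic. `classify_clip` stands in for any model trained on UCF101-style clips, and the window and stride values are arbitrary assumptions.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Label = TypeVar("Label")


def sliding_window_predictions(
    frames: Iterable[Frame],
    classify_clip: Callable[[List[Frame]], Label],  # hypothetical clip-level model
    window: int = 16,
    stride: int = 8,
) -> Iterator[Tuple[int, Label]]:
    """Yield (index of last frame, prediction) for each window of the stream."""
    buffer: Deque[Frame] = deque(maxlen=window)  # old frames drop out automatically
    for index, frame in enumerate(frames):
        buffer.append(frame)
        # Predict once the buffer is full, then again every `stride` frames.
        if len(buffer) == window and (index + 1 - window) % stride == 0:
            yield index, classify_clip(list(buffer))
```

A smaller stride lowers latency at the cost of more model calls. Predictions from overlapping windows are often smoothed before being turned into detections.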
Is there a newer or extended version?
Not an official one, but there are alternatives. The HMDB51 dataset is a separate benchmark that is more difficult (fewer examples, more noise), and Kinetics-600/700 offer a larger volume for similar tasks.