
OpenMathReasoning

A comprehensive corpus for advanced mathematical resolution, combining reasoning chains, generation selection, and integrated inference tools.

Size

3.2M CoT solutions, 1.7M TIR solutions, 566K GenSelect examples, and 193K problem statements without solutions; text data structured as JSON

Licence

CC-BY 4.0

Description

OpenMathReasoning is a large-scale mathematical reasoning dataset designed to train language models to solve complex problems sourced from the AoPS forums. It includes more than 306,000 unique problem statements, with several million solutions generated using three strategies: chain-of-thought reasoning (CoT), tool-integrated reasoning (TIR), and automatic selection of the best answer among candidates (GenSelect). The dataset is structured, validated, and accompanied by rich metadata (generator model, success rate, etc.).

What is this dataset for?

  • Train efficient mathematical reasoning models capable of solving olympiad-level problems
  • Test various approaches: CoT, TIR, majority vote, etc.
  • Optimize the training of LLMs specialized in STEM or educational applications
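
As an illustration of the majority-vote approach mentioned above, here is a minimal sketch. It assumes the final answers have already been extracted from several sampled solutions to the same problem; the function name and the sample answers are hypothetical and not part of the dataset.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent non-empty answer, or None if there is none."""
    # Strip whitespace so that "42" and "42 " count as the same answer.
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    # most_common(1) returns [(answer, count)]; on a tie, the answer
    # encountered first is returned.
    return Counter(cleaned).most_common(1)[0][0]


# Hypothetical final answers from five sampled solutions to one problem.
sampled = ["42", "41", "42 ", "42", "7"]
print(majority_vote(sampled))
```

Real pipelines usually normalize answers more carefully (for example, treating equivalent fractions as equal) before counting votes.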

Can it be enriched or improved?

Yes. You can add human annotations to the generated responses, integrate other mathematical corpora (e.g. MATH, miniF2F), or organize the problems by topic or difficulty level. The dataset can also serve as a basis for new benchmarks, or for training models in other languages once it has been suitably translated.

🔎 In summary

Criterion: Evaluation
🧩Ease of Use: ⭐⭐⭐☆☆ (Rich data but technical to handle)
🧼Need for Cleaning: ⭐⭐⭐⭐☆ (Low – High quality, well formatted)
🏷️Annotation Richness: ⭐⭐⭐⭐⭐ (Exceptional: CoT, TIR, selection, success rate)
📜Commercial License: ✅ Yes (CC-BY 4.0)
👨‍💻Beginner-Friendly: ❌ Not really – High mathematical complexity
🔁Reusable for Fine-Tuning: 🔥 Excellent for SFT, RLHF, distillation
🌍Cultural Diversity: ⚠️ Low – Problems drawn from a single English-speaking corpus

🧠 Recommended for

  • Mathematical AI researchers
  • Developers of STEM-focused LLMs
  • Educational AI competitions

🔧 Compatible tools

  • PyTorch
  • Hugging Face
  • DeepSpeed
  • Transformers, vLLM

💡 Tip

Filter problems by difficulty or success rate to better tailor the training to the ability of the model.
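
A minimal sketch of such a filter is shown below. It assumes each record is a dictionary carrying a success rate under a field called `pass_rate`; that field name, the `n/a` placeholder, and the sample records are hypothetical, so check the actual schema before reusing the code.

```python
def parse_pass_rate(value):
    """Convert a raw pass-rate value to a float, or None if it is missing."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Covers None and non-numeric placeholders such as "n/a".
        return None


def filter_by_pass_rate(records, low, high, field="pass_rate"):
    """Keep records whose pass rate lies in [low, high]; skip records without one."""
    kept = []
    for record in records:
        rate = parse_pass_rate(record.get(field))
        if rate is not None and low <= rate <= high:
            kept.append(record)
    return kept


# Hypothetical records; the field name "pass_rate" is an assumption.
records = [
    {"problem": "P1", "pass_rate": "0.95"},
    {"problem": "P2", "pass_rate": "0.40"},
    {"problem": "P3", "pass_rate": "n/a"},
    {"problem": "P4", "pass_rate": 0.10},
]
medium = filter_by_pass_rate(records, 0.2, 0.8)
print([r["problem"] for r in medium])
```

Choosing a middle band like this drops both the problems the generator almost always solves and those it almost never solves; the thresholds are arbitrary and should be tuned to the model being trained.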

Frequently Asked Questions

Does the dataset cover all types of math problems?

It covers a wide variety, but the problems come mostly from the AoPS forums. They are typically competition-style and call for advanced reasoning.

Can we filter the data according to the type of reasoning used?

Yes, each example indicates the mode of inference: CoT (chain of thought), TIR (with tools) or GenSelect (response selection).
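
For example, records could be grouped by mode along the following lines. This is a sketch that assumes the mode is stored as a string under a field named `inference_mode`; the field name, the lowercase labels, and the sample records are assumptions made for illustration.

```python
from collections import defaultdict


def split_by_mode(records, field="inference_mode"):
    """Group records by their inference mode (e.g. cot, tir, genselect)."""
    groups = defaultdict(list)
    for record in records:
        # Records without the field fall into an "unknown" bucket.
        mode = str(record.get(field, "unknown")).lower()
        groups[mode].append(record)
    return dict(groups)


# Hypothetical records; the field name and labels are assumptions.
records = [
    {"problem": "P1", "inference_mode": "cot"},
    {"problem": "P2", "inference_mode": "tir"},
    {"problem": "P3", "inference_mode": "cot"},
    {"problem": "P4", "inference_mode": "genselect"},
]
by_mode = split_by_mode(records)
print({mode: len(items) for mode, items in by_mode.items()})
```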

Is it suitable for fine-tuning without high-end GPUs?

It is best exploited with powerful hardware, but some subsets can be used on more modest setups with quantization or LoRA.
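
To give a rough sense of why LoRA helps on limited hardware, the sketch below compares the number of trainable parameters for a single weight matrix under full fine-tuning and under LoRA, which trains two low-rank factors in its place. The hidden size and rank are hypothetical values chosen for illustration, not settings recommended for this dataset.

```python
def full_params(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
hidden = 4096
rank = 16

full = full_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"ratio: {lora / full:.4%}")
```

This count covers trainable weights only; activations and the frozen base model still occupy memory, which is where quantization comes in.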
