Clothing Fit Dataset for Size Recommendation
Enriched customer feedback dataset for predicting whether a size is right, too small, or too big. Includes ratings, reviews, measurements, categories.
82,790 entries in JSON format (40 MB), structured customer-product data with text
CC BY 4.0
Description
Clothing Fit Dataset for Size Recommendation brings together more than 82,000 customer reviews of clothing from two major e-commerce platforms. Each entry combines ratings, text comments, customer and product measurements, and feedback on the fit (too small, perfect, too big). This corpus can be used to train models that improve the customer experience in online fashion.
What is this dataset for?
- Develop a customized recommendation system for e-commerce sites
- Build a “fit” classification model based on review text
- Create automatic summarization or sentiment analysis models based on reviews
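As a rough illustration of the “fit” classification use case, the sketch below trains a TF-IDF plus logistic regression baseline with scikit-learn. The review sentences are invented for the example and are not taken from the dataset, and the label strings simply mirror the three fit categories described above; the actual field names and label encoding in the JSON file may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy reviews invented for illustration; not taken from the dataset.
reviews = [
    "way too tight around the waist",
    "runs small, could not zip it up",
    "fits perfectly, true to size",
    "great fit, exactly as expected",
    "too loose and baggy everywhere",
    "runs large, sleeves were far too long",
]
# Assumed label strings mirroring the three fit categories.
labels = ["too small", "too small", "perfect", "perfect", "too big", "too big"]

# Word and bigram TF-IDF features feeding a multiclass logistic regression.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(reviews, labels)

predictions = model.predict(["a bit tight in the shoulders", "baggy and loose"])
print(list(predictions))
```

On the full dataset, a held-out split and per-class metrics would be needed to judge such a baseline, since fit labels are typically imbalanced.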
Can it be enriched or improved?
Yes. This dataset can be cross-referenced with demographic information or product images. Other possible enrichments include sentiment analysis, linguistic normalization of the reviews, and extension to other brands or regions. The JSON format makes the data easy to load and preprocess.
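A minimal loading and preprocessing sketch using only the Python standard library follows. It assumes the file stores one JSON object per line, which is not confirmed here, and the field names (`item_id`, `size`, `fit`, `review_text`) are hypothetical placeholders for whatever keys the real file uses. An in-memory string stands in for the file.

```python
import io
import json

# Two made-up records standing in for the real file.
# Field names are assumptions, not the dataset's documented schema.
sample = io.StringIO(
    '{"item_id": "a1", "size": 8, "fit": "too small", "review_text": " Tight at the waist. "}\n'
    '{"item_id": "b2", "size": 12, "fit": "perfect", "review_text": "True to size."}\n'
    '{"item_id": "c3", "size": 10, "review_text": "No fit feedback given."}\n'
)

# Parse one JSON object per non-empty line.
records = [json.loads(line) for line in sample if line.strip()]

# Simple normalization: trim and lowercase the review text,
# and drop records that carry no fit label.
cleaned = [
    {**r, "review_text": r.get("review_text", "").strip().lower()}
    for r in records
    if r.get("fit")
]
print(len(cleaned), cleaned[0]["review_text"])
```

If the file is instead a single JSON array, `json.load` on the open file would replace the line-by-line parsing.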
🔎 In summary
🧠 Recommended for
- Recommendation projects
- E-commerce startups
- Marketing analysis
🔧 Compatible tools
- Python (pandas, scikit-learn)
- TensorFlow
- Hugging Face
- LightGBM
💡 Tip
Consider grouping similar products together to smooth out the effects of sparsity before training.
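One way to apply this tip is to pool rarely reviewed items into a coarser group before computing statistics. The pandas sketch below falls back to the product category when an item has fewer than a chosen number of reviews. The column names, the toy rows, and the `MIN_REVIEWS` threshold are all assumptions made for illustration.

```python
import pandas as pd

# Toy records invented for illustration; column names are assumptions.
df = pd.DataFrame({
    "item_id": ["a1", "a1", "a1", "b2", "c3", "c3"],
    "category": ["dress", "dress", "dress", "dress", "jeans", "jeans"],
    "fit": ["too small", "too small", "perfect", "perfect", "too big", "perfect"],
})

MIN_REVIEWS = 3  # hypothetical threshold; tune on real data

# Number of reviews per item, aligned with the original rows.
item_counts = df.groupby("item_id")["fit"].transform("size")

# Items with enough reviews keep their own key;
# sparse items are pooled under their category.
df["group_key"] = df["item_id"].where(item_counts >= MIN_REVIEWS, df["category"])

# Share of each fit outcome per group.
fit_share = (
    df.groupby("group_key")["fit"]
    .value_counts(normalize=True)
    .unstack(fill_value=0.0)
)
print(fit_share)
```

The resulting per-group fit shares can serve as smoothed features for items that have too few reviews to estimate on their own.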
Frequently Asked Questions
Does the dataset contain images or only text?
This dataset contains only structured and textual data, without images.
Are the sizes standardized across the dataset?
Yes, the sizes have been converted to a unified numerical scale to facilitate modeling.
Can this dataset be used to create a virtual shopping assistant?
Absolutely, it is well suited to training a recommendation model based on user experience.