Women's E-Commerce Clothing Reviews

This dataset contains customer reviews of apparel products, including free text, ratings, recommendations, customer ages, and other information. It supports work on NLP tasks, opinion classification, and the analysis of purchasing behavior.

Size

23,486 rows in CSV format, text and categorical data

Licence

CC0: Public Domain

Description

The Women's E-Commerce Clothing Reviews dataset contains 23,486 reviews written by customers about clothes purchased online. Each row corresponds to one piece of customer feedback and includes the rating given, the customer's age, a summary, the review text, and a product recommendation indicator. All data is anonymized, with brand references replaced by “retailer”.
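
As a quick orientation, the sketch below parses two made-up rows laid out the way the description suggests. The column names (`Age`, `Title`, `Review Text`, `Rating`, `Recommended IND`) and the sample values are assumptions for illustration; check them against the header of the downloaded CSV before relying on them.

```python
import csv
import io

# Two made-up rows standing in for the real file; the column names are
# assumptions and should be checked against the downloaded CSV header.
SAMPLE_CSV = """Age,Title,Review Text,Rating,Recommended IND
34,Great fit,"Soft fabric, true to size.",5,1
52,,Runs small and the seams itch.,2,0
"""

def load_reviews(handle):
    """Parse review rows, converting numeric fields to int."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "age": int(row["Age"]),
            "title": row["Title"] or None,  # an empty title becomes None
            "text": row["Review Text"],
            "rating": int(row["Rating"]),
            "recommended": row["Recommended IND"] == "1",
        })
    return rows

reviews = load_reviews(io.StringIO(SAMPLE_CSV))
print(len(reviews), reviews[0]["rating"], reviews[1]["recommended"])
```

To read the real file, the same function should accept a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the CSV was saved.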

What is this dataset for?

  • Train models for sentiment analysis or review classification
  • Conduct customer experience studies by age or product category
  • Explore NLP approaches like BERT, TF-IDF, or Word2Vec on real data
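
A study by age, as mentioned in the list above, could start from something like the following sketch. It groups ratings into ten-year age bands and averages them. The function name, the band width, and the sample pairs are all hypothetical; none of them comes from the dataset itself.

```python
from collections import defaultdict

def mean_rating_by_age_band(rows, band_width=10):
    """Group (age, rating) pairs into age bands and average the ratings."""
    totals = defaultdict(lambda: [0, 0])  # band start -> [rating sum, count]
    for age, rating in rows:
        band = (age // band_width) * band_width
        totals[band][0] += rating
        totals[band][1] += 1
    return {band: s / n for band, (s, n) in sorted(totals.items())}

# Made-up (age, rating) pairs for illustration only.
sample = [(23, 5), (27, 4), (34, 3), (38, 5), (61, 2)]
print(mean_rating_by_age_band(sample))
```

The same grouping could be keyed on product category instead of age band if the analysis calls for it.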

Can it be enriched or improved?

Yes. For example, you can join these reviews with external data (price, stock, returns), generate additional labels (positive, neutral, negative) from the text, or translate and adapt the data to other languages for multilingual use. Adding lexical preprocessing can also improve model performance.
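
One simple way to bootstrap three-class labels is sketched below. Note the substitution: it derives the label from the star rating rather than from the review text, as a cheap proxy that can later be checked against a text-based model. The thresholds (1-2 negative, 3 neutral, 4-5 positive) and the assumption of a 1 to 5 rating scale are choices made for this example, not part of the dataset.

```python
def rating_to_label(rating):
    """Map a 1-5 star rating to a coarse sentiment label (assumed thresholds)."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"

labels = [rating_to_label(r) for r in (1, 3, 5)]
print(labels)
```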

🔎 In summary

Criterion | Evaluation
🧩 Ease of use | ⭐⭐⭐⭐☆ (Very accessible, clear tabular format)
🧼 Cleaning required | ⭐⭐☆☆☆ (Low to moderate: standardize texts, remove duplicates)
🏷️ Annotation richness | ⭐⭐⭐⭐☆ (Good variety: rating, recommendation, age, free text)
📜 Commercial license | ✅ Yes (CC0)
👨‍💻 Ideal for beginners | 👩‍💻 Yes, perfect to start with NLP
🔁 Reusable for fine-tuning | 🔥 Yes, for models like BERT or RoBERTa
🌍 Cultural diversity | 🌐 Moderate (geographical origin not specified)
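
The cleaning steps named in the summary (standardizing texts and removing duplicates) might look like the sketch below. The specific normalization rules (Unicode NFKC, lowercasing, whitespace collapsing) are assumptions about what "standardize" should mean here, and the sample reviews are made up.

```python
import re
import unicodedata

def normalize_text(text):
    """Apply Unicode NFKC normalization, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def drop_duplicates(texts):
    """Keep the first occurrence of each non-empty review after normalization."""
    seen = set()
    unique = []
    for text in texts:
        key = normalize_text(text)
        if key and key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

# Made-up reviews: the second is a near-duplicate, the third is empty.
raw = ["Love this dress!", "love  this dress! ", "", "Too tight."]
print(drop_duplicates(raw))
```

Comparing on the normalized key while returning the original text keeps the raw wording available for models that are sensitive to casing or punctuation.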

🧠 Recommended for

  • Marketing analysts
  • NLP specialists
  • Recommendation system developers

🔧 Compatible tools

  • Hugging Face Transformers
  • Scikit-learn
  • spaCy
  • NLTK

💡 Tip

To improve sentiment detection, combine the binary recommendation flag with a semantic analysis of the review text.
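
A minimal sketch of this tip follows. It blends a toy lexicon-based text score with the recommendation flag. The word lists, the function names, and the 0.5 weighting are all hypothetical; in practice the text score would more likely come from a trained classifier.

```python
# Tiny illustrative lexicon; a real project would use a trained model instead.
POSITIVE = {"love", "great", "soft", "perfect", "comfortable"}
NEGATIVE = {"itchy", "small", "cheap", "returned", "tight"}

def text_score(text):
    """Return a lexicon score in [-1, 1]: (pos - neg) / (pos + neg)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

def combined_score(text, recommended, text_weight=0.5):
    """Blend the text score with the recommendation flag mapped to +1 / -1."""
    flag = 1.0 if recommended else -1.0
    return text_weight * text_score(text) + (1 - text_weight) * flag

print(combined_score("Love the soft fabric!", True))
print(combined_score("Great color but too tight.", True))
```

Reviews where the two signals disagree (a recommended item with negative wording, or the reverse) end up near zero, which makes them easy to flag for manual inspection.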

Frequently Asked Questions

Can this dataset be used to train a recommendation model?

Yes. The rating and recommendation variables, together with the product characteristics, make it possible to model recommendation systems.

Does the text of the reviews contain brand names or company names?

No, all mentions have been anonymized and replaced by “retailer”.

Is it suitable for multilingual analysis?

No, the dataset is in English only, but it can be translated or enriched for multilingual analysis.
