Chinese Sentiment Analyze

Chinese dataset combining e-commerce product reviews and social media posts (Weibo), useful for automatic sentiment detection (positive, neutral, negative).

Size

Chinese text data (product reviews and social media posts), JSON/CSV format, 182,762 examples
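
Since the data is distributed as JSON/CSV, a first step is usually to load the rows and check how the labels are distributed. The following is a minimal sketch using only the Python standard library. The column names `text` and `label`, the JSON Lines layout, and the sample rows are all assumptions made for illustration; they are not taken from the dataset, so adjust them to the actual files.

```python
import csv
import io
import json
from collections import Counter

# Illustrative rows only (NOT taken from the dataset). The column names
# "text" and "label" are assumptions about the file layout.
SAMPLE_CSV = "text,label\n这个产品很好,positive\n一般般,neutral\n太差了,negative\n"
SAMPLE_JSONL = (
    '{"text": "这个产品很好", "label": "positive"}\n'
    '{"text": "一般般", "label": "neutral"}\n'
    '{"text": "太差了", "label": "negative"}\n'
)


def read_csv_rows(handle):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def read_jsonl_rows(handle):
    """Read one JSON object per line (assumed JSON Lines layout)."""
    return [json.loads(line) for line in handle if line.strip()]


def label_distribution(rows, label_key="label"):
    """Count how many examples carry each sentiment label."""
    return Counter(row[label_key] for row in rows)


if __name__ == "__main__":
    rows = read_csv_rows(io.StringIO(SAMPLE_CSV))
    print(label_distribution(rows))
```

With real files, replace the `io.StringIO` objects with `open(path, encoding="utf-8", newline="")` so that Chinese characters are decoded correctly.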

Licence

MIT

Description

Chinese Sentiment Analyze is a dataset combining two main sources: product reviews (Shopping Reviews) and posts from the Weibo platform. It is designed for sentiment analysis in Chinese, supporting classification into categories such as positive, neutral, or negative.

What is this dataset for?

Training NLP models for sentiment classification in Mandarin
Developing opinion analysis tools for commercial or social applications
Testing the robustness of multilingual models on everyday Chinese texts
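
To make the first use case concrete, here is a minimal sketch of a classical baseline: a multinomial Naive Bayes classifier over character bigrams, written with the standard library only. Character n-grams are a common way to avoid word segmentation, since Chinese text has no spaces between words. This is a swapped-in illustrative technique, not a method prescribed by the dataset, and the four training sentences are invented for the example rather than drawn from the corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < n:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each bigram.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + vocab_size
            for gram in char_ngrams(text):
                score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented example sentences (NOT from the dataset), for illustration only.
TRAIN_TEXTS = ["质量很好，非常满意", "物流很快，很喜欢", "质量太差，非常失望", "客服态度差，不推荐"]
TRAIN_LABELS = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)

if __name__ == "__main__":
    print(model.predict("质量很好"))
```

A baseline like this gives a reference score to compare against before investing in a fine-tuned transformer model.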

Can it be enriched or improved?

Yes. The corpus can be extended with other opinion domains (politics, movies, public services), and the sentiment labels can be refined (intensity level, specific emotion). A parallel translation or a segmentation by topic would also increase the dataset's linguistic and practical value.
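
One possible shape for such an enriched record is sketched below. Every field beyond the text and the sentiment label is hypothetical: the names `intensity`, `emotion`, `topic`, and `translation`, as well as the 1 to 5 intensity scale, are assumptions chosen for illustration and are not part of the dataset.

```python
from dataclasses import dataclass
from typing import Optional

SENTIMENTS = ("positive", "neutral", "negative")


@dataclass
class EnrichedExample:
    """Hypothetical record layout for an enriched version of the corpus."""

    text: str
    sentiment: str
    intensity: Optional[int] = None    # assumed scale: 1 (mild) to 5 (strong)
    emotion: Optional[str] = None      # e.g. "anger" or "joy"
    topic: Optional[str] = None        # e.g. "movies" or "public services"
    translation: Optional[str] = None  # parallel translation of the text

    def __post_init__(self):
        # Reject labels outside the three sentiment classes.
        if self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")
        # Reject intensities outside the assumed 1 to 5 scale.
        if self.intensity is not None and not 1 <= self.intensity <= 5:
            raise ValueError("intensity must be between 1 and 5")


# Invented example (NOT from the dataset).
example = EnrichedExample(
    text="这部电影太精彩了",
    sentiment="positive",
    intensity=5,
    emotion="joy",
    topic="movies",
    translation="This movie is fantastic",
)
```

Keeping the extra fields optional means existing examples stay valid while new annotations are added incrementally.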

🔎 In summary

Criterion	Evaluation
🧩Ease of Use	⭐⭐⭐☆☆ (Data easy to load via Hugging Face)
🧼Cleaning Required	⭐⭐⭐☆☆ (Low — depends on splits, but data generally ready to use)
🏷️Annotation Richness	⭐⭐⭐☆☆ (Sentiment labeled — binary or ternary depending on version)
📜Commercial License	✅ Yes (MIT)
👨‍💻Ideal for Beginners	👩‍💻 Yes — great for getting started with sentiment analysis
🔁Reusable for Fine-tuning	🔥 Perfect for fine-tuning a Chinese BERT classifier
🌍Cultural Diversity	🌏 Good — data from authentic Chinese platforms