Circa - Interpreting indirect answers in conversation
The <strong>Circa</strong> dataset contains English question-answer exchanges built around polar (yes/no) questions and their indirect answers. The exchanges are drawn from 10 distinct social situations, and each indirect answer is interpreted by several annotators.
Description
Circa is a linguistic corpus for studying how indirect answers to closed-ended questions are interpreted in various social contexts. Each example pairs a polar question asked by one person (X) with an indirect answer given by another (Y), along with multiple annotations indicating the likely interpretation.
What is this dataset for?
- Training NLP models to detect the implied meaning of indirect answers
- Studying conversational interactions in a social context
- Improving how virtual assistants handle answers that are not explicit
Can it be enriched or improved?
Yes, the dataset can be extended by adding other social contexts, languages, or finer annotations on tone or emotion. Multilingual versions would also be beneficial.
🔎 In summary
🧠 Recommended for
- Conversational NLP researchers
- Virtual assistant developers
- Computational linguists
🔧 Compatible tools
- Hugging Face
- PyTorch
- TensorFlow
- spaCy
💡 Tip
Use multiple annotations to better calibrate the confidence of interpretations in the models.
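One way to apply this tip is to turn the individual annotator labels into a probability distribution and train against it as a soft target. The sketch below is a minimal illustration only: the function name `soft_label` and the label strings are assumptions made for the example, not the dataset's actual field names or label set.

```python
from collections import Counter

def soft_label(annotations):
    """Convert a list of annotator labels into a probability distribution.

    `annotations` is assumed to be a non-empty list of label strings,
    one per annotator (hypothetical format for this sketch).
    """
    counts = Counter(annotations)
    total = len(annotations)
    return {label: count / total for label, count in counts.items()}

# Hypothetical labels from five annotators for one question-answer pair
annotations = ["Yes", "Yes", "Yes", "Probably yes", "Yes"]
dist = soft_label(annotations)
print(dist)
```

A distribution concentrated on one label signals a clear interpretation, while a flatter one signals disagreement that a model should not be overconfident about.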
Frequently Asked Questions
What type of questions does this dataset contain?
It mainly contains closed-ended questions (yes/no) asked in a variety of social situations.
How are indirect responses annotated?
Each answer is labeled by five annotators, and the label chosen by the majority is taken as the primary interpretation.
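The majority rule can be sketched as follows. This is an illustrative assumption, not the dataset's published procedure: the function name `majority_label`, the `min_votes=3` threshold (a strict majority of five), and the label strings are all hypothetical.

```python
from collections import Counter

def majority_label(annotations, min_votes=3):
    """Return the most frequent label if it reaches `min_votes`, else None.

    `min_votes=3` is an assumed threshold: a strict majority of five votes.
    """
    if not annotations:
        return None
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes >= min_votes else None

# Hypothetical labels from five annotators
clear = majority_label(["Yes", "Yes", "No", "Yes", "No"])
split = majority_label(["Yes", "No", "Maybe", "Yes", "No"])
print(clear, split)
```

Returning `None` when no label reaches the threshold makes it easy to filter out or separately handle examples where annotators did not agree.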
Can the dataset be used for languages other than English?
Not currently. The dataset is English-only, but it can be extended or adapted to other languages and social contexts.




