
TREC-QA Dataset

TREC-QA is a dataset designed for training and evaluating natural language question-answering (QA) models. It originates from the TREC (Text REtrieval Conference) evaluation campaigns and aims to test the ability of systems to provide accurate answers to factual questions drawn from a corpus of documents.

Size

Several thousand question-answer pairs, in TXT format

Licence

Academic use under certain conditions; a license is required for some commercial uses

Description


The TREC-QA dataset includes:

  • Several thousand short questions with factual answers
  • Text passages to be analyzed to find the correct answer
  • Relevance annotations for evaluation (correct/incorrect answer)
  • Raw TXT or TSV format, suitable for supervised training

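As an illustration, assuming a simple TSV layout with question, candidate sentence, and relevance label columns (the actual TREC-QA files may use a different column order), such data could be loaded like this:

```python
import csv
import io

# Hypothetical TSV sample (question, candidate sentence, relevance label);
# the real TREC-QA distribution may differ in column order and labels.
sample = (
    "who invented the telephone\tBell patented the telephone in 1876.\t1\n"
    "who invented the telephone\tThe telegraph predates the telephone.\t0\n"
)

def load_pairs(tsv_text):
    """Parse question / candidate / label triples from TSV text."""
    rows = []
    for question, candidate, label in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        rows.append({"question": question, "candidate": candidate, "label": int(label)})
    return rows

pairs = load_pairs(sample)
print(len(pairs), pairs[0]["label"])  # 2 1
```

Keeping each candidate sentence as a separate labeled row makes the data directly usable for supervised training of sentence-selection models.
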
What is this dataset for?


TREC-QA is used for:

  • Training closed question-answering (closed QA) models
  • Evaluating intelligent natural-language search engines
  • Developing virtual assistants capable of answering factual questions
  • Analyzing answer relevance in ranking tasks

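For the ranking use case, a common relevance metric is mean reciprocal rank (MRR), which rewards systems that place a correct answer near the top of the candidate list. A minimal sketch (the label lists below are illustrative, not taken from TREC-QA):

```python
def mean_reciprocal_rank(ranked_labels):
    """ranked_labels: one list of labels per question, ordered by the
    system's ranking (1 = relevant answer, 0 = not relevant)."""
    total = 0.0
    for labels in ranked_labels:
        for rank, label in enumerate(labels, start=1):
            if label == 1:
                total += 1.0 / rank  # reciprocal rank of first relevant answer
                break
    return total / len(ranked_labels)

# Two questions: first relevant answer at rank 2, then at rank 1.
print(mean_reciprocal_rank([[0, 1, 0], [1, 0]]))  # 0.75
```

Mean average precision (MAP) is the other metric traditionally reported on this benchmark; MRR is shown here because it is the simpler of the two.
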
Can it be enriched or improved?


Yes, TREC-QA can be adapted or enhanced:

  • Adding richer contexts or explanations associated with answers
  • Combining it with more recent datasets such as Natural Questions or HotpotQA
  • Translating it into other languages to evaluate multilingual QA models
  • Annotating answer types (person, location, date, quantity, etc.)

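As a rough illustration of answer-type annotation, a toy heuristic might look like the following (the categories and rules are assumptions for demonstration; a real annotation effort would rely on a named-entity recognizer or manual labeling):

```python
import re

def answer_type(answer: str) -> str:
    """Toy heuristic assigning a coarse answer type to a short answer string."""
    # Four-digit years or year-like tokens -> date
    if re.fullmatch(r"\d{4}", answer) or re.search(r"\b(18|19|20)\d{2}\b", answer):
        return "date"
    # A number, optionally followed by a unit word -> quantity
    if re.fullmatch(r"[\d.,]+( \w+)?", answer):
        return "quantity"
    # Short title-cased phrases are often names or places (ambiguous without context)
    if answer.istitle() and len(answer.split()) <= 3:
        return "person_or_location"
    return "other"

print(answer_type("1876"))         # date
print(answer_type("42 km"))        # quantity
print(answer_type("Graham Bell"))  # person_or_location
```

Even a crude pre-annotation like this can speed up manual labeling by routing each answer to the most likely category for review.
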
🔗 Source: TREC-QA Dataset

Frequently Asked Questions

What is the difference between TREC-QA and SQuAD?

SQuAD provides answers extracted directly from a given context, while TREC-QA assesses the ability to choose the correct answer among several candidates drawn from a larger corpus.

Is TREC-QA still in use today?

Yes, it remains a historical benchmark for factual QA and continues to be used in comparison work or for the initial evaluation of QA models.

Can TREC-QA be combined with generative models?

Yes. Although it is historically associated with ranking, it can be adapted to test generative models such as GPT or T5 by comparing the generated answers to the expected ones.
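
One common way to compare a generated answer against an expected one is normalized exact match and token-level F1, in the style of standard QA evaluation scripts. A minimal sketch:

```python
import re
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, drop English articles and punctuation, collapse whitespace."""
    text = re.sub(r"\b(a|an|the)\b", " ", text.lower())
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> bool:
    return normalize(prediction) == normalize(reference)

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated and an expected answer."""
    pred, ref = normalize(prediction).split(), normalize(reference).split()
    common = Counter(pred) & Counter(ref)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Bell", "Bell"))  # True
print(token_f1("The telephone was invented by Bell", "Alexander Graham Bell"))  # 0.25
```

Token F1 gives partial credit when a generative model paraphrases the expected answer, which exact match alone would score as a miss.
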
