LexGLUE

LexGLUE is an NLP benchmark for the legal domain, designed to assess model performance on tasks such as classifying court decisions, predicting violated articles, and answering legal multiple-choice questions. It combines seven datasets, each with a specific objective, to encourage the development of effective multi-task models for law.

Size

7 sub-datasets (classification, multiple-choice QA), JSON files, thousands of annotated legal documents

Licence

CC-BY 4.0

Description

LexGLUE is a legal NLP benchmark that combines seven datasets covering different jurisdictions (EU, US) and tasks (multi-label classification, multiple-choice questions, prediction of violated legal articles, etc.). It makes it possible to evaluate “foundation” models across a range of legal tasks, in the spirit of GLUE or SuperGLUE but dedicated to the legal domain. Each dataset has been pre-processed to make it easier for legal AI researchers and practitioners to use.
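
As a rough illustration of how the seven subsets might be organised in code, the sketch below lists them with their task types. The configuration names and the Hub identifier `coastalcph/lex_glue` are assumptions based on the public Hugging Face release, and the coarse Europe/US grouping is a simplification; check both against the dataset card before relying on them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexTask:
    config: str        # assumed Hugging Face configuration name
    jurisdiction: str  # coarse grouping, a simplification
    task_type: str


# The seven subsets of the benchmark (names assumed, see lead-in).
TASKS = [
    LexTask("ecthr_a", "Europe", "multi-label classification"),
    LexTask("ecthr_b", "Europe", "multi-label classification"),
    LexTask("eurlex", "Europe", "multi-label classification"),
    LexTask("unfair_tos", "Europe", "multi-label classification"),
    LexTask("scotus", "US", "multi-class classification"),
    LexTask("ledgar", "US", "multi-class classification"),
    LexTask("case_hold", "US", "multiple choice"),
]


def tasks_by_type(tasks=TASKS):
    """Group configuration names by task type."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task.task_type].append(task.config)
    return dict(groups)


def load_task(config, hub_id="coastalcph/lex_glue"):
    """Load one subset. Requires the `datasets` package and network access.

    The default hub_id is an assumption, not something stated on this page.
    """
    from datasets import load_dataset  # imported lazily on purpose
    return load_dataset(hub_id, config)


if __name__ == "__main__":
    for task_type, configs in tasks_by_type().items():
        print(f"{task_type}: {', '.join(configs)}")
```

Keeping the registry separate from the loader means the list of tasks can be inspected or filtered without downloading anything.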

What is this dataset for?

Test the robustness of multi-task models in a realistic legal setting
Train an LLM to understand, classify, or reason about legal documents
Develop LegalTech systems (contract analysis, decision prediction, etc.)
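
For the first use case, models on multi-label legal tasks are commonly compared with micro- and macro-averaged F1. The following is a minimal, dependency-free sketch of both averages; the function name and the toy labels are hypothetical, and the convention of scoring a label with no true or predicted instances as 0 is an assumption (libraries differ on this point).

```python
def multilabel_f1(gold, pred, num_labels):
    """Return (micro_f1, macro_f1) for multi-label predictions.

    gold and pred are lists of label-id collections, one per document.
    """
    tp = [0] * num_labels
    fp = [0] * num_labels
    fn = [0] * num_labels
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        for label in p & g:
            tp[label] += 1
        for label in p - g:
            fp[label] += 1
        for label in g - p:
            fn[label] += 1

    def f1(t, f_pos, f_neg):
        denom = 2 * t + f_pos + f_neg
        return 2 * t / denom if denom else 0.0  # assumed convention

    # Micro: pool counts over all labels, then compute one F1.
    micro = f1(sum(tp), sum(fp), sum(fn))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(tp[i], fp[i], fn[i]) for i in range(num_labels)) / num_labels
    return micro, macro


if __name__ == "__main__":
    gold = [[0, 1], [0], [2]]   # toy data, not taken from the benchmark
    pred = [[0], [0], [1]]
    micro, macro = multilabel_f1(gold, pred, num_labels=3)
    print(f"micro-F1={micro:.3f} macro-F1={macro:.3f}")
```

Micro-F1 is dominated by frequent labels, while macro-F1 weights every label equally, so a large gap between the two usually points to weak performance on rare labels.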

Can it be enriched or improved?

Yes. LexGLUE can be enriched by adding new jurisdictions or annotation types (e.g. summaries of arguments, majority vs. minority opinions). Its modular format also makes it easy to merge with other legal corpora for more comprehensive training. It can also serve as a basis for adaptation to French-speaking or multilingual contexts through controlled translation.
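
Merging with another corpus mostly comes down to agreeing on a common record shape. The sketch below is one possible approach, not a prescribed format: it assumes each record has a `text` field (a string or a list of paragraphs) and either a `labels` list or a single `label`, and the source names and sample records are invented for illustration.

```python
def normalize(record, source):
    """Map one record to a shared schema: source, text, labels."""
    text = record["text"]
    if isinstance(text, list):
        # Some corpora store a document as a list of paragraphs.
        text = "\n".join(text)
    if "labels" in record:
        labels = list(record["labels"])
    else:
        labels = [record["label"]]
    return {"source": source, "text": text, "labels": labels}


def merge(corpora):
    """Merge {source_name: iterable_of_records} into one flat list."""
    merged = []
    for source, records in corpora.items():
        merged.extend(normalize(r, source) for r in records)
    return merged


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    corpora = {
        "benchmark_subset": [{"text": ["First paragraph.", "Second."], "labels": [3, 7]}],
        "extra_corpus": [{"text": "A contract clause.", "label": 12}],
    }
    for row in merge(corpora):
        print(row)
```

Label ids are left untouched and tagged with their source, because label spaces from different corpora are not comparable until they are mapped to a shared taxonomy.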

🔎 In summary

Criterion	Evaluation
🧩 Ease of use	⭐⭐⭐⭐✩ (Well-structured with provided scripts)
🧼 Need for cleaning	⭐⭐⭐⭐⭐ (Low – ready-to-use data)
🏷️ Annotation richness	⭐⭐⭐⭐⭐ (Excellent – multiple annotation types depending on the task)
📜 Commercial license	✅ Yes (CC-BY 4.0)
👨‍💻 Beginner friendly	⚠️ Moderate – better suited for structured projects
🔁 Fine-tuning ready	🎯 Perfect for adapting a model to Legal AI use cases
🌍 Cultural diversity	⚡ Medium – focus on European and US law