---
language:
- de
multilinguality:
- monolingual
task_categories:
- text-retrieval
source_datasets:
- https://github.com/Christoph911/AIKE2021_Appendix
task_ids:
- document-retrieval
config_names:
- corpus
tags:
- text-retrieval
dataset_info:
  - config_name: default
    features:
      - name: query-id
        dtype: string
      - name: corpus-id
        dtype: string
      - name: score
        dtype: float64
    splits:
      - name: test
        num_examples: 200
  - config_name: corpus
    features:
      - name: _id
        dtype: string
      - name: title
        dtype: string
      - name: text
        dtype: string
    splits:
      - name: corpus
        num_examples: 200
  - config_name: queries
    features:
      - name: _id
        dtype: string
      - name: text
        dtype: string
    splits:
      - name: queries
        num_examples: 200
configs:
  - config_name: default
    data_files:
      - split: test
        path: qrels/test.jsonl
  - config_name: corpus
    data_files:
      - split: corpus
        path: corpus.jsonl
  - config_name: queries
    data_files:
      - split: queries
        path: queries.jsonl
---

**LegalQuAD**

- Original link: https://github.com/Christoph911/AIKE2021_Appendix
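- The dataset pairs German-language questions with German legal documents for document retrieval.
- The `corpus` configuration holds the legal documents (`_id`, `title`, `text`).
- The `queries` configuration holds the questions about those documents (`_id`, `text`).
- The `default` configuration holds the query-document labels (`query-id`, `corpus-id`, `score`) in its `test` split.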

**Usage**
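```python
import datasets

# Each call downloads one configuration from the Hugging Face Hub
# and returns a DatasetDict keyed by split name.
queries = datasets.load_dataset("mteb/LegalQuAD", "queries")      # split: "queries"
documents = datasets.load_dataset("mteb/LegalQuAD", "corpus")     # split: "corpus"
pair_labels = datasets.load_dataset("mteb/LegalQuAD", "default")  # split: "test"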
```
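
The three configurations loaded above only become useful for retrieval once the labels are joined back to the query and document text. The following is a minimal sketch of that join, not part of the dataset's tooling. The helper name `build_relevance_pairs` and the toy records are hypothetical, and treating any `score` greater than zero as "relevant" is an assumption, since the card does not document the score scale. The field names come from the metadata at the top of this card.

```python
from typing import Any, Dict, Iterable, Mapping


def build_relevance_pairs(
    queries: Iterable[Mapping[str, Any]],
    corpus: Iterable[Mapping[str, Any]],
    qrels: Iterable[Mapping[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Map each query id to its text and the texts of its relevant documents.

    Hypothetical helper: rows are plain mappings with the field names
    listed in the dataset metadata.
    """
    query_text = {row["_id"]: row["text"] for row in queries}
    doc_text = {row["_id"]: row["text"] for row in corpus}

    pairs: Dict[str, Dict[str, Any]] = {}
    for row in qrels:
        # Assumption: a positive score marks the document as relevant.
        if row["score"] <= 0:
            continue
        qid, did = row["query-id"], row["corpus-id"]
        # Skip labels that point at ids missing from either table.
        if qid not in query_text or did not in doc_text:
            continue
        entry = pairs.setdefault(qid, {"query": query_text[qid], "documents": []})
        entry["documents"].append(doc_text[did])
    return pairs


# Toy records (hypothetical, not taken from the dataset) to show the shape.
toy_queries = [
    {"_id": "q1", "text": "example question 1"},
    {"_id": "q2", "text": "example question 2"},
]
toy_corpus = [
    {"_id": "d1", "title": "", "text": "example document A"},
    {"_id": "d2", "title": "", "text": "example document B"},
]
toy_qrels = [
    {"query-id": "q1", "corpus-id": "d1", "score": 1.0},
    {"query-id": "q2", "corpus-id": "d2", "score": 0.0},   # not relevant
    {"query-id": "q1", "corpus-id": "d9", "score": 1.0},   # unknown document id
]

toy_pairs = build_relevance_pairs(toy_queries, toy_corpus, toy_qrels)
```

With the real data, the same helper would be called as `build_relevance_pairs(queries["queries"], documents["corpus"], pair_labels["test"])`, using the split names declared in the metadata.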