ai4data/datause-extraction holdout test set performance

World Bank Datause Monitoring Dashboard

Threshold: 0.40

About the Benchmark Dataset (ai4data/datause-holdout)

The evaluation uses the canonical, project-cleaned Holdout v10 dataset (ai4data/datause-holdout). It contains 1,149 prose-only text chunks (465 positive records and 684 negative controls) extracted from a range of humanitarian and development reports.

Key UNHCR Documents Covered:

Ukraine Socio-Economic Inclusion (SEIS) Brief (2025)
Poland Impact on Economy Report (2024)
Livelihoods & LFS briefings (Morocco, Brazil, Zimbabwe, Chad, Myanmar, Indonesia, RBSA)

Key World Bank PADs & Studies:

Costa Rica Results in Education (CORE) PAD (P181174)
Rwanda Socio-economic Inclusion Project II PAD (P509677)
Gold Mining Spillovers in Ghana (Benshaul-Tolonen 2019)

Precision: 0.0% (True positive rate)

Recall: 0.0% (Dataset coverage)

F0.5 Score: 0.000 (Precision-weighted)

Filtered Matches (matching filter criteria): TP: 0 FP: 0 FN: 0 (sub-category counts)
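For readers who want to reproduce the headline numbers from the TP/FP/FN counts, the following is a minimal sketch that assumes the standard definitions of precision, recall and F-beta with beta = 0.5. The function names `confusion_counts` and `precision_recall_fbeta` are hypothetical and are not taken from the dashboard's code. The sketch also assumes that a record counts as a predicted positive when its score is at or above the threshold, and that a metric is reported as 0 when its denominator is zero; the dashboard's actual scoring rules may differ.

```python
from typing import Sequence, Tuple


def confusion_counts(
    scores: Sequence[float],
    labels: Sequence[bool],
    threshold: float = 0.40,  # assumption: score >= threshold means "predicted positive"
) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
        # Remaining case is a true negative, which none of the metrics use.
    return tp, fp, fn


def precision_recall_fbeta(
    tp: int, fp: int, fn: int, beta: float = 0.5
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-beta); 0.0 where a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    beta_sq = beta * beta
    denom = beta_sq * precision + recall
    # beta < 1 weights precision more heavily than recall.
    fbeta = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


if __name__ == "__main__":
    # Illustrative scores and labels only, not drawn from the holdout set.
    tp, fp, fn = confusion_counts([0.9, 0.5, 0.3, 0.1], [True, False, True, False])
    print(tp, fp, fn)
    print(precision_recall_fbeta(tp, fp, fn))
```

Under these assumptions, counts of TP: 0 FP: 0 FN: 0 give 0.0 for all three metrics, so an all-zero display may simply mean that no records match the current filters.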



About the Training Dataset (ai4data/datause-train)