ai4data/datause-extraction holdout test set performance

World Bank Datause Monitoring Dashboard

Threshold: 0.40

About the Benchmark Dataset (ai4data/datause-holdout)

The evaluation uses the canonical, project-cleaned Holdout v10 dataset (ai4data/datause-holdout). It consists of 1,149 prose-only text chunks (465 positive records and 684 negative controls) extracted from humanitarian and development reports.

Key UNHCR Documents Covered:
  • Ukraine Socio-Economic Inclusion (SEIS) Brief (2025)
  • Poland Impact on Economy Report (2024)
  • Livelihoods & LFS briefings (Morocco, Brazil, Zimbabwe, Chad, Myanmar, Indonesia, RBSA)

Key World Bank PADs & Studies:
  • Costa Rica Results in Education (CORE) PAD (P181174)
  • Rwanda Socio-economic Inclusion Project II PAD (P509677)
  • Gold Mining Spillovers in Ghana (Benshaul-Tolonen 2019)

Precision

0.0%

Positive predictive value

Recall

0.0%

Dataset coverage

F0.5 Score

0.000

Precision-weighted

Filtered Matches

0

Matching filter criteria
TP: 0 FP: 0 FN: 0
Sub-category counts
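
For readers who want to reproduce the headline numbers from the TP/FP/FN counts, the following is a minimal sketch of how precision, recall, and a precision-weighted F0.5 score are commonly derived. The names `Metrics` and `score_counts` are hypothetical, and reporting undefined ratios (zero denominators) as 0.0 is an assumption, not something the dashboard documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    precision: float
    recall: float
    f_beta: float


def score_counts(tp: int, fp: int, fn: int, beta: float = 0.5) -> Metrics:
    """Compute precision, recall, and F-beta from confusion counts.

    Assumption: a ratio with a zero denominator is reported as 0.0.
    """
    # Precision: share of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # beta < 1 weights precision more heavily than recall.
    b2 = beta * beta
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f_beta)


if __name__ == "__main__":
    # Illustrative counts only, not results from the holdout set.
    print(score_counts(tp=8, fp=2, fn=8))
```

Under this convention, the all-zero counts shown above (TP: 0, FP: 0, FN: 0) yield 0.0 for precision, recall, and F0.5, which is consistent with the displayed values.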