Add GPQA evaluation result
#35 opened by burtenshaw (HF Staff)
Evaluation Results
This PR adds structured evaluation results using the new .eval_results/ format.
What This Enables
- Model Page: Results appear on the model page with benchmark links
- Leaderboards: Scores are aggregated into benchmark dataset leaderboards
- Verification: Support for cryptographic verification of evaluation runs
Format Details
Results are stored as YAML in the .eval_results/ folder. See the Eval Results Documentation for the full specification.
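To check what this PR adds in a local clone, the result files can be listed with a few lines of Python. This is a minimal sketch, not part of the community-evals tooling: the helper name `find_eval_result_files` is hypothetical, and the `.yaml`/`.yml` extensions are an assumption, since the PR only says the results are YAML.

```python
from pathlib import Path


def find_eval_result_files(repo_root):
    """Return the YAML files found under <repo_root>/.eval_results/, sorted by path.

    Hypothetical helper for illustration. The .yaml/.yml suffixes are an
    assumption; consult the Eval Results Documentation for the actual layout.
    """
    results_dir = Path(repo_root) / ".eval_results"
    if not results_dir.is_dir():
        # The repository has no structured evaluation results yet.
        return []
    return sorted(
        path
        for path in results_dir.rglob("*")
        if path.is_file() and path.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    # Assumes the script is run from the root of a cloned model repository.
    for path in find_eval_result_files("."):
        print(path)
```

The sketch only locates the files; it does not parse or validate them against the specification.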
Generated by community-evals
