Add GPQA evaluation result

#35, opened by burtenshaw (HF Staff)

Evaluation Results

This PR adds structured evaluation results using the new .eval_results/ format.

What This Enables

  • Model Page: Results appear on the model page with benchmark links
  • Leaderboards: Scores are aggregated into benchmark dataset leaderboards
  • Verification: The format supports cryptographic verification of evaluation runs

Model Evaluation Results

Format Details

Results are stored as YAML files in the .eval_results/ folder of the model repository. See the Eval Results Documentation for the full specification.
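
To check what a PR like this one adds, it can help to list the result files in a local checkout of the repository. The snippet below is a minimal sketch, not part of the official tooling: the helper name `find_eval_result_files` is hypothetical, and the `.yaml`/`.yml` extensions are an assumption, since this PR only states that results are stored as YAML. It does not parse the files or assume anything about their schema; see the Eval Results Documentation for that.

```python
from pathlib import Path


def find_eval_result_files(repo_root):
    """Return the YAML files under <repo_root>/.eval_results/, sorted by path.

    Hypothetical helper for illustration only; not a Hub or community-evals API.
    """
    results_dir = Path(repo_root) / ".eval_results"
    if not results_dir.is_dir():
        # The repository has no structured evaluation results yet.
        return []
    # Assumption: result files use a .yaml or .yml extension.
    return sorted(
        path
        for path in results_dir.rglob("*")
        if path.is_file() and path.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    # Run from the root of a local clone of the model repository.
    for path in find_eval_result_files("."):
        print(path)
```

Keeping the helper to file discovery avoids guessing at field names; any parsing should follow the published specification.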


Generated by community-evals

