allenai/reward-bench-2
Viewer • Updated • 1.87k • 3.24k • 30
Datasets, Spaces, and models for the Reward Bench 2 benchmark and paper.
Explore and compare LLM reward benchmark scores