GPQA Diamond
Graduate-level science questions, expert-curated.
Leaderboard
| # | Model | Provider | Score (%) | Evaluated | Source |
|---|---|---|---|---|---|
| 1 | Claude Opus 4.7 | Anthropic | 83.3% | — | |
| 2 | o1 | OpenAI | 78.0% | — | |