MATH
Competition mathematics problems.
Leaderboard
| # | Model | Provider | Score | Evaluated | Source |
|---|---|---|---|---|---|
| 1 | o1 | OpenAI | 94.8% | — | |
| 2 | Llama 3.3 70B | Meta | 77.0% | — | |
| 3 | Mistral Large 2 | Mistral AI | 76.9% | — | |
| 4 | GPT-4o | OpenAI | 76.6% | — | |