Chatbot Arena

Category: general
Score unit: Elo
Higher is better: yes

Elo ratings computed from crowdsourced pairwise preference votes.
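As a rough illustration of how pairwise preference votes can be turned into Elo ratings, here is a minimal sketch of the classic online Elo update. This page does not specify the rating procedure actually used, so the step size `K`, the starting rating `INITIAL`, the vote format, and the model names below are all assumptions chosen for illustration.

```python
from collections import defaultdict

K = 32          # assumed update step size (illustrative, not from this page)
SCALE = 400     # conventional Elo scale factor
INITIAL = 1000  # assumed starting rating for every model


def expected_score(r_a, r_b):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / SCALE))


def update_ratings(votes, k=K, initial=INITIAL):
    """Apply one Elo update per vote, in order.

    Each vote is (model_a, model_b, outcome), where outcome is
    1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    """
    ratings = defaultdict(lambda: float(initial))
    for model_a, model_b, outcome in votes:
        e_a = expected_score(ratings[model_a], ratings[model_b])
        delta = k * (outcome - e_a)
        ratings[model_a] += delta  # A gains what B loses, so the total is conserved
        ratings[model_b] -= delta
    return dict(ratings)


# Hypothetical votes between placeholder models (not real data).
votes = [
    ("model_x", "model_y", 1.0),
    ("model_y", "model_z", 0.5),
    ("model_x", "model_z", 1.0),
]
ratings = update_ratings(votes)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Because each update moves the two models by equal and opposite amounts, the sum of all ratings stays fixed. Only the differences between ratings carry information.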

Leaderboard

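| # | Model | Provider | Elo | Evaluated | Source |
|---|-------|----------|-----|-----------|--------|
| 1 | Gemini 2.0 Flash | Google DeepMind | 1356 | — | |
| 2 | GPT-4o | OpenAI | 1287 | — | |
| 3 | Claude 3.5 Sonnet | Anthropic | 1271 | — | |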
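To give a sense of what the rating gaps on the leaderboard mean, the sketch below converts a difference in Elo into an expected preference rate. It assumes the standard logistic Elo model with a scale of 400, which this page does not confirm. The function name `win_probability` is a hypothetical helper.

```python
def win_probability(rating_a, rating_b, scale=400.0):
    """Expected rate at which A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings copied from the leaderboard on this page.
leaderboard = {
    "Gemini 2.0 Flash": 1356,
    "GPT-4o": 1287,
    "Claude 3.5 Sonnet": 1271,
}

p = win_probability(leaderboard["Gemini 2.0 Flash"], leaderboard["GPT-4o"])
print(f"Gemini 2.0 Flash preferred over GPT-4o: {p:.2f}")
```

Under these assumptions, the 69-point gap between the top two entries corresponds to the higher-rated model being preferred in roughly 60% of head-to-head comparisons.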

Wikibench: community-edited AI benchmark data. Content licensed CC BY-SA 4.0.