Evaluating LLMs on their Wordle-solving capabilities

| Rank | Model | Provider | Success Rate | Avg Guesses |
|---|---|---|---|---|
| 1 | gemini-3-pro-preview | | | 3.62 |
| 2 | gpt-5-mini | openai | | 3.51 |
| 3 | claude-opus-4.6 | anthropic | | 3.61 |
| 4 | gpt-5.2 | openai | | 3.55 |
| 5 | kimi-k2.5 | moonshotai | | 3.74 |
| 6 | gpt-5-nano | openai | | 3.64 |
| 7 | gemini-3-flash-preview | | | 3.88 |
| 8 | glm-5 | z-ai | | 3.57 |
| 9 | claude-sonnet-4.5 | anthropic | | 3.72 |
| 10 | gpt-oss-120b | openai | | 3.90 |
| 11 | gpt-oss-20b | openai | | 3.78 |
| 12 | claude-haiku-4.5 | anthropic | | 4.02 |