Pareto frontier LLMs, Aider edition
Google released an impressive improvement to Gemini, which now tops the WebDev Arena Leaderboard. That position had been held by Claude for quite a while.
Here's a Pareto frontier for the Aider polyglot coding leaderboard. The cost reported for March's Gemini didn't include reasoning tokens, a problem that was fixed for May's release.
It's too bad the leaderboard doesn't include o4-mini (medium), especially because the new Gemini sits right between o4-mini (high) and o3-plus-GPT-4.1. It's nice to see that with the corrected pricing, the Pareto frontier makes sense again. Gemini 2.5 Pro wasn't some wild outlier.
I excluded some legacy models to reduce clutter, as well as GPT-4.1 nano, which scored too poorly to be worth plotting.
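For anyone unfamiliar with the term, a model is on the Pareto frontier if no other model is both cheaper and better. Here's a minimal sketch of how that could be computed from (cost, score) pairs. The model names and numbers are made-up placeholders, not leaderboard data, and `Result` and `pareto_frontier` are names I picked for illustration.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    cost: float   # total benchmark cost in dollars (lower is better)
    score: float  # percent of tasks solved (higher is better)


def pareto_frontier(results):
    """Return the results that no cheaper-or-equal result matches or beats on score."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on a cost tie, take the higher score first.
    for r in sorted(results, key=lambda r: (r.cost, -r.score)):
        # A result joins the frontier only if it beats every cheaper result.
        if r.score > best_score:
            frontier.append(r)
            best_score = r.score
    return frontier


# Placeholder data for illustration only (NOT real leaderboard numbers).
results = [
    Result("model_a", 1.0, 40.0),
    Result("model_b", 5.0, 60.0),
    Result("model_c", 6.0, 55.0),   # dominated by model_b (cheaper and better)
    Result("model_d", 20.0, 75.0),
    Result("model_e", 30.0, 70.0),  # dominated by model_d
]

frontier_names = [r.name for r in pareto_frontier(results)]
print(frontier_names)
```

This also shows why an understated cost matters: moving one point to the left can make it dominate models that would otherwise sit on the frontier.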