samek.fyi

Pareto frontier LLMs, Aider edition

Google released an impressive improvement to Gemini, which now tops the WebDev Arena Leaderboard. That position had been held by Claude for quite a while.

Here's a Pareto frontier for the Aider polyglot coding leaderboard. The analysis of March's Gemini didn't include reasoning tokens in the cost. They fixed the problem for May's release.

pareto-aider

It's too bad it doesn't include o4-mini (medium), especially because the new Gemini sits right between o4-mini (high) and o3-plus-GPT-4.1. It's nice to see that the corrected pricing makes the Pareto frontier make sense again. Gemini 2.5 Pro wasn't some wild outlier.

I excluded some legacy models to reduce clutter, as well as GPT-4.1 nano, which did poorly enough to make its inclusion on the graph not useful.