This is a list of AI benchmarks I’m watching. Last updated June 11, 2026.
- ALE-Bench
- ARC Prize
- Artificial Analysis
- Bullshit Benchmark
- CAIS AI Dashboard
- CursorBench
- DeepSWE
- Design Arena
- Epoch AI Models
- EQ-Bench 3
- EQ-Bench Creative Writing
- ExploitBench
- Frontier Code
- FutureSearch Benchmarks
- GBA Eval
- Gert Labs Rankings
- Kagi LLM Benchmark
- lechmazur benchmarks
- LiveBench
- LMArena
- MathArena
- Mercor Apex
- METR Time Horizons
- Pencil Puzzle Bench
- ProgramBench
- RuneScape Bench
- Scale Labs
- SimpleBench
- SWE-Marathon
- SWE-rebench
- Terminal-Bench
- Toloka Arena
- Vals Index
- Vending-Bench 2
- Vending-Bench Arena
- VoxelBench
- WeirdML
- Wolfram LLM Benchmarking Project
Meta
Composite indexes built from other benchmarks’ published scores.