Will a released AI system score above 80% on SWE-bench Verified in 2026?
Resolves Dec 31, 2026
Resolution criteria: A public result above 80% on SWE-bench Verified, achieved by a generally available system during 2026.
How fast software-building systems are really improving drives hiring plans, tooling bets, and roadmaps across the industry. We mapped the score trajectory on the benchmark against the gap still to close and reached a call for 2026.
solid · 11 sources
$15
The probability and full reasoning unlock on purchase, and become public for everyone once the event resolves.