Anthropic Releases Claude Opus 4.8 at Flat Pricing
Anthropic's Opus 4.8 raises SWE-bench to 69.2%, cuts Fast mode cost by 3×, and ships Dynamic Workflows for parallel sub-agent orchestration.
Alibaba's Qwen 3.7 Max enters the top 10 of the intelligence index, ahead of Gemini 3.5 Flash, and solves a 12-month-old proprietary puzzle benchmark on its first attempt. Its reasoning benchmark score rose from 8 to 44 over the previous generation.
Gemini 3.5 Flash scores ~50% on Cursor Bench, 15 points below the frontier at 4× the cost of Cursor Composer 2.5. Gemini 3.5 Pro is delayed another month, narrowing Google's benchmark recovery window.
Meta releases MuseSpark, its first closed, proprietary model, estimated at ~70B×16 parallel agents (>1T effective parameters) and ranking just below Claude on major leaderboards.
Google DeepMind releases Gemma 4 under Apache 2.0: four models, including a 31B dense model ranking #3 globally on LM Arena and on-device E2B/E4B variants with audio, vision, and 256K context.
Sam Altman signaled via an emoji reply that GPT-5.5 or GPT-6 may launch on April 23, which @swyx independently confirmed. No official announcement has been published yet.
Anthropic releases Opus 4.7 with SWE-bench Verified up 7 points, vision ceiling tripled to 3.75 MP, a new xhigh reasoning tier, and no price increase.