NanoGPT-Bench: Coding Agents Recover Only 9.3% of Human AI R&D Progress
NanoGPT-Bench finds coding agents including Codex and Claude Code achieve just 9.3% of human AI R&D progress, tuning hyperparams but missing algorithmic research breakthroughs.
NanoGPT-Bench finds coding agents including Codex and Claude Code achieve just 9.3% of human AI R&D progress, tuning hyperparams but missing algorithmic research breakthroughs.

A 32,000-GPU-hour benchmark confirms the harness layer outweighs model choice — identical backbones swing 3× in accuracy depending on agent framework.

Menlo Ventures data shows Anthropic at 34.4% enterprise share vs OpenAI's 32.3% in April, triggering simultaneous free-trial counter-offers from both labs.

Anthropic engineer Thariq argues HTML beats Markdown for AI output; Karpathy backs it with a six-step format evolution toward interactive neural video.

Anthropic ships Agent View for Claude Code: a unified terminal dashboard for managing parallel sessions with status indicators, inline replies, and /goal.
Anthropic's Claude Code now ships with Agent View: a multi-session manager for parallel agents with status tracking and background dispatch.

Anthropic secures all of SpaceX's Colossus 1 (220K GPUs, 300 MW) and doubles Claude Code rate limits—sub-agent fan-out is now viable at $200/month.
Anthropic locked in all of Colossus 1 compute from SpaceX, doubling Claude Code limits and removing peak throttling effective immediately.

Anthropic leases all of SpaceX Colossus 1 (300 MW, 220K GPUs), doubles Claude Code limits, and ships Managed Agents, Dreaming, and Outcomes in one day.
Higgsfield's new MCP and CLI integration for Claude Code turns a single chat prompt into full brand creative packages — research, photos, ads, and UGC videos — with reusable skill files.
Anthropic's 'Code with Claude' is a free livestreamed developer conference featuring demos, workshops, and engineer talks on Claude Code in production agentic workflows.
Claude Code's skills ecosystem exceeds 100 plugins. Six form the production standard: Skill Creator, Superpowers, GSD, /ultra review, Context Mode, ClaudeMem.
AI coding cost per dev doubled as Claude Code and Copilot pricing rises — bugs remain unsolved. Marcus: if coding can't show ROI, the jig may be up.

Nine sources across GitHub, YouTube, X, and newsletters converge on one finding: the model is no longer the performance frontier — the harness is.
Claude Code /loop at 30-minute intervals exceeds the 5-minute cache TTL, guaranteeing expensive cache re-writes. One user accumulated $6,000 in 26 hours on Opus 4.7.
NimbleList is an open-source GUI layered over Claude Code and Codex CLI, adding Kanban, Mermaid, Excalidraw, and plan.md without a new SaaS subscription.
Claude Code's anti-abuse regex misfired on competitor strings in git history and JSON blobs, billing users for usage they never performed. Bug acknowledged; refund posture contradictory.

Three independent research groups converge on April 29 to map a triple security exposure in agentic coding editors: an 81% permission gate FNR, 84% prompt injection success, and systemic plan compliance failures.
Claude Code's new Routines feature turns repeatable engineering workflows—review, test, refactor, release—into codified operational patterns rather than one-off prompts.
Google is building Antigravity, an internal AI coding platform led by Chief AI Architect Koray Kavukcuoglu, explicitly to counter Claude Code and OpenAI Codex.
Anthropic blocked the oh-my-openagent plugin for OpenCode, confirming it is enforcing vendor lock-in at the policy layer rather than competing on capability alone.

Matt Pocock's two-hour AI Engineer workshop argues 30-year-old software fundamentals matter more under AI, not less — and outlines a complete methodology to prove it.
Anthropic fixes three Claude Code harness regressions in v2.1.116+, resets usage limits, and publishes a public post-mortem at anthropic.com.

Anthropic published a post-mortem on three sequential Claude Code harness changes from March–April that degraded output quality, fixed in v2.1.116+.
Anthropic's compute shortage has triggered a cascade: Claude Code removed from Pro (A/B test), OpenClaw banned, and Opus 4.7's tokenizer silently inflating token consumption by up to 35%.
Curated AI insights — sent when there's something worth your inbox.