Gartner: Over 40% of Agentic AI Projects Will Be Killed by End-2027
Gartner predicts 40%+ of agentic AI projects die by end-2027 — cost overruns, unclear business value, and poor risk controls are the main failure drivers per the forecast.
Gartner predicts 40%+ of agentic AI projects die by end-2027 — cost overruns, unclear business value, and poor risk controls are the main failure drivers per the forecast.
Google DeepMind's AI Co-Mathematician hit 48% on FrontierMath Tier 4 — a new record — while solving open research problems in live multi-workstream research sessions.

A 32,000-GPU-hour benchmark confirms the harness layer outweighs model choice — identical backbones swing 3× in accuracy depending on agent framework.
Anthropic's Claude Code now ships with Agent View: a multi-session manager for parallel agents with status tracking and background dispatch.
Experian: 40% of 5,000 breaches in 2025 were AI-powered, and agentic AI is forecast as the #1 breach vector in 2026.

A $20 autonomous agent exploit at McKinsey revealed a systemic flaw: SaaS-era procurement sequences cannot handle agentic AI. Experian confirms it's sector-wide.
Cloudflare cuts one-fifth of workforce citing 'agentic AI-first' pivot — first major public infra company to restructure around agentic AI.
Anthropic's 'Code with Claude' is a free livestreamed developer conference featuring demos, workshops, and engineer talks on Claude Code in production agentic workflows.
Meta's LeCun publicly states that building agentic systems on LLMs is a 'recipe for disaster' — a significant position shift from a major voice in the field.

A peer-reviewed AlphaZero benchmark and a global hackathon both confirm Claude Opus 4.7 as the current frontier in agentic coding.
Nvidia's Nemotron 3 Nano Omni is a 30B-A3B MoE open multimodal model built for agentic AI, with vision and speech capabilities.
A new paper on HuggingFace Papers covers agentic world modeling: the foundational theory, capability requirements, and governing principles for AI agents operating in open-world environments.
Autogenesis gives agents auditable self-modification: identify capability gaps, generate and test improvements, integrate with full lineage and rollback — a deployment specification, not a demo.
Research formalizes 'diversity collapse': multi-agent LLM systems homogenize outputs due to structural coupling — brainstorming setups must be explicitly engineered for heterogeneity.

DeepSeek-V4's MIT-licensed 1M-context MoE and Kimi-K2.6's multimodal orchestration create the first complete open-weights agentic deployment stack.

Moonshot AI's Kimi K2.6 leads the open-source index with 300 concurrent sub-agents, 4,000 tool calls, and a 12-hour autonomous coding marathon.

OpenAI's GPT-5.5 arrives six weeks after 5.4 with a 7-point Terminal-Bench gain, doubled pricing, and cyber/bio safety classifications at HIGH.
Bloomberg profiles Strider, which uses agentic AI and public records to identify foreign state actors for US Air Force and NATO clients—a landmark national security deployment.
New arXiv study finds AI agents gathered and then ignored evidence in 68% of cases, never updating beliefs in 71% — a serious challenge to agentic research autonomy claims.
llama.cpp hits 100K GitHub stars; creator @ggerganov predicts 90% of AI agents will run locally within 3–6 months as local model quality crosses the agentic threshold.

Anthropic ran a live two-sided agent marketplace with 69 employees: 186 deals, $4,000+ volume — and model quality (Opus vs Haiku) was invisible to human participants throughout.
Anthropic's leaked Conway project details an always-on, event-driven agent environment with sidebar UI, webhook triggers, and deep MCP integration.
Anthropic's Claude Managed Agents Memory enters public beta — session-persistent, file-based, API-accessible, with full developer control.
Sakana AI's Fugu beta hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench with dynamic frontier model orchestration via an OpenAI-compatible API.
TACO reduces agentic terminal agent token overhead by ~10% on SWE-Bench by learning trajectory-derived compression rules for long-horizon reasoning.

GPT-5.5 scores 2.5× better intelligence-per-token than 5.4, surpasses the human baseline on OS World, and expands Codex into a full desktop agent.
GPT-Image-2-Thinking isn't a generation model—it's an image agent loop using search and compositing tools internally, trading speed for one-shot precision.
Anthropic's compute shortage has triggered a cascade: Claude Code removed from Pro (A/B test), OpenClaw banned, and Opus 4.7's tokenizer silently inflating token consumption by up to 35%.

ml-intern reads arXiv, cleans datasets, runs SFT/GRPO, diagnoses failures, and iterates — pushing GPQA from 10% to 32% in under 10 hours for roughly $1 of compute.
OpenAI launches ChatGPT Workspace Agents for paid plans — team-shared agents that run background workflows across docs, email, Slack, and Linear on a schedule without supervision.
Curated AI insights — sent when there's something worth your inbox.