Tools · 16 articles

Tools

Two vertical cost bars showing $1/task vs $11/task at equal 79.8% benchmark accuracy

Cursor Composer 2.5: 79.8% SWE-Bench at Under $1/Task

Cursor's Composer 2.5 scores 79.8% on SWE-Bench Multilingual at under $1 per task, about 11x cheaper than rivals, using Kimi K2.5 fine-tuned on 25x more synthetic tasks.

May 19, 2026 · 2 min read

Developer steering a remote Codex code agent from a smartphone while the desktop workstation runs autonomously in the background

Tools · Significant

OpenAI Ships Codex Mobile — Steer Long-Running Agents from Your Phone

OpenAI extends Codex to iOS and Android on all plans, letting developers monitor and redirect multi-step coding tasks while away from their computer.

May 18, 2026 · 2 min read

LangSmith Engine scanning agent traces and auto-generating code fixes above the SmithDB observability infrastructure

Tools · Significant

LangChain Ships 7 Products at Interrupt 2026 in San Francisco

LangChain's most ambitious product day: Engine auto-generates fixes from agent traces, SmithDB delivers 12x observability performance, and five more products launch at Interrupt 2026.

May 15, 2026 · 2 min read

AI-powered cybersecurity operations center with neural-network shield and automated vulnerability remediation

Tools · Significant

OpenAI Launches Daybreak: AI-Powered Vulnerability Detection and Code Security

OpenAI's Daybreak combines frontier models and Codex to automate vulnerability detection, backlog triage, and continuous software security at enterprise scale.

May 12, 2026 · 2 min read

Stylized 3D render of a multi-agent terminal dashboard with green and amber session status indicators

Tools · Significant

Anthropic Ships Claude Code Agent View for Parallel Multi-Session Orchestration

Anthropic ships Agent View for Claude Code: a unified terminal dashboard for managing parallel sessions with status indicators, inline replies, and /goal.

May 12, 2026 · 2 min read

Abstract representation of an AI agent in a dreaming state, with memory orbs consolidating into crystalline knowledge structures in a deep blue neural landscape

Tools · Notable

Anthropic Ships "Dreaming" for Claude Managed Agents

Anthropic ships dreaming, outcomes, and multi-agent orchestration to Claude Managed Agents — the first production off-cycle memory consolidation for long-running agents.

May 10, 2026 · 2 min read

A 3D render of a central skill.md file orbited by colorful AI tool nodes connected by data streams, representing cross-vendor agent skill standardization

Tools · Notable

skill.md Becomes the Cross-Vendor Agent Extension Standard

Claude Code hits 100+ skills and Google Gemma 4 independently adopts the same skill.md format — cross-vendor agent extension standardization is happening.

May 4, 2026 · 2 min read

AI assistant connecting to multiple creative software tools including 3D modeling, audio production, and graphic design

Tools · Notable

Claude Gets Native Connectors for Adobe, Blender, Autodesk, and Canva

Anthropic's Claude for Creative Work lets Claude operate inside Adobe Creative Cloud, Blender, Autodesk Fusion, Ableton, and Canva — moving AI from content generator to software operator.

May 3, 2026 · 2 min read

Dark terminal with three glowing AI agent nodes and an open-source star constellation representing Warp launch

Tools · Notable

Warp Goes Fully Open-Source: AGPL v3, Agent-First Contribution Model

Warp's full Rust codebase lands on GitHub under AGPL v3 with an agent-first contribution model: Oz handles coding, you review the output. OpenAI sponsors.

May 1, 2026 · 2 min read

AI-powered code security scanner detecting vulnerabilities in a holographic enterprise codebase visualization

Tools · Notable

Anthropic Opens Claude Security Public Beta for Enterprise

Anthropic moves Claude Security to public beta for Enterprise — Opus 4.7 scans codebases, validates findings, and suggests patches, no integration required.

May 1, 2026 · 2 min read

Claude AI interface connected to eight creative tool applications via glowing MCP protocol streams in a 3D rendered dark digital workspace

Tools · Notable

Claude Gains 8 MCP Creative Connectors: Blender, Adobe CC, Ableton

Anthropic launches 8 MCP-based connectors for Claude: Blender, Adobe CC, Autodesk Fusion, Ableton, and more. Any MCP-compatible model can use them.

April 29, 2026 · 2 min read

Six microVM pods orbiting a shared plan document with three cursor markers, navy and teal palette

Tools · Notable

GitHub Next's ACE Positions Alignment as the New Coding Bottleneck

GitHub Next's ACE puts multiplayer microVM sessions at the centre of agent-driven coding — making team alignment, not implementation, the bottleneck.

April 27, 2026 · 2 min read

Engineer at glowing terminal beneath a ceiling of layered code blueprints, amber-lit

Tools

Matt Pocock's Counter-Thesis: The Codebase Is the Agent's Ceiling

Matt Pocock's two-hour AI Engineer workshop argues 30-year-old software fundamentals matter more under AI, not less — and outlines a complete methodology to prove it.

April 25, 2026 · 2 min read

Tools

Claude Code Regression: Three Harness Issues, One Public Post-Mortem

Anthropic published a post-mortem on three sequential Claude Code harness changes from March–April that degraded output quality, fixed in v2.1.116+.

April 24, 2026 · 2 min read

Green-glowing ML training terminal in darkness, reward curve ascending, no human present

Tools

ml-intern: HuggingFace Releases a Full-Loop Autonomous Post-Training Agent

ml-intern reads arXiv, cleans datasets, runs SFT/GRPO, diagnoses failures, and iterates — pushing GPQA from 10% to 32% in under 10 hours for roughly $1 of compute.

April 23, 2026 · 2 min read

Tools

Anthropic Adds Persistent Memory to Claude Enterprise

Anthropic's Claude Enterprise tier now includes cross-session persistent memory, bringing it into direct competition with OpenAI's newly announced GPT-5.1 memory features.

April 12, 2026 · 2 min read
