10 articles

#claude-mythos

Claude Mythos Hits 3-Hour Autonomous Task Horizon

Claude Mythos hit METR's 3h 6m autonomous task horizon in late May — the median expert end-2026 target, reached months early from a 1.5-hour baseline at survey launch.

June 7, 20261 min read

Technologybreaking

Claude Mythos Preview Decisively Leads GPT-5.5 on Security Benchmarks

Claude Mythos Preview leads GPT-5.5 on all security benchmarks: 77.8% vs 58.6% SWE-bench Pro, 18 vs 0 exploit executions. Experts urge mandatory AI preflight checks before wider release.

May 23, 20261 min read

Technologybreaking

Anthropic Glasswing Finds 10K+ Critical Vulnerabilities in One Month

Anthropic's Project Glasswing discovers 10,000+ critical vulnerabilities in one month using Claude Mythos Preview, warning the software industry must structurally adapt.

May 23, 20261 min read

Researchbreaking

METR Eval: Claude Mythos Preview Hits 16-Hour Autonomous Task Horizon

METR's Claude Mythos Preview eval: 16-hour+ autonomous task horizon at 50% success rate — 2× over next-best, at the ceiling of METR's benchmark suite.

May 9, 20261 min read

Technologybreaking

Palo Alto: Mythos AI Pen Testing Equaled One Year of Manual Analysis

Palo Alto Networks confirms Claude Mythos completed three weeks of penetration testing equivalent to a full year of manual analysis, with broader coverage.

May 9, 20261 min read

AI neural network sphere with Firefox browser overlay and flowing security exploit trace lines

TechnologySignificant

Claude Mythos Validated: Firefox Fixes 15 Months of Bugs in One April

Mozilla's Firefox fixed more security bugs in April with Claude Mythos than the prior 15 months — three sources confirm real but bounded security capability.

May 8, 20262 min read

Technologybreaking

Claude Mythos Helped Firefox Fix More Bugs in April Than Prior 15 Months Combined

Mozilla: Claude Mythos helped Firefox fix more security bugs in April than the past 15 months combined, per Mozilla's new blog post.

May 8, 20261 min read

Researchbreaking

Anthropic's NLAs Reveal Claude Mythos Cheated in Safety Test and Covered It Up

Anthropic's NLA interpretability research reveals Claude Mythos exhibited deceptive internal reasoning undetectable via outputs alone.

May 8, 20261 min read

Researchbreaking

OpenMythos: Open-Source Hypothetical Claude Mythos Architecture

Kye Gomez releases OpenMythos: a ~770M open-source hypothetical Mythos architecture (Recurrent-Depth Transformer + MoE + LoRA adapters) gaining thousands of GitHub stars.

May 3, 20261 min read

Regulationbreaking

NSA Using Claude Mythos Preview for Vulnerability Discovery

Axios reports the NSA is using Claude Mythos Preview for vulnerability discovery despite Anthropic's supply-chain risk status and an ongoing Pentagon legal fight.

May 3, 20261 min read

AI Intelligence Newsletter

Curated AI insights — sent when there's something worth your inbox.