agentmemory Crosses 11.6k Stars: Persistent Memory Daemon for Coding Agents
agentmemory hits 11.6k GitHub stars as a cross-agent persistent memory daemon: 92% fewer tokens/session, 95.2% retrieval accuracy, SQLite-only, Apache-2.0.
agentmemory hits 11.6k GitHub stars as a cross-agent persistent memory daemon: 92% fewer tokens/session, 95.2% retrieval accuracy, SQLite-only, Apache-2.0.
AgentWall is an open-source MCP-proxy runtime safety layer achieving 92.9% enforcement accuracy with sub-millisecond overhead across Claude, Cursor, and Windsurf deployments.
LangChain Deep Agents v0.6 ships with harness profiles, an in-loop code REPL for dataset processing without bash, and streaming — the biggest LangChain OSS release yet.
NVIDIA's SANA family runs 20× smaller and 100× faster than Flux-12B, generating 1024px images in 0.1s on H100 with full Diffusers, ComfyUI, and SGLang integration on day one.

Anthropic's '2028' essay frames the AI finish line as recursive self-improvement, splitting the industry on compute restriction vs open-source export strategy.
Cline open-sources its runtime as @cline/sdk: plugins, scheduled agents, persistent sessions, any model/provider. Repositions from AI IDE to agent infrastructure.
Resemble AI Dramabox: first open-source voice model with built-in cryptographic provenance signature. Expressive performance plus verifiable ownership. Available now.
GLM 5.1 tops Artificial Analysis intelligence index over all closed models. Chinese open-weights model leads SWE-Bench Pro. Intelligence index growth exceeding Moore's Law.
Sulphur 2: uncensored open-weights video model on LTX base, 10s/24fps clips on 16GB VRAM, 125K+ training videos. Available on HuggingFace via ComfyUI/Pinokio.
Datadog Toto 2.0: open-weights TSFM family 4M–2.5B params, Apache 2.0. First time series FM with reliable scaling laws. Leads BOOM, GIFT-Eval, TIME benchmarks.
Hugging Face Hub hits 1M open datasets: the 500K-to-1M doubling took 8 months vs 4 years for the first half — CEO credits AI agents for the acceleration.
ByteDance open-sources a 7B desktop GUI control model capable of operating any application without app-specific training — 10k GitHub stars on day one.
GGUF local models on Hugging Face hit 176K with monthly creation rates doubling since March — local AI adoption has crossed an inflection point.
HiDream-01 ranks #8 on Artificial Analysis globally — the top open-source image model — with no-VAE pixel generation and best-in-class text rendering.
Zyra's Zia 1 8B is the first frontier-tier model trained fully on AMD Instinct GPUs — Apache 2.0, fits consumer hardware, competes with 235B models.
AI Alliance's Project Tapestry convenes in Paris to build open, sovereign AI infrastructure for seven nations across Asia, Europe, and Southeast Asia.
Mistral's first open-source frontier TTS model delivers 17ms first-audio latency on a single GPU — a step change for voice-enabled AI agents.
Reachy Mini gets a fully open-source voice agent backend on Hugging Face infra — 3,000+ robots connected in 48 hours, costs cut from $20+/day to near zero.
LangChain and Harvey released LAB, an open-source benchmark for AI agents on long-horizon legal tasks covering research, analysis, and document drafting.
OpenAI and AMD, Broadcom, Intel, Microsoft, and NVIDIA released MRC — an open networking protocol reducing wasted GPU time in large AI training clusters.
Voice-Pro OSS: full YouTube → 100-language dubbed video pipeline on 4GB NVIDIA GPU, free and local, replaces $23-48/hr cloud dubbing services. 3,439 GitHub likes and rapidly viral.
Turkey's Presidential Communications Directorate joins Hugging Face as first public institution on the platform — HF CEO Clément Delangue calls for global government sovereign AI via open-source.
Superlinked 'Sie' open-sourced: hot-swap multi-model inference on single GPU, per-family forward passes (BERT/Qwen/ColBERT), variable-length flash attention, KEDA auto-scaling. Addresses 'context rot' preprocessing gap.
DocuSeal (AGPL-3.0, 11K+ stars) replicates DocuSign in 30 minutes on a €5/mo VPS — 13 field types, eIDAS/HIPAA compliance, vs DocuSign's $17,250/yr median contract.
Nous Research's Hermes Agent v0.9.0 records sessions as reusable skill trajectories — older installs outperform fresh ones as skills compound over time. Supports Claude, GPT, Kimi, Ollama, OpenRouter.
After Claude rate limits, a researcher ran 10M+ tokens on DeepSeek V4 and was shocked by the cost gap. swyx: 'Efficiency is back on the menu again boys.'
AgenticQwen-30B-A3B scores 50.2 avg matching Qwen3-235B on tool-use benchmarks. Dual RL flywheels flip the cost curve for production agents.
Vibe-kanban shut down live onstage at AIE Europe despite 30K MAU; CEO cited failure to enterprise-sell and failure to resell tokens. Project now open-sourced.
Aiden open-sources a local AI OS for Windows/Linux: 1,500+ skills, 89+ tools, 6-layer knowledge graph memory, subagent swarms, voice, Discord/Telegram — Ollama-backed.
VectifyAI's OpenKB builds structured wikis from raw documents without chunking or vectors, using LLMs to accumulate cross-document links and contradiction detection.
Qwen3.6-27B (Apache 2.0) beats Qwen3.5-397B on coding benchmarks, validating that targeted training recipe outperforms brute parameter scale for agentic coding tasks.
Ai2 releases BAR (Branch-Adapt-Route): modular MoE post-training with +16.5 coding and +13 math on BAR-5x7B, linear per-domain update cost. Apache 2.0, full checkpoints.
Kye Gomez releases OpenMythos: a ~770M open-source hypothetical Mythos architecture (Recurrent-Depth Transformer + MoE + LoRA adapters) gaining thousands of GitHub stars.
NVIDIA open-releases Nemotron 3 Nano Omni (30B MoE/3B active): unified video/audio/image/text model with 9× video-reasoning capacity improvement vs. predecessors.
Tencent open-sources Cube Sandbox: a RustVMM/KVM agent VM runtime with sub-60ms cold start, under 5MB per instance, and drop-in E2B SDK compatibility.
HeyGen open-sources HyperFrames: an HTML-to-video framework via Puppeteer/FFmpeg with native Claude Code, Cursor, Codex, and Gemini CLI skills. Apache 2.0, Node.js 22+.
Agent-generated PRs to HuggingFace transformers quadrupled; bulk-merging hundreds of them showed zero benchmark regression on arc_challenge, gsm8k, hellaswag.
APEX-Agents benchmark for consultant/lawyer/banker-level AI work now has a Hugging Face leaderboard for open-source model evaluation.
NimbleList is an open-source GUI layered over Claude Code and Codex CLI, adding Kanban, Mermaid, Excalidraw, and plan.md without a new SaaS subscription.
China's Ling-2.6-1T—a publicly available, inspectable trillion-parameter model—claims to use fewer tokens than comparable US efficient models.
TIME100 names Hugging Face (top-10 AI) and Mistral AI among 2026's most influential companies—both positioned explicitly against closed-model lock-in.
Warp terminal open-sources its full Rust codebase under AGPL v3, hitting 40K stars in hours, with OpenAI as founding sponsor powering its Oz agent system.
DeepSeek V4 is out: 1.6T open-source parameters, 1M token context, 3.7× fewer FLOPs than V3.2, scoring 120/120 on Putnam 2025.

Warp's full Rust codebase lands on GitHub under AGPL v3 with an agent-first contribution model: Oz handles coding, you review the output. OpenAI sponsors.
BAML's Schema-Aligned Parsing extracts typed structured outputs from any LLM — including Deepseek-R1 and O1 — regardless of native tool-calling support.
OpenAI's open-source realtime-voice-component React package lets developers add GPT-powered voice control to apps via narrow action definitions.
Shimmy v1.9.0 is a 4.8MB single-binary OpenAI-compatible local inference server that bundles all GPU backends and claims 142x size advantage over Ollama.
Developers running DeepSeek V4 Flash with 2-bit selective GGUF via llama.cpp describe it as 'the first time I feel I have a frontier model running on my computer' — a milestone for local AI.
Google DeepMind releases Gemma 4 under Apache 2.0: four models including a 31B dense ranking #3 globally on LM Arena and on-device E2B/E4B variants with audio, vision, and 256K context.
OpenAI open-sources Symphony, an agent orchestration spec that uses project management boards as control planes for coding agents — replacing bespoke orchestration glue with a shared standard.
HuggingFace releases smol-audio: an open notebook collection covering local fine-tuning of Whisper, Parakeet, Voxtral, Audio Flamingo 3, and Dia-1.6B TTS — a complete local audio AI toolkit.
Unsloth, the fine-tuning optimization library, has surpassed Microsoft on HuggingFace to enter the top 10 most-followed organizations — reflecting open-source momentum over Big Tech on the platform.
DS2API exposes DeepSeek Web as OpenAI/Claude/Gemini wire-compatible APIs — reverse-engineered, disclaimer-heavy, but a clear signal of demand for free API access to frontier-adjacent models.
Qwen3.6-27B drops quietly under Apache 2.0: AAII score 46, optimized for M-series local inference, strong agentic coding — the best dense local model available.

DeepSeek-V4's MIT-licensed 1M-context MoE and Kimi-K2.6's multimodal orchestration create the first complete open-weights agentic deployment stack.
Decepticon is an open-source autonomous AI red-team agent that reasons through attack paths and tests business logic—governed by strict scope, isolation, and logging requirements.
NVIDIA Ising uses AI to shrink quantum computer calibration from days to hours and provides 3D neural network error correction faster than existing open-source methods.
Qwen3.6-35B-A3B distilled from Claude Opus 4.6 reasoning traces runs full agentic workflows locally at 13 GB in 2-bit mode—raising significant provider ToS questions.
RF-DETR (ICLR 2026) sets new COCO detection SOTA; the Apache 2.0-licensed RF-DETR-L outperforms YOLO26-X on accuracy, ending AGPL lock-in for production CV.
Xiaomi MiMo 2.5 Pro ties #1 on Artificial Analysis and autonomously built a full desktop video editor in 11.5 hours—open-source release incoming.
lakehq/sail: Rust-native Spark replacement built on DataFusion and Arrow hits 4× faster TPC-H, 94% cost reduction, zero shuffle spill — PySpark code runs unchanged.
llama.cpp hits 100K GitHub stars; creator @ggerganov predicts 90% of AI agents will run locally within 3–6 months as local model quality crosses the agentic threshold.
DeepSeek V4-Pro open-sourced with 1.6T params, 1M context window, and 10x KV cache reduction vs V3.2 — #1 HuggingFace trending in 43 minutes.
Moonshot's Kimi K2.6 runs 300 parallel sub-agents for 12+ hours autonomously at $0.60/M input tokens — open-weight, HuggingFace-available.
Meta releases Sapiens2 — high-res vision transformers pretrained on 1B human images for pose estimation, segmentation, depth, and point maps.
LlamaIndex open-sources LiteParse—a VLM-free, ML-free PDF parser using a 6-step grid projection algorithm that handles tables and complex layouts fast.
TrackioApp launches free local-first trace logging for AI agents, offering lightweight observability with no cloud dependency or cost.
HuggingFace's ml-intern autonomously runs the full ML research-to-training loop, lifting GPQA from 10% to 32% on a 1.7B model in <10 hours and beating Codex on HealthBench by 60%.
OpenAI's Privacy Filter — a 1.5B MoE model for PII detection under Apache 2.0 — is the company's first open-weight release of 2026, running on-device with 128k context.
Qwen3.6-27B (Apache 2.0) claims to outperform the 397B Qwen3.5 MoE and Claude Opus 4.5 on coding benchmarks, running locally on 18GB RAM.

ml-intern reads arXiv, cleans datasets, runs SFT/GRPO, diagnoses failures, and iterates — pushing GPQA from 10% to 32% in under 10 hours for roughly $1 of compute.
Alibaba's Apache 2.0 27B model outperforms Qwen3.5-397B-A17B on all major coding tasks and runs locally on 18 GB RAM — 'bye bye subscription era' claims are spreading.
Meta released Llama 4 under an updated open-source license featuring a 2M token context window.
Meta released Llama 4 under its updated open-source license, featuring built-in context engineering primitives and a 2M token context window — a significant milestone for the open-source LLM ecosystem.
A comparative analysis of the open-source LLM ecosystem entering Q2 2026 — benchmarking performance against proprietary alternatives, mapping the licensing landscape, and calculating total cost of ownership for self-hosted deployments.
Curated AI insights — sent when there's something worth your inbox.