NVIDIA Releases SANA: 20× Smaller, 100× Faster Than Flux-12B
NVIDIA's SANA family runs 20× smaller and 100× faster than Flux-12B, generating 1024px images in 0.1s on H100 with full Diffusers, ComfyUI, and SGLang integration on day one.
NVIDIA's SANA family runs 20× smaller and 100× faster than Flux-12B, generating 1024px images in 0.1s on H100 with full Diffusers, ComfyUI, and SGLang integration on day one.
ByteDance and Kuaishou lead US rivals in video generation — their short-form video training corpora from TikTok and Kuaishou are a structural data moat compute alone cannot close.
Sulphur 2: uncensored open-weights video model on LTX base, 10s/24fps clips on 16GB VRAM, 125K+ training videos. Available on HuggingFace via ComfyUI/Pinokio.
Higgsfield's new MCP and CLI integration for Claude Code turns a single chat prompt into full brand creative packages — research, photos, ads, and UGC videos — with reusable skill files.
Alibaba Happy Horse tops Artificial Analysis video leaderboard but fails independent real-world tests for physics and prompt fidelity — a benchmark contamination signal.
HeyGen open-sources HyperFrames: an HTML-to-video framework via Puppeteer/FFmpeg with native Claude Code, Cursor, Codex, and Gemini CLI skills. Apache 2.0, Node.js 22+.
Pika Labs launches Pika Agents — a conversational AI creative partner with voice, face, and personality — replacing the traditional prompt-box paradigm.
Microsoft's World-R1 applies 3D spatial constraint reinforcement during text-to-video generation to improve geometric coherence in synthesised video.
Curated AI insights — sent when there's something worth your inbox.