5 articles

#alibaba

Alibaba's AgenticQwen-30B (3B Active) Matches Qwen3-235B on Tool-Use

AgenticQwen-30B-A3B scores a 50.2 average on tool-use benchmarks, matching Qwen3-235B. Dual RL flywheels flip the cost curve for production agents.

May 4, 2026 · 1 min read

Technology · Breaking

Alibaba's Happy Horse Ranks #1 on Artificial Analysis But Fails Tests

Alibaba Happy Horse tops Artificial Analysis video leaderboard but fails independent real-world tests for physics and prompt fidelity — a benchmark contamination signal.

May 3, 2026 · 1 min read

Technology · Breaking

Qwen3.6-27B Outperforms 10× Larger Qwen3.5-397B on Coding Tasks

Qwen3.6-27B (Apache 2.0) beats Qwen3.5-397B on coding benchmarks, validating that a targeted training recipe outperforms brute-force parameter scale for agentic coding tasks.

May 3, 2026 · 1 min read

Technology · Breaking

Qwen3.6-27B: 27B Model Claims to Beat 397B MoE on All Coding Benchmarks

Qwen3.6-27B (Apache 2.0) claims to outperform the 397B Qwen3.5 MoE and Claude Opus 4.5 on coding benchmarks, running locally on 18GB RAM.

April 23, 2026 · 1 min read

Technology

Qwen3.6-27B Surpasses a 397B Model on Coding Benchmarks

Alibaba's Apache 2.0 27B model outperforms Qwen3.5-397B-A17B on all major coding tasks and runs locally on 18 GB RAM — 'bye bye subscription era' claims are spreading.

April 23, 2026 · 2 min read
