🤖 AI Tools
· 6 min read

AI Dev Weekly #5: Anthropic's Too-Dangerous Model, $30B Revenue, and China's GLM-5.1 Beats Everyone


AI Dev Weekly is a Thursday series where I cover the week’s most important AI developer news — with my take as someone who actually uses these tools daily.

This was the biggest week in AI since GPT-4 dropped. Anthropic built a model too dangerous to release, hit $30B in revenue, and launched managed agents. Meta shipped its first model from the $14B Alexandr Wang deal. And a Chinese lab released an open-source model that beats GPT-5 and Claude on the hardest coding benchmark. Let’s get into it.

Anthropic built Claude Mythos — and won’t release it

Anthropic launched Project Glasswing this week, revealing Claude Mythos Preview — a model so good at finding software vulnerabilities that they decided it’s too dangerous for public access. Mythos autonomously discovered thousands of zero-day flaws across every major operating system and web browser, including a 17-year-old remote code execution bug in FreeBSD.

Partners including AWS, Apple, Google, Microsoft, CrowdStrike, and the Linux Foundation are getting early access to patch critical systems, backed by $100M in usage credits.

My take: This is either genuinely responsible AI safety or the most effective marketing campaign in tech history. Probably both. The “too dangerous to release” framing is straight from OpenAI’s GPT-2 playbook in 2019 — and it works just as well now. The difference is Mythos actually found real zero-days that are being patched, which gives the claim more credibility than “this text generator is too good.”

For developers: the practical impact is that your dependencies are getting security patches faster. The philosophical impact is that we’re entering an era where AI finds vulnerabilities faster than humans can fix them.

Anthropic hits $30B revenue, surpasses OpenAI

Anthropic’s run-rate revenue hit $30 billion, up from $9 billion at the end of 2025. They’ve surpassed OpenAI for the first time. More than 1,000 business customers now spend over $1 million annually.

They also signed a deal with Google and Broadcom for 3.5 gigawatts of next-generation TPU compute starting in 2027.

My take: The revenue number is staggering but the compute deal is the real story. 3.5 gigawatts is a small city’s worth of power. Anthropic is betting that demand for Claude will continue to grow exponentially — and given that they just launched managed agents, they’re probably right.

For context: if you’re using Claude Code or any Claude-based tool, you’re part of this revenue. The Pro subscription model is clearly working.

Z.ai’s GLM-5.1 beats GPT-5 and Claude on coding

Z.ai (formerly Zhipu AI) released GLM-5.1, a 754-billion-parameter open-source model under the MIT license that scored #1 on SWE-Bench Pro at 58.4 — beating GPT-5.4 (57.7), Claude Opus 4.6 (57.3), and Gemini 3.1 Pro (55.1).

The headline feature: GLM-5.1 can work autonomously on a single coding task for up to eight hours straight. In a demo, it built a full Linux desktop environment from scratch.

The weights are free on Hugging Face.

My take: This is the most significant open-source model release since Llama 4. An MIT-licensed model beating every proprietary model on the hardest coding benchmark changes the economics of AI coding tools. If you’re building with OpenCode or any model-agnostic tool, GLM-5.1 is worth testing immediately.

The eight-hour autonomous coding claim is wild. Most AI coding sessions today last 30 minutes before the model loses context or goes off track. If GLM-5.1 genuinely maintains coherence for eight hours, it’s a step change in what AI agents can do.

Meta ships Muse Spark — and it’s not open source

Meta debuted Muse Spark, the first AI model from its Superintelligence Labs led by Alexandr Wang (the $14.3B Scale AI acquisition). It’s proprietary — a break from Meta’s open-source Llama tradition — and powers the Meta AI app across Facebook, Instagram, and WhatsApp.

Meta says they plan to eventually open-source future Muse models.

My take: Meta going proprietary is a big deal. Llama 4 was the backbone of the open-source AI ecosystem. If Meta’s best models are now closed, the open-source community loses its biggest contributor. The “we’ll open-source future models” promise is vague enough to mean nothing.

For developers: Muse Spark is only available through Meta’s platforms for now. If you need open models, Gemma 4, Qwen 3.5, and now GLM-5.1 are your best options.

Anthropic launches Claude Managed Agents

Anthropic released Claude Managed Agents in public beta — APIs for building and deploying cloud-hosted AI agents at scale. The product handles infrastructure, state management, and permissioning.

Launch partners include Sentry (auto-fixing bugs end-to-end), Rakuten (7 hours of autonomous coding), and Notion (delegating work to Claude inside workspaces).

My take: This is Anthropic’s play to own the agent infrastructure layer. Instead of developers building their own agent loops (like we cover in our AI agent guide), Anthropic wants you to use their managed service. The tradeoff is convenience vs lock-in.

The Sentry integration is the most interesting — an AI that automatically fixes bugs when they’re detected in production. That’s the kind of agent use case that actually saves money.

OpenAI proposes robot taxes and a four-day workweek

OpenAI published a 13-page policy paper called “Industrial Policy for the Intelligence Age” proposing robot taxes, a public wealth fund, a four-day workweek, and automatic safety nets that expand when AI disruption crosses thresholds.

My take: The company building the robots is proposing the robot tax. Make of that what you will. The four-day workweek proposal is interesting because it acknowledges that AI will reduce the amount of human labor needed — which is exactly what OpenAI’s products are designed to do.

Quick hits

  • Chinese AI models swept all top 6 spots on OpenRouter’s global usage rankings. Alibaba’s Qwen 3.6 Plus topped the list with 4.6 trillion weekly tokens.
  • Anthropic cut off subscription access for third-party tools like OpenClaw, requiring separate pay-per-token billing. If you’re using Claude through a third-party harness, check your billing.
  • Claude had back-to-back outages Monday and Tuesday. Growing pains from tripling revenue in four months.
  • An AMD AI director called Claude Code “dumber and lazier” since recent updates, filing a detailed GitHub issue calling it “unusable for complex engineering tasks.”
  • OpenAI acquired TBPN, a tech talk show with under 60K YouTube followers, for reportedly hundreds of millions. Nobody understands why.
  • Japan relaxed privacy laws to make itself the “easiest country to develop AI,” removing opt-out options for personal data use.
  • Research found “cognitive surrender” — AI users increasingly abandon logical thinking, uncritically accepting faulty AI answers.

What I’m watching

The GLM-5.1 release is the story to watch. If an MIT-licensed model genuinely beats GPT-5 on coding, the pricing pressure on OpenAI and Anthropic will be enormous. Why pay $20/month for Codex CLI when a free model does it better?

The managed agents space is heating up fast. Anthropic, OpenAI, and Google are all racing to be the platform where developers build agents. If you’re building anything with AI agents, now is the time to evaluate your options — before you’re locked into one ecosystem.

And the Anthropic revenue number ($30B) tells us something important: developers are willing to pay for AI tools. The market is real. The question is whether open-source alternatives like GLM-5.1 and Gemma 4 will compress those margins.

See you next Thursday. If you found this useful, share it with a developer friend who’s still reading AI news from three sources instead of one.


Previous issues: #4: Anthropic Leaks Everything · #3: Claude Code Auto Mode

Related: How to Choose an AI Coding Agent · AI Coding Tools Pricing · GLM-5.1 Complete Guide