AI Dev Weekly #2: Garry Tan's 'God Mode', Cursor Composer 1.5, and Anthropic Finds Firefox Bugs
AI Dev Weekly is a Thursday series where I cover the week’s most important AI developer news — with my take as someone who actually uses these tools daily.
This was a weird week. The biggest AI coding story wasn’t a product launch — it was a VC posting his prompt files on GitHub and half the internet calling him a genius while the other half called him delusional.
Garry Tan’s “gstack” breaks the internet
Y Combinator CEO Garry Tan shared his Claude Code setup on GitHub last Wednesday, calling it “gstack.” It’s a collection of 13 skill files — basically reusable prompts that tell Claude Code how to behave in specific roles. One acts as a CEO evaluating ideas, another writes code as an engineer, another reviews that code for bugs and security issues.
Within days it had 20,000 GitHub stars and 2,200 forks. At SXSW on Saturday, Tan told Bill Gurley he has “cyber psychosis” and sleeps four hours a night because he’s so excited about coding with AI agents. He claimed he recreated a startup that originally took $10 million and 10 people — solo, with Claude Code.
Then a CTO friend texted him that gstack was “god mode” and had instantly found a cross-site scripting vulnerability his team had missed. Tan posted the screenshot. That’s when things got spicy.
The backlash was immediate. One founder said the CTO should be fired if a prompt file found bugs his team couldn’t. A vlogger titled his response “AI is making CEOs delusional.” The most common criticism: gstack is just a bunch of prompts in text files. Developers who use Claude Code already have their own versions.
My take: Both sides are right, which is what makes this interesting. Gstack isn’t revolutionary technology — it’s prompt engineering with good organization. But the pattern it demonstrates is genuinely useful: treating Claude Code like a team of specialists instead of one general-purpose assistant. CEO mode for evaluation, engineer mode for building, reviewer mode for catching bugs. That workflow produces better output than just saying “build this feature.”
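To make the "team of specialists" pattern concrete, here is a minimal sketch of chaining role prompts so each role's output feeds the next. The role texts, the `run_pipeline` helper, and the `fake_model` stand-in are all my own illustrative assumptions; none of this is taken from gstack's actual files or from any Claude Code API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical role prompts, written for illustration only (not gstack's).
ROLES = {
    "ceo": "You are a CEO. Evaluate whether this idea is worth building.",
    "engineer": "You are an engineer. Implement the approved plan.",
    "reviewer": "You are a code reviewer. Look for bugs and security issues.",
}


@dataclass
class Step:
    role: str
    output: str


def run_pipeline(task: str, call_model: Callable[[str, str], str]) -> list[Step]:
    """Pass the task through each role in order, feeding each output forward."""
    steps = []
    current = task
    for role in ("ceo", "engineer", "reviewer"):
        current = call_model(ROLES[role], current)
        steps.append(Step(role, current))
    return steps


def fake_model(system_prompt: str, user_input: str) -> str:
    # Stand-in for a real model call so the sketch runs offline:
    # it just tags the input with the first sentence of the role prompt.
    return f"[{system_prompt.split('.')[0]}] {user_input}"


if __name__ == "__main__":
    for step in run_pipeline("add login page", fake_model):
        print(step.role, "->", step.output)
```

In a real setup you would swap `fake_model` for whatever client you actually use. The point is only the shape: separate prompts per role, run in a fixed order, instead of one catch-all instruction.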
The real story isn’t gstack itself. It’s that the CEO of Y Combinator — the most influential startup accelerator on the planet — is publicly saying AI agents are replacing entire engineering teams. Whether you think he’s right or delusional, that signal matters. Every YC founder is watching.
Cursor ships Composer 1.5 with RL-trained self-summarization
While everyone was arguing about prompt files, Cursor quietly shipped something actually technical. Composer 1.5 uses reinforcement learning for self-summarization — meaning when the AI hits context limits during long coding tasks, it summarizes its own work to keep going instead of losing track.
The key number: 50% reduction in compaction errors. If you’ve ever had Cursor forget what it was doing halfway through a complex refactor, this is the fix. The self-summarization triggers recursively, so it can handle tasks requiring hundreds of sequential actions without degrading.
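For intuition, here is a rough sketch of what recursive compaction can look like in an agent loop. This is not Cursor's implementation and it leaves out the RL-trained summarizer entirely. The `compact` and `run_agent` helpers, the character-count budget, and the toy `count_summarizer` are assumptions I made up to show the control flow: when the history outgrows the budget, older entries (including any earlier summary) get folded into a new summary.

```python
from typing import Callable

Summarizer = Callable[[list[str]], str]


def compact(history: list[str], budget: int, summarize: Summarizer,
            keep_recent: int = 2) -> list[str]:
    """Fold older entries into one summary when the history exceeds the budget.

    Size is measured in characters as a crude proxy for tokens.
    """
    total = sum(len(entry) for entry in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    # `older` may already start with a previous summary, so summaries get
    # re-summarized over time. That is the "recursive" part.
    return [summarize(older)] + recent


def run_agent(actions: list[str], budget: int, summarize: Summarizer) -> list[str]:
    """Append each action to the history, compacting whenever needed."""
    history: list[str] = []
    for action in actions:
        history.append(action)
        history = compact(history, budget, summarize)
    return history


def count_summarizer(entries: list[str]) -> str:
    # Toy stand-in for a model: it only records how many original steps
    # the summary covers, including steps inside earlier summaries.
    covered = 0
    for entry in entries:
        covered += int(entry.split()[1]) if entry.startswith("summary:") else 1
    return f"summary: {covered} steps"


if __name__ == "__main__":
    actions = [f"edit file_{i}.py" for i in range(10)]
    print(run_agent(actions, budget=50, summarize=count_summarizer))
    # prints ['summary: 8 steps', 'edit file_8.py', 'edit file_9.py']
```

The toy summarizer throws away almost everything, which is exactly the failure mode a better summarizer is supposed to avoid. As I read the announcement, the quality of that summarization step is what Cursor is training with RL.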
My take: This is the kind of unglamorous infrastructure work that actually moves the needle. Long-running coding agents are the future: you describe a feature, walk away, and come back to a PR. But they only work if the AI can maintain context across hundreds of file edits. Cursor is solving that problem with RL instead of just throwing more context window at it. It's a smart approach, and it puts pressure on Claude Code and Copilot to match it. For a full breakdown of how Cursor and Claude Code compare, see my head-to-head comparison.
Claude Opus 4.6 finds 22 Firefox vulnerabilities
Anthropic ran a two-week security audit where Claude Opus 4.6 autonomously analyzed Firefox’s codebase and discovered 22 previously unknown vulnerabilities. Not toy bugs — real security issues in production browser code that human security researchers had missed.
This is a different kind of benchmark. Not “can AI write a to-do app” but “can AI find bugs that professional security teams can’t.” The answer, apparently, is yes.
My take: This is the most underrated story of the week. Everyone’s debating whether AI can replace developers, but the more immediate impact is AI augmenting security teams. Most companies can’t afford dedicated security researchers. If Claude can find XSS vulnerabilities that experienced CTOs miss (looking at you, Garry’s friend), the ROI on running AI security audits is enormous. Expect every major company to start doing this within six months.
Quick hits
Cursor hit a milestone — their blog teased a preview of long-running coding agents achieving 1,000 commits per hour. That’s not a typo. We’re entering the era of AI that doesn’t just suggest code but ships entire features autonomously.
OpenAI launched GPT-5.4 Mini and Nano — smaller, cheaper models optimized for coding and agents. The race to the bottom on pricing continues. DeepSeek V3.2 at $0.28/M tokens already forced everyone’s hand, and now OpenAI is responding with stripped-down models for high-volume use cases.
Andrew Ng launched Context Hub — an open-source tool for managing context in coding agents. If you’re building anything with AI agents, this solves the “how do I feed the right files to the AI” problem that everyone hacks around with custom scripts.
Anthropic launched Claude Agent SDK for Xcode — iOS developers can now use Claude agents directly in Apple’s IDE. The walled garden is opening up, one integration at a time.
The pattern this week
Every story this week points in the same direction: AI coding is moving from “assistant that suggests code” to “agent that ships features.” Garry Tan is using Claude Code as a virtual engineering team. Cursor is building infrastructure for agents that run for hours. Claude is autonomously finding security bugs.
The developers who’ll thrive aren’t the ones who can type fastest. They’re the ones who can direct AI agents effectively — which, ironically, is exactly what Garry Tan’s gstack is trying to do, even if the execution is just “a bunch of prompts.”
AI Dev Weekly drops every Thursday. Subscribe on the homepage so you don’t miss it.
Related: AI Dev Weekly #1: Claude Code Takes the Crown, Musk Raids Cursor