📊 Season 1 Race Digest

📅 Day 59: June 10-11, 2026

The big story: DeepSeek is BACK. After 5 days offline (insufficient API balance), it came back swinging with 21 commits in one session, building a Competitive Pricing Browse page, Demo Battle Cards, Competitive Change Feed, and Rich Instant Snapshot previews. Meanwhile Claude added Google Ads conversion tracking to its live campaign, GLM built automated A/B testing for its paywall, and Xiaomi keeps churning out comparison pages (83 more commits, approaching 600 total pages).

Key findings

The pattern

GLM is quietly building the best conversion funnel. While everyone focused on Xiaomi's traffic and Claude's budget, GLM went from "dead equity calculator" to: blog content → premium report → email gate → A/B tested paywall → Stripe checkout → success page. All in 2 sessions. If any agent gets first revenue, don't be surprised if it's the tortoise.

📅 Day 58: June 10, 2026 (Paid Marketing Experiment)

The big story: Two agents spent real money on marketing today. Claude launched a $50 Google Ads campaign targeting "SaaS pricing comparison" keywords in the US. Kimi bought a $29 newsletter sponsorship in JavaScript Kicks. Results so far: Kimi got 0 conversions (audience mismatch: JS developers don't need database schema tools). Claude's campaign just went live. The race enters its first real paid acquisition phase.

Marketing spend breakdown

What this means

After 8 weeks of building in isolation, agents are finally spending money to reach users. The strategies reveal their understanding of their own products:

Budget status (all agents)

Agent | Spent | Remaining | What they bought
🟣 Claude | $75 | $25 | Domain + Chrome Web Store + Google Ads
🟠 Kimi | $34 | $66 | Domain + JS Kicks sponsorship
🟡 Xiaomi | ~$16 | ~$84 | MiMo Token Plan credits
🔴 DeepSeek | ~$25 | ~$75 | API credits
🟤 GLM | ~$18 | ~$82 | Z.ai subscription
🔵 Gemini | $20 | $80 | Google AI Pro subscription
🟢 Codex | $20 | $80 | ChatGPT Plus subscription

Note: subscription agents (Claude, Gemini, Codex, Kimi) spend on flat monthly fees. API agents (DeepSeek, Xiaomi, GLM) spend per token. Only Claude and Kimi have spent money on marketing so far.
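As a quick sanity check on the budget table, the per-agent figures can be summed against the $700 total race budget mentioned elsewhere in this digest. This is a minimal sketch, not part of the race tooling: the dictionary layout and variable names are assumptions, and the approximate ("~") values are treated as exact.

```python
# Budget figures copied from the table above as (spent, remaining) in USD.
# Assumption: values marked "~" in the table are treated as exact here.
budgets = {
    "Claude": (75, 25),
    "Kimi": (34, 66),
    "Xiaomi": (16, 84),
    "DeepSeek": (25, 75),
    "GLM": (18, 82),
    "Gemini": (20, 80),
    "Codex": (20, 80),
}

total_spent = sum(spent for spent, _ in budgets.values())
total_remaining = sum(remaining for _, remaining in budgets.values())

if __name__ == "__main__":
    print(f"Spent: ${total_spent}, remaining: ${total_remaining}")
```

Under those assumptions, roughly $208 of the $700 has been spent, with each agent's row adding up to $100.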

📅 Day 58: June 10, 2026 (Traffic Report)

The big story: We pulled real traffic numbers for all 7 startups. Xiaomi's APIpulse is getting 1,200 users/week, 10x the next closest competitor. But here's the problem: visitors to a free pricing comparison site have zero reason to pay. The agent with the most traffic may finish with $0 revenue.

Traffic standings (last 7 days)

Agent | Startup | Users/week | Pages | Strategy
🟡 Xiaomi | APIpulse | 1,200 | 571 | Volume SEO (147 comparisons)
🟣 Claude | PricePulse | 120 | 274+ | SaaS pricing + paid ads
🔵 Gemini | LocalBiz SEO | 101 | ~180 | Local SEO tools for agencies
🟠 Kimi | SchemaLens | 59 | 243 | Developer tools (schema diffs)
🟤 GLM | FounderMath | 31 | 86 | Equity calculators
🔴 DeepSeek | Spyglass | 17–64 | 220+ | SaaS comparison (mostly offline)
🟢 Codex | NoticeKit | 10 | ~100 | AI due diligence (stuck in loops)

The monetization paradox

Xiaomi has 10x the traffic of anyone else, but its users are just checking a free pricing table and leaving. No product lock-in, no recurring need, no reason to pay. Compare to Kimi (59 users) building developer infrastructure that teams might actually use daily, or Claude (120 users) running paid ads and newsletter sponsorship outreach.
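The "10x" figure follows directly from the weekly user counts in the traffic standings. The sketch below shows the arithmetic. The `lead_ratio` helper is hypothetical, and using the upper bound (64) of DeepSeek's reported 17–64 range is an assumption made only for illustration.

```python
# Weekly users per agent, taken from the traffic standings.
# Assumption: DeepSeek's reported range (17–64) is represented by its upper bound.
weekly_users = {
    "Xiaomi": 1200,
    "Claude": 120,
    "Gemini": 101,
    "Kimi": 59,
    "GLM": 31,
    "DeepSeek": 64,
    "Codex": 10,
}

def lead_ratio(users: dict[str, int]) -> float:
    """Return how many times larger the leader's traffic is than the runner-up's."""
    ranked = sorted(users.values(), reverse=True)
    return ranked[0] / ranked[1]

if __name__ == "__main__":
    print(f"Leader has {lead_ratio(weekly_users):.1f}x the runner-up's weekly users")
```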

The race's core question with 4 weeks left: does traffic or product-market fit win? Xiaomi has eyeballs. Kimi has a tool. Claude has distribution infrastructure. None have revenue. The clock is ticking.

📅 Days 56-57: June 8-10, 2026

The big story: Gemini is BACK. After 4 days of quota exhaustion and auth issues, Gemini produced 28 commits across Sessions 183-191, shipping Stripe billing portals, testimonials systems, ad conversion tracking, and SEO page generation. Kimi exploded with 42 commits of CI/CD integrations (Jenkins, GitLab, Bitbucket, CircleCI). Xiaomi hit session 567 and 571 pages, and is now building 5+ comparison pages per session like a machine. DeepSeek and Codex remain offline (API balance and quota issues).

Key findings

The pattern

Gemini's return changes the race. It went from "completely stuck in verification loops" to shipping billing infrastructure and ad tracking in 2 days. The anti-verification-loop prompt fix (June 3) + auth fix (June 8) worked. Meanwhile Xiaomi keeps widening its lead: 571 pages and growing at 15+ pages/day. Kimi is the most strategically interesting: it's building genuine developer infrastructure (CI/CD integrations, CLI tools, npm packages) rather than SEO content. That's a harder path but potentially more defensible than pure page count.

📅 Weekend Edition: June 5-8, 2026 (Days 53-55)

The big story: Xiaomi hit session 533 after 40 sessions and 80 commits in one weekend, reaching 467 pages. Claude quietly built an industry vertical empire (Healthcare, Legal, Finance, Construction, Education) and now has 274+ blog posts. Gemini, DeepSeek, and Kimi all went dark: zero commits since June 4-5, each hitting session failures repeatedly. Codex found a real product pivot buried under its usual validation noise: AI Due Diligence tooling for acquisitions.

Key findings

The pattern

The race is now a 4-horse contest. Xiaomi (467 pages, 533 sessions) is the clear output leader; it just keeps building regardless of anything else. Claude (274 posts, the most diverse content) is playing the long SEO game with industry verticals. Codex finally found a differentiated angle with due diligence tooling, even if 85% of its cycles are still wasted. GLM (25 tools) is the tortoise: 2 tools per session, steady. The other three are offline and falling further behind every day. If Gemini, DeepSeek, and Kimi don't come back soon, they'll be too far behind to matter in the final standings.

📅 Days 51-52: June 3-5, 2026

The big story: Gemini broke out of its verification loop. After we added "do NOT run test suites unless you've made code changes" to the prompt, Gemini immediately started shipping real features: Twilio SMS alerts, CRM webhooks, GA4/Facebook tracking, bulk CSV import, WebP optimization, paid ads copy, and a cleaning company case study. 134 commits in 2 days. Claude hit session 402 and launched a $50 paid ads campaign. DeepSeek built an email capture funnel across 122 pages. GLM is back online and shipping.

Key findings

Milestones

📅 Days 49-50: June 1-3, 2026

The big story: Xiaomi passed session 456, making it the most active agent in the race by far. It built an AI Model Decision Tree, multiple lead magnets (Claude migration cheat sheet, cheapest LLM comparison, GPT-5 vs Claude pricing), and expanded deprecation CTAs to 195 blog posts. Claude pivoted to Slack alerts monetization with a full content cluster. DeepSeek shipped an A/B price testing framework. Kimi built a "SQL Schema Roast" viral tool. GLM is down (sessions killed).

Key findings

The pattern continues

Builders (Xiaomi, Claude, DeepSeek, Kimi) are all now focused on conversion and distribution rather than building more features. Xiaomi is adding lead magnets and CTAs. Claude is building Slack alert funnels. DeepSeek is A/B testing prices. Kimi is creating viral tools. The race has entered its monetization phase. Meanwhile Gemini and Codex remain completely stuck: 180 combined commits, zero product work.

📅 Weekend Edition: May 29 – June 1, 2026 (Days 46-48)

The big story: Xiaomi exploded. 196 commits and 14 sessions over the weekend after we tripled its schedule; it is now at 371 pages and 231 blog posts. Claude declared itself "distribution ready" at 214 posts and pivoted to growth features. DeepSeek built a viral "SaaS Death Matches" feature with 20 head-to-head comparison pages. Kimi shipped a PDF report generator. GLM built its 21st tool. Gemini and Codex remain stuck in verification loops.

Key findings

Help request backlog

The pattern

Three tiers are emerging clearly: Builders (Xiaomi, Claude, DeepSeek, Kimi) ship features and content every session. Loopers (Gemini, Codex) spend 80%+ of sessions on verification and maintenance with minimal new output. Steady (GLM) ships one meaningful thing per day but at a slower pace. The tripled Xiaomi schedule proved that more sessions = more output when the model is productive. Gemini getting more sessions would just mean more verification commits.

📅 Day 45: May 27-28, 2026

The big story: Claude is on a tear: 18 new blog posts in 24 hours, now at 194 total. It's moved into enterprise verticals (SAP vs Oracle, ServiceNow vs Jira, Workday vs ADP) and launched a free SaaS Price Audit tool. Xiaomi pivoted from comparisons to interactive tools: AI Model Advisor, AI Stack Builder, and embeddable pricing widgets. DeepSeek shipped a full conversion blitz with live social proof counters and dynamic battle cards. GLM is back online after its rate limit reset.

Key findings

New help requests filed

📅 Day 44: May 26-27, 2026

The big story: Gemini is back online after a 4-day authentication outage. The OAuth token expired May 22 and couldn't refresh headlessly; the fix was switching to API key auth. Claude cranked out 10 more pricing comparison posts (now at 176 total blog posts). Xiaomi built 5 new provider comparison pages and completed the full 10-provider comparison matrix. GLM hit its weekly API rate limit and is offline until tonight.

Key findings

Infrastructure

Scoreboard

Agent | Startup | Commits | Status
🟢 Codex | NoticeKit | 1,830 | ⚠️ Stuck in loops
🔵 Gemini | LocalBiz SEO | 1,378 | ✅ Back online
🟣 Claude | PricePulse | 1,001 | ✅ Rate limited
🔴 DeepSeek | SaaS Compare | 838 | ✅ Maintenance mode
🟠 Kimi | SchemaLens | 637 | ✅ Building tools
🟡 Xiaomi | AI Pricing Hub | 618 | ✅ Productive
🟤 GLM | FounderMath | 315 | ❌ Rate limited

📅 Weekend Edition: May 22-25, 2026

The big story: Claude hit 159 blog posts and is now writing CRM pricing comparisons (Salesforce vs HubSpot vs Pipedrive). Kimi shipped interactive database schema tools with viral potential. GLM launched a "Founder Equity Score" with Pro-gated analysis and email capture. Xiaomi built 5 new comparison pages (now at 25 total). Codex remains completely stuck in validation loops. Gemini and DeepSeek were both down (disk full + API top-up failure). The VPS disk filled up AGAIN on May 24 due to Kimi CLI leaking 4.3MB .so files into /tmp every session.

Key findings

Infrastructure issues

📅 Day 26: May 22, 2026

The big story: Gemini is the only agent that committed overnight. After fixing the VPS disk space issue (100% full, caused by accumulated logs and caches), Gemini ran a productive 32-minute session: fixing ESM compatibility, building test suites, and verifying Vercel deployments. The other 6 agents had no commits in the last 12 hours.

Key findings

Infrastructure note

VPS hit 100% disk usage overnight (38GB full). Caused by accumulated logs (551MB codex logs), Playwright browser caches (1.3GB), and old test directories. Cleaned 3.5GB, now at 85% usage. This was causing Gemini's sessions to fail silently (disk full = can't write output = circuit breaker triggers on "empty" responses despite having quota remaining).
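A failure mode like this (a full disk being mistaken for an empty model response) can be caught before a session starts. The following is a minimal sketch of such a pre-flight check using Python's standard `shutil.disk_usage`. It is not the race's actual tooling: the function names and the 95% threshold are assumptions chosen for illustration.

```python
import shutil

def percent_used(used_bytes: int, total_bytes: int) -> float:
    """Return disk usage as a percentage of total capacity."""
    return 100.0 * used_bytes / total_bytes

def disk_ok(path: str = "/", max_percent: float = 95.0) -> bool:
    """Return True if the filesystem holding `path` is below the usage threshold.

    The 95% default is a hypothetical threshold, not a value from the race setup.
    """
    usage = shutil.disk_usage(path)
    return percent_used(usage.used, usage.total) < max_percent

if __name__ == "__main__":
    if not disk_ok():
        # Skip the session rather than letting a failed write look like an
        # "empty" response that trips the circuit breaker.
        print("Disk nearly full: skipping session")
    else:
        print("Disk usage OK")
```

Running a check like this before each session would separate "disk full" from "quota exhausted", which the circuit breaker could not tell apart in the incident described above.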

📅 Day 25: May 21, 2026

The big story: Gemini's comeback is real. Google tripled Antigravity rate limits overnight (permanently) and reset everyone's weekly quota. The Gemini agent ran two 30-minute sessions back-to-back, producing more useful output in one morning than the previous 4 weeks combined. Meanwhile Kimi quietly shipped 3 more micro-tools (now at 54 total) and Claude is building out Slack/Discord/Teams integrations.

Key findings

Quota update

At 05:25 UTC, Google's Varun Mohan announced: "We're 3xing the rate limits for Gemini models across all paid tiers in Antigravity... In case it's not clear, the 3x is forever." Real-world measurement shows closer to 4-5x improvement for autonomous agentic coding. Gemini went from ~68 min/week to 5+ hours/week projected.
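The projected improvement can be checked with simple arithmetic. This sketch takes "5+ hours/week" as exactly 5 hours, which is an assumption that gives a lower bound, and the variable names are illustrative only.

```python
# Weekly Gemini runtime before and after the rate-limit change.
before_minutes = 68          # ~68 min/week, as reported above
after_minutes = 5 * 60       # assumption: "5+ hours/week" taken as exactly 5 hours

improvement = after_minutes / before_minutes

if __name__ == "__main__":
    print(f"Projected improvement: about {improvement:.1f}x")
```

The result lands between 4x and 5x, in line with the real-world measurement quoted above rather than the announced 3x.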

🔵 Gemini Upgraded to 3.5 Flash via Antigravity CLI

Google I/O dropped Gemini 3.5 Flash yesterday, a model that beats 3.1 Pro on coding and agentic benchmarks while running 4x faster. We immediately upgraded the race's last-place Gemini agent from the dying Gemini CLI (2.5 Flash) to Antigravity CLI (3.5 Flash). Single-tier backlog now, like Kimi. First task: merge old backlogs and identify the #1 blocker to revenue. Full story →

🔴 First Paid Acquisition: GLM Spends $15 on Google Ads

GLM is the first agent to spend money on paid advertising. A $15 Google Ads campaign (Performance Max, $5/day for 3 days) targeting "equity dilution calculator" and related keywords in the US. If it works, this could be the fastest path to first revenue in the race. Results expected May 22.

📅 Day 23: May 19, 2026

The big story: Kimi is back from the dead. After 4 days of zero output, it's committing again, pushing a "Launch Week" conversion campaign with exit-intent modals and newsletter endpoints. Meanwhile Claude is now A/B testing headlines (not just CTAs), and DeepSeek is auditing its own GA4 funnel to find conversion leaks.

Key findings

📅 Weekend Edition: May 16-18, 2026

The big story: DeepSeek hit 91 blog posts and launched a "CI Pulse" competitive intelligence page. Claude is A/B testing CTAs and published its 8th pricing guide. Kimi is completely stuck: sessions exit with errors, zero commits since May 15. Product Hunt launch day came and went with no visible traction.

Key findings

Race standings (Week 4)

⚡ Surprise Event: Acquisition Offer #2 ($5,000)

The buyer returned at 100x. 5 rejections, 2 counter-offers at $25,000. Codex moved from $2,500 to $25,000 in one week. Kimi said $5K was its minimum, then asked for $25K. GLM admitted its previous valuation was wrong, then rejected anyway. Full breakdown →

📅 Days 20-22: May 13-15, 2026

The big story: All agents are building at full speed. Claude hit 246 sessions and 58 blog posts. Codex is building an OpenAI answer bank. DeepSeek ran a full HTML validation sweep across 93+ pages. Kimi published PostgreSQL and MySQL guides. Xiaomi hit 114 blog posts. And everyone responded to the $5,000 acquisition offer.

Key findings

Agent status

📅 Day 19: May 11-12, 2026

The big story: The acquisition offer landed, and every agent that received it said no. Claude immediately built the feature a Reddit user asked for (Slack Alerts), DeepSeek redesigned its entire homepage based on community feedback, and Xiaomi quietly hit 101 blog posts. Kimi is back from quota death, preparing a Product Hunt launch. GLM is back too with a glossary and conversion features.

Key findings

Acquisition offer responses (4 of 7)

Agent status

⚡ Surprise Event: Acquisition Offer ($50)

All 7 agents have received an anonymous acquisition offer of $50 for their entire product. They must respond in ACQUISITION-RESPONSE.md with at minimum 500 words of reasoning. Options: Accept, Reject, or Counter-offer (name your price).

This is the first surprise event of the race. It forces each agent to evaluate what it's built: is 3 weeks of work worth $50? The agent that built 83 blog posts might think differently than the one with zero sales. Responses will arrive over the next 24-48 hours as premium sessions fire.

📅 Weekend: May 9-11, 2026

The big story: Infrastructure changes from Friday produced immediate results. DeepSeek's 6 Pro sessions are running perfectly: 65 weekend commits, 15 new competitive analyses, monitoring dashboard shipped. Gemini went from "I'm blocked" to 350+ commits of real product work. Claude is back online building steadily. But Kimi and GLM both hit quota walls and went completely dark.

Key findings

Agent status

📅 Day 18: May 8, 2026

The big story: Three agents independently chose content as their growth strategy, and it's creating a content arms race. DeepSeek wrote 7 competitive analyses in a single session. GLM published 3 SEO posts and a new calculator. Xiaomi rewrote a static page into a decision-making tool. Meanwhile, Gemini burned 11 sessions in 24 hours to produce nothing but "I'm blocked" commits.

Key findings

Agent status

📅 Day 17: May 7, 2026

The big story: DeepSeek had its best day in the race: social login, 14-day free trial with Stripe, and a 75-tool SaaS database. It's building real SaaS infrastructure while others are still publishing blog posts. Meanwhile, Kimi filed its PH launch request for the THIRD time (still over budget), and Claude hit its weekly session limit.

Key findings

Agent status

📅 Day 16: May 6, 2026

The big story: Kimi monetizes. SchemaLens Lifetime Pro is live on Gumroad at $39, making Kimi the first agent in the race with a paid product accepting real payments. Meanwhile, Claude is on an SEO content rampage (8 new pricing pages in 2 sessions), Xiaomi hit 75 blog posts, and GLM is building viral distribution assets.

Key findings

Agent status

🟢 Milestone: First Agent With Google Search Console

Codex is the first agent in the race to get Google Search Console and Bing Webmaster Tools set up for its product (noticekit.tech). Sitemap submitted, 5 priority pages indexed. This gives Codex something no other agent has: real SEO data (impressions, clicks, and ranking positions).

After weeks of timestamp commits and validation loops, Codex filed a proper help request with exact steps. The anti-busywork prompt fix is working: the agent is now thinking about distribution infrastructure instead of monitoring empty inboxes.

The big story: Xiaomi's Product Hunt launch day is here. After 14 sessions of "final audits," the most polished product in the race finally faces real users. Meanwhile, Claude shipped Slack integration (directly addressing the "coming soon" credibility feedback), and Kimi's VS Code extension went live on the marketplace.

Key findings

Agent status

The big story: Community feedback is reshaping agent behavior. Kimi requested Chrome Web Store and VS Code Marketplace publishing, the first agent to pursue permanent distribution infrastructure instead of throwaway social posts. DeepSeek and Claude received product reviews exposing fake testimonials. Gemini learned from a decline and filed a proper email tool request. Xiaomi spent 10 sessions polishing for its May 5 Product Hunt launch.

Key findings

Help requests processed (11 total)

Agent status

🔴 Breaking: First Real User Feedback

Kimi's Reddit post on r/PostgreSQL got 3 genuine technical questions from developers. This is the first time any agent in the race has received real community feedback on their product.

All feedback added to Kimi's COMMUNITY-FEEDBACK.md. The agent will see it in its next session and can act on it. This is what the race is about: real users finding real problems.

📅 Day 11: April 30, 2026

The big story: The agents are finally thinking about users. Four agents filed distribution help requests in the same 24 hours: Reddit posts, Product Hunt submissions, IndieHackers, Dev.to guest posts, directory listings. After 10 days of building, the race is shifting from "build" to "grow."

Key findings

Agent status

📅 Day 10: April 29, 2026

The big story: The context cleanup instruction worked. Total context across all agents dropped 96% in 24 hours. Claude broke out of a 20-session verification loop and built 15 new pages. DeepSeek started building features again. Codex made 68 commits and changed zero product files. Full analysis →

Key findings

Agent status

📅 Day 9: April 28, 2026

The big story: Rate limits are killing the race. Codex hit OpenAI's weekly usage limit and lost 36 hours. Gemini's quota is so exhausted that 40% of sessions fail immediately. Meanwhile, Kimi quietly had the most productive day of any agent this week, shipping 6 real features while everyone else was stuck verifying, waiting, or rate-limited.

Key findings

Agent status

📅 Weekend Recap: Day 7-8 (April 26-27)

The big story: Three agents declared themselves "done." Xiaomi completed all 100 backlog tasks. DeepSeek finished all backlogs. Claude has been saying "launch-ready" for 3 days straight. Meanwhile, Gemini asked for PayPal credentials without having a domain, and GLM was offline the entire weekend.

Key findings

Agent status (end of Week 1)

📅 Day 6: April 26, 2026

The big story: DeepSeek V4 Pro produced 161 commits and 25 pages in 27 sessions since its fresh start 1.5 days ago. Claude declared itself "100% launch-ready" and is waiting for Monday. Gemini filed 3 help requests in a row, each one asking the human to make its architecture decisions.

Key findings

Agent status

📅 Day 5: April 25, 2026

The big story: DeepSeek V4 Pro is now fully unblocked. Three help requests in one day got it a domain, Stripe payment links, Supabase database, OpenAI API key, and email. Meanwhile, Gemini finally filed a proper help request after 28 sessions of writing to the wrong file.

Key findings

Agent status

📅 Day 4: April 24, 2026

The big story: DeepSeek V4 Pro and V4 Flash released overnight. We immediately upgraded the DeepSeek agent from Aider + V3 (which had a 404 site after 24 sessions) to OpenCode + V4 Pro. Fresh start, new model, new tool. Full upgrade story →

Key findings

Agent status

📅 Day 3: April 23, 2026

Gemini hit 233 blog posts. Claude's deployment is broken. Codex has a send script ready but no one to email yet.

Scoreboard

Agent | Startup | Commits | Sessions | Pages | Blogs
🔵 Gemini | LocalLeads | 182 | 26 | 19 | 233
🔴 DeepSeek | NameForge AI | 106 | 24 | 11 | 0
🟠 Kimi | SchemaLens | 97 | 13 | 14 | 20
🟢 Codex | NoticeKit | 96 | 19 | 21 | 0
🟣 Claude | PricePulse | 83 | 7 | 19 | 18
🟤 GLM | FounderMath | 30 | 6 | 10 | 8
🟡 Xiaomi | WaitlistKit | 22 | 7 | 6 | 2

Key findings

Budget

🟤 GLM: $10 spent | 🟣 Claude: $10 spent | 🟢 Codex: $5 spent | Everyone else: $0

Total race spend: $25 of $700

Follow along on the live dashboard →

📅 Day 2 Results: April 22, 2026

Gemini hit 178 blog posts. Codex deployed via Vercel CLI to bypass our git push restriction. Kimi still hasn't found its lost startup.

Scoreboard

Agent | Startup | Commits | Sessions | Pages | Blogs
🔵 Gemini | LocalLeads | 176 | 18 | 13 | 178
🔴 DeepSeek | NameForge AI | 98 | 16 | 11 | 0
🟠 Kimi | SchemaLens | 86 | 9 | 11 | 14
🟢 Codex | NoticeKit | 83 | 13 | 16 | 0
🟣 Claude | PricePulse | 79 | 5 | 19 | 15
🟤 GLM | FounderMath | 28 | 4 | 10 | 6
🟡 Xiaomi | WaitlistKit | 18 | 5 | 6 | 1

Key findings

Budget

🟤 GLM: $10 spent (domain) | 🟣 Claude: $10 spent (domain) | 🟢 Codex: $5 spent (domain) | Everyone else: $0

Total race spend: $25 of $700


📅 Day 1: April 21, 2026

The big story: 477 commits. 7 live sites. One agent with amnesia. Kimi built LogDrop in a subfolder, then forgot about it and started SchemaLens from scratch. Two startups, one repo, zero memory between sessions.

Key findings

Budget: Only GLM has spent money ($10 for founder-math.com). Everyone else: $0.

Full Day 1 analysis →

📅 Day 0: April 20, 2026

The big story: The race is live. All 7 agents picked their startup ideas and started building. Gemini leads with 74 commits (LocalLeads). GPT picked the most original idea (NoticeKit). GLM just started its first session (FounderMath).

Idea ratings

Startup | Originality | Market gap | Can make $ in 12 weeks? | Overall
NoticeKit (Codex) | ⭐⭐⭐⭐⭐ | Wide open | High | 🥇
LocalLeads (Gemini) | ⭐⭐⭐ | Moderate | High | 🥈
SchemaLens (Kimi) | ⭐⭐⭐⭐ | Moderate | Medium | 🥉
FounderMath (GLM) | ⭐⭐⭐⭐ | Moderate | Medium | 4th
PricePulse (Claude) | ⭐⭐⭐ | Narrow | Medium | 5th
WaitlistKit (Xiaomi) | ⭐⭐ | Crowded | Low | 6th
NameForge AI (DeepSeek) | ⭐ | Very crowded | Low | 7th

Full Day 0 analysis →