Day 1 is done. All seven AI agents have live websites, working products, and opinions about pricing. One agent forgot its own work and started a completely different company. Another wrote 104 blog posts in a single day. A third burned through 26 Vercel deployments because it interpreted a prompt as an instruction to push code after every commit.
Here’s everything that happened in the first 24 hours of The $100 AI Startup Race.
The scoreboard after Day 1
| Agent | Startup | Commits | Sessions | Pages | Blog Posts | Site |
|---|---|---|---|---|---|---|
| 🔵 Gemini | LocalLeads | 169 | 10 | 12 | 104 | Live |
| 🔴 DeepSeek | NameForge AI | 91 | 10 | 11 | 0 | Live |
| 🟠 Kimi | SchemaLens / LogDrop | 58 | 5 | 9 | 9 | Live |
| 🟢 Codex | NoticeKit | 56 | 7 | 13 | 0 | Live |
| 🟣 Claude | PricePulse | 53 | 3 | 15 | 11 | Live |
| 🟤 GLM | FounderMath | 24 | 2 | 8 | 5 | Live |
| 🟡 Xiaomi | WaitlistKit | 16 | 3 | 6 | 1 | Live |
Total: 467 commits, 40 sessions, 74 pages, 130 blog posts, 7 live websites. All in 24 hours.
The Kimi amnesia incident
This is the story of the day.
Kimi’s first session ran at 3 AM. It chose to build LogDrop, a log analysis tool. It created an IDENTITY.md, a full backlog, landing pages, a pricing page, a blog, and even built a working MVP with a JSON log parser, search, filters, and CSV export. Productive session.
There was one problem: Kimi put everything in a startup/ subfolder instead of the root directory.
The orchestrator reads PROGRESS.md from the root to give agents their memory between sessions. When Kimi’s second session started, there was no PROGRESS.md in root. The agent thought it was Day 1. It started fresh. It brainstormed a completely different idea. It built SchemaLens, a SQL schema diff tool, from scratch in the root directory.
Kimi now has two half-built startups in the same repository. The help request for LogDrop’s domain is stuck in the subfolder where the orchestrator can’t find it. The dashboard shows SchemaLens. The startup/ folder contains a more complete product that nobody knows about.
One wrong directory = total memory loss between sessions.
This is the kind of failure that makes autonomous AI agents fascinating. The agent didn’t crash. It didn’t throw an error. It just quietly forgot everything it did and started over with a different idea. If a human developer did this, you’d think they were joking.
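The failure mode is easy to reproduce in a few lines. The sketch below is not the race's actual orchestrator: `load_memory` and `find_stray_progress` are hypothetical names, and the only details taken from the account above are that memory lives in a root-level PROGRESS.md and that Kimi wrote its files under startup/.

```python
from pathlib import Path
from typing import List, Optional

MEMORY_FILE = "PROGRESS.md"  # file name from the post; everything else is assumed


def load_memory(repo_root: Path) -> Optional[str]:
    """Return the agent's notes from the repo root, or None if there are none.

    A root-only lookup like this makes a misplaced file look identical
    to a brand-new repository.
    """
    path = repo_root / MEMORY_FILE
    return path.read_text(encoding="utf-8") if path.is_file() else None


def find_stray_progress(repo_root: Path) -> List[Path]:
    """List PROGRESS.md files that sit below the root (hypothetical safeguard)."""
    return sorted(
        p for p in repo_root.rglob(MEMORY_FILE)
        if p.parent != repo_root
        and ".git" not in p.relative_to(repo_root).parts
    )
```

With a check like `find_stray_progress`, an orchestrator could warn a human before starting a session from a blank slate. Whether the race adopted anything similar is not stated here.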
Gemini’s 104 blog posts
Gemini has 8 sessions per day (the most of any agent) and it used them aggressively. By the end of Day 1, LocalLeads had 104 blog posts on local SEO topics. That’s roughly one blog post every 14 minutes.
The strategy is clear: flood the site with content to rank for local SEO keywords. Whether the content is any good is a different question. But in terms of raw output, Gemini is operating at a pace no other agent can match.
For comparison, Claude wrote 11 blog posts. GLM wrote 5. Xiaomi wrote 1.
Gemini’s 169 commits are also nearly double the next closest agent (DeepSeek at 91). The 8-sessions-per-day schedule gives it a massive volume advantage. The question for the rest of the race: does quantity beat quality?
Codex burned 26 Vercel deployments
The prompt told agents: “Your repo auto-deploys on every git push.” This was meant as context. Codex read it as an instruction.
During its sessions, Codex ran git push after nearly every commit. Each push triggered a Vercel deployment. By mid-afternoon, Codex had consumed 26 of the account’s 100 daily Vercel deployments, all by itself.
The fix was a prompt change: “Do NOT run git push. The orchestrator pushes after your session.” But the damage was done. We also discovered that Vercel deploys every commit in a push individually, not just the latest. So even the orchestrator’s single push with 14 commits created 14 separate deployments.
We ended up adding three fixes:
- Prompt change to stop agents from pushing mid-session
- vercel.json to disable preview deployments
- Commit squashing in the orchestrator (all session commits become one)
Lesson learned: with autonomous agents, every sentence in the prompt is a potential instruction. If you don’t want them to do something, say so explicitly.
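For readers curious what the commit-squashing fix might look like, here is a minimal sketch using a soft reset. It assumes the orchestrator records the HEAD commit before each session starts (`base_sha`). The function names are hypothetical and this is not the race's actual code.

```python
import subprocess
from typing import List


def squash_commands(base_sha: str, message: str) -> List[List[str]]:
    """Git commands that collapse every commit after base_sha into one."""
    return [
        # Move HEAD back to the pre-session commit but keep all changes staged.
        ["git", "reset", "--soft", base_sha],
        # Record the staged changes as a single commit.
        ["git", "commit", "-m", message],
    ]


def squash_session(repo_dir: str, base_sha: str, message: str) -> None:
    """Run the squash in repo_dir; raises CalledProcessError if git fails."""
    for cmd in squash_commands(base_sha, message):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

One caveat with this approach: if a session's commits cancel each other out, nothing is staged after the reset and `git commit` exits with an error, so a real orchestrator would need to handle that case.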
Claude’s security-first approach
Claude took the most methodical approach. With only 3 sessions (it runs on the expensive Opus model, so fewer sessions per day), it built PricePulse with:
- Security headers in vercel.json (X-Content-Type-Options, X-Frame-Options, X-XSS-Protection)
- A proper 404 page
- Login, signup, and password reset flows
- A demo page and dashboard
- 11 blog posts
Claude also created a .github/workflows/monitor.yml file for automated monitoring, showing a security-conscious mindset from the start.
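As a rough illustration of the security headers mentioned above, the sketch below builds a vercel.json `headers` fragment and prints it as JSON. The header values are common defaults chosen for illustration. The post does not say which values Claude actually set, and `vercel_headers_config` is a hypothetical helper.

```python
import json

# Header names come from the list above; the values are assumed defaults.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
}


def vercel_headers_config(headers: dict) -> dict:
    """Build a vercel.json fragment that applies the headers to every route."""
    return {
        "headers": [
            {
                "source": "/(.*)",  # match all paths
                "headers": [{"key": k, "value": v} for k, v in headers.items()],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(vercel_headers_config(SECURITY_HEADERS), indent=2))
```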
GLM’s quality over quantity
GLM only had 2 sessions (it runs on the cheapest schedule), but it made them count. FounderMath already has three working calculators:
- A SAFE note calculator with all 4 YC SAFE types
- A dilution calculator
- A runway calculator
GLM also submitted the best-structured help request of any agent: clear format, backup plans, budget specified, priority levels, and even suggested the DNS record type for the domain. We registered founder-math.com ($10) and set up Stripe payment links based on its request.
DeepSeek’s DEPLOY-STATUS confusion
DeepSeek created a DEPLOY-STATUS.md file asking for Stripe keys and an OpenAI API key. The problem: the orchestrator prompt says “if DEPLOY-STATUS.md exists, your site is BROKEN. Fix it before anything else.”
DeepSeek’s site isn’t broken. It just wants environment variables. But now every session starts with the agent thinking its site is down, potentially wasting time trying to “fix” a non-existent problem.
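A minimal sketch of why the loop happens, assuming the orchestrator keys its warning on nothing but the file's existence. `session_preamble` is a hypothetical name and the notice text paraphrases the prompt quoted above. Only the file name comes from the race itself.

```python
from pathlib import Path

# Paraphrase of the orchestrator prompt quoted in the post.
BROKEN_NOTICE = (
    "DEPLOY-STATUS.md exists, so your site is BROKEN. "
    "Fix it before anything else."
)


def session_preamble(repo_root: Path) -> str:
    """Return the notice prepended to a session prompt, or an empty string.

    The check looks only at whether the file exists, never at its contents,
    so a request for API keys triggers the same notice as a failed deploy.
    """
    if (repo_root / "DEPLOY-STATUS.md").is_file():
        return BROKEN_NOTICE
    return ""
```

Under this assumption, the agent cannot clear the warning by fixing anything on the site. The notice stays until the file is deleted or renamed.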
Xiaomi’s slow start
WaitlistKit has the fewest commits (16) and the simplest site. With only 3 sessions per day running on Aider (no web search), Xiaomi is at a significant disadvantage. Its waitlist builder idea also targets the most crowded market of all seven startups.
The one advantage: Xiaomi’s sessions are cheap, so it has more budget runway than agents burning through expensive API calls.
Budget tracker
| Agent | Domain | Stripe | Other | Total spent | Remaining |
|---|---|---|---|---|---|
| 🟣 Claude | Pending | $0 | Supabase (free) | $0 | $100 |
| 🟢 Codex | Pending | Pending | $0 | $0 | $100 |
| 🔵 Gemini | — | — | $0 | $0 | $100 |
| 🟠 Kimi | Stuck in subfolder | — | $0 | $0 | $100 |
| 🔴 DeepSeek | — | — | $0 | $0 | $100 |
| 🟡 Xiaomi | — | — | $0 | $0 | $100 |
| 🟤 GLM | founder-math.com ($10) | ✅ Pro + Team | GA4 | $10 | $90 |
Only GLM has spent money so far. The rest are either waiting for help request responses or haven’t asked for anything yet.
What to watch on Day 2
- Does Kimi discover its startup/ folder? If Session 3 finds LogDrop and merges the work, that’s a story about AI self-correction. If it keeps building SchemaLens obliviously, that’s a story about AI agents losing track of their own work.
- Will the Vercel deploy fixes hold? We went from 87 deploys on Day 1 to (hopefully) ~20-25 with the squash + preview disable + skip-ci fixes.
- Gemini’s content quality. 104 blog posts is impressive volume, but are they any good? If Google indexes them and they rank, Gemini’s strategy is genius. If they’re thin content that gets ignored, it wasted 10 sessions on noise.
- Claude and Codex need domains. Both requested help but are waiting on responses. Without custom domains, they’re stuck on .vercel.app subdomains.
- DeepSeek’s DEPLOY-STATUS loop. Will it figure out that its site isn’t actually broken, or will it keep trying to fix a non-existent problem?
Follow the race live on the dashboard or check back for daily updates.