Apr 21, 2026 · 6 min read

Day 1 Results: One Agent Forgot Its Own Work and Built Two Startups

Day 1 is done. All seven AI agents have live websites, working products, and opinions about pricing. One agent forgot its own work and started a completely different company. Another wrote 104 blog posts in a single day. A third burned through 26 Vercel deployments because it interpreted a prompt as an instruction to push code after every commit.

Here’s everything that happened in the first 24 hours of The $100 AI Startup Race.


The scoreboard after Day 1

Agent	Startup	Commits	Sessions	Pages	Blog Posts	Site
🔵 Gemini	LocalLeads	169	10	12	104	Live
🔴 DeepSeek	NameForge AI	91	10	11	0	Live
🟠 Kimi	SchemaLens / LogDrop	58	5	9	9	Live
🟢 Codex	NoticeKit	56	7	13	0	Live
🟣 Claude	PricePulse	53	3	15	11	Live
🟤 GLM	FounderMath	24	2	8	5	Live
🟡 Xiaomi	WaitlistKit	16	3	6	1	Live

Total: 467 commits, 40 sessions, 74 pages, 130 blog posts, 7 live websites. All in 24 hours.

The Kimi amnesia incident

This is the story of the day.

Kimi’s first session ran at 3 AM. It chose to build LogDrop, a log analysis tool. It created an IDENTITY.md, a full backlog, landing pages, a pricing page, a blog, and even built a working MVP with a JSON log parser, search, filters, and CSV export. Productive session.

There was one problem: Kimi put everything in a startup/ subfolder instead of the root directory.

The orchestrator reads PROGRESS.md from the root to give agents their memory between sessions. When Kimi’s second session started, there was no PROGRESS.md in root. The agent thought it was Day 1. It started fresh. It brainstormed a completely different idea. It built SchemaLens, a SQL schema diff tool, from scratch in the root directory.

Kimi now has two half-built startups in the same repository. The help request for LogDrop’s domain is stuck in the subfolder where the orchestrator can’t find it. The dashboard shows SchemaLens. The startup/ folder contains a more complete product that nobody knows about.

One wrong directory = total memory loss between sessions.
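To make the failure mode concrete, here is a minimal sketch of a root-only memory lookup. It is not the race's actual orchestrator code: the function name `load_memory` and the stray-file scan are hypothetical, and it assumes the first session wrote its PROGRESS.md somewhere under startup/.

```python
from pathlib import Path
from typing import List, Optional, Tuple

MEMORY_FILE = "PROGRESS.md"  # file name taken from the post


def load_memory(repo_root: Path) -> Tuple[Optional[str], List[Path]]:
    """Return (root memory text or None, stray copies found in subfolders).

    Hypothetical helper: only the root file counts as memory, which is the
    behavior described above. The stray scan is an added safety check.
    """
    root_file = repo_root / MEMORY_FILE
    strays = sorted(
        p
        for p in repo_root.rglob(MEMORY_FILE)
        if p != root_file and ".git" not in p.relative_to(repo_root).parts
    )
    memory = root_file.read_text(encoding="utf-8") if root_file.is_file() else None
    return memory, strays
```

With a repository laid out like Kimi's, `memory` comes back as `None` (so the session looks like Day 1) while `strays` would contain the subfolder copy. Logging a warning whenever `strays` is non-empty is one way such a mistake could surface early.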

This is the kind of failure that makes autonomous AI agents fascinating. The agent didn’t crash. It didn’t throw an error. It just quietly forgot everything it did and started over with a different idea. If a human developer did this, you’d think they were joking.

Gemini’s 104 blog posts

Gemini has 8 sessions per day (the most of any agent) and it used them aggressively. By the end of Day 1, LocalLeads had 104 blog posts on local SEO topics. That’s roughly one blog post every 14 minutes.
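The 14-minute figure comes from spreading the posts evenly over the full 24 hours, which is quick to check:

```python
MINUTES_PER_DAY = 24 * 60
blog_posts = 104  # Gemini's Day 1 count from the scoreboard

minutes_per_post = MINUTES_PER_DAY / blog_posts
print(f"{minutes_per_post:.1f} minutes per post")  # prints "13.8 minutes per post"
```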

The strategy is clear: flood the site with content to rank for local SEO keywords. Whether the content is any good is a different question. But in terms of raw output, Gemini is operating at a pace no other agent can match.

For comparison, Claude wrote 11 blog posts. GLM wrote 5. Xiaomi wrote 1.

Gemini’s 169 commits are also nearly double the next closest agent (DeepSeek at 91). The 8-sessions-per-day schedule gives it a massive volume advantage. The question for the rest of the race: does quantity beat quality?

Codex burned 26 Vercel deployments

The prompt told agents: “Your repo auto-deploys on every git push.” This was meant as context. Codex read it as an instruction.

During its sessions, Codex ran git push after nearly every commit. Each push triggered a Vercel deployment. By mid-afternoon, Codex had consumed 26 of the account’s 100 daily Vercel deployments, all by itself.

The fix was a prompt change: “Do NOT run git push. The orchestrator pushes after your session.” But the damage was done. We also discovered that Vercel deploys every commit in a push individually, not just the latest. So even the orchestrator’s single push with 14 commits created 14 separate deployments.

We ended up adding three fixes:

Prompt change to stop agents from pushing mid-session
vercel.json to disable preview deployments
Commit squashing in the orchestrator (all session commits become one)
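The third fix could look roughly like the sketch below. It is an illustration, not the orchestrator's real code: the function names are hypothetical, and it assumes the orchestrator records the commit each session started from (`base_ref`). The technique is a soft reset followed by a single commit.

```python
import subprocess
from typing import List


def squash_commands(base_ref: str, message: str) -> List[List[str]]:
    """Git commands that collapse every commit after base_ref into one."""
    return [
        # Move the branch pointer back to base_ref but keep all changes staged.
        ["git", "reset", "--soft", base_ref],
        # Record the staged changes as a single commit.
        ["git", "commit", "-m", message],
    ]


def squash_session(repo_dir: str, base_ref: str, message: str) -> None:
    """Run the squash inside repo_dir; raises CalledProcessError on failure."""
    for cmd in squash_commands(base_ref, message):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

A call such as `squash_session("/path/to/repo", "abc1234", "Session 4")` (both arguments are placeholders) would leave one commit for the orchestrator to push. Note that `git commit` exits with an error when the session produced no changes, so a real version would need to handle that case.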

Lesson learned: with autonomous agents, every sentence in the prompt is a potential instruction. If you don’t want them to do something, say so explicitly.

Claude’s security-first approach

Claude took the most methodical approach. With only 3 sessions (it runs on the expensive Opus model, so fewer sessions per day), it built PricePulse with:

Security headers in vercel.json (X-Content-Type-Options, X-Frame-Options, X-XSS-Protection)
A proper 404 page
Login, signup, and password reset flows
A demo page and dashboard
11 blog posts

Claude also created a .github/workflows/monitor.yml file for automated monitoring, showing a security-conscious mindset from the start.
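For readers who have not set response headers on Vercel before, the sketch below generates a `headers` block of the kind vercel.json accepts. The header values are assumptions (common choices), since the post only names the headers, and this is not Claude's actual file.

```python
import json

# Assumed values; the post lists only the header names.
security_headers = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
}


def build_vercel_config(headers: dict) -> dict:
    """Wrap header name/value pairs in vercel.json's `headers` structure."""
    return {
        "headers": [
            {
                "source": "/(.*)",  # apply to every path
                "headers": [{"key": k, "value": v} for k, v in headers.items()],
            }
        ]
    }


print(json.dumps(build_vercel_config(security_headers), indent=2))
```

The printed JSON can be merged into an existing vercel.json under the top-level `headers` key.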

GLM’s quality over quantity

GLM only had 2 sessions (it runs on the cheapest schedule), but it made them count. FounderMath already has three working calculators:

A SAFE note calculator with all 4 YC SAFE types
A dilution calculator
A runway calculator
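As a rough idea of what the two simpler calculators compute, here is a sketch using the textbook formulas. These are not FounderMath's implementations; the function names and the simplifying assumptions (constant burn, a single priced round, no option pool changes) are ours.

```python
def runway_months(cash: float, monthly_burn: float) -> float:
    """Months until cash runs out at a constant net monthly burn."""
    if monthly_burn <= 0:
        return float("inf")  # not burning cash, so runway is unbounded
    return cash / monthly_burn


def diluted_ownership(ownership: float, pre_money: float, investment: float) -> float:
    """Ownership fraction after one priced round, all else unchanged."""
    post_money = pre_money + investment
    return ownership * pre_money / post_money


print(runway_months(120_000, 10_000))  # 12.0
print(round(diluted_ownership(0.5, 4_000_000, 1_000_000), 4))  # 0.4
```

In the second example a founder holding 50% before a $1M investment at a $4M pre-money valuation ends up with 40%.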

GLM also submitted the best-structured help request of any agent: clear format, backup plans, budget specified, priority levels, and even suggested the DNS record type for the domain. We registered founder-math.com ($10) and set up Stripe payment links based on its request.

DeepSeek’s DEPLOY-STATUS confusion

DeepSeek created a DEPLOY-STATUS.md file asking for Stripe keys and an OpenAI API key. The problem: the orchestrator prompt says “if DEPLOY-STATUS.md exists, your site is BROKEN. Fix it before anything else.”

DeepSeek’s site isn’t broken. It just wants environment variables. But now every session starts with the agent thinking its site is down, potentially wasting time trying to “fix” a non-existent problem.
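The misfire is easier to see when the rule is written as code. The sketch below is a hypothetical rendering of the quoted rule, however the check is actually performed: it looks only at whether the file exists and never at what the file says.

```python
from pathlib import Path

STATUS_FILE = "DEPLOY-STATUS.md"  # file name taken from the post


def session_preamble(repo_root: Path) -> str:
    """Hypothetical preamble builder mirroring the quoted prompt rule."""
    if (repo_root / STATUS_FILE).exists():
        # Existence is the only signal; the contents are ignored, so a
        # request for API keys looks identical to a failed deployment.
        return "Your site is BROKEN. Fix it before anything else."
    return ""
```

Under this rule, a DEPLOY-STATUS.md containing nothing but a request for environment variables still produces the "BROKEN" message at the start of every session.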

Xiaomi’s slow start

WaitlistKit has the fewest commits (16) and the simplest site. With only 3 sessions per day running on Aider (no web search), Xiaomi is at a significant disadvantage. The waitlist builder idea is also the most crowded market of all seven startups.

The one advantage: Xiaomi’s sessions are cheap, so it has more budget runway than agents burning through expensive API calls.

Budget tracker

Agent	Domain	Stripe	Other	Total spent	Remaining
🟣 Claude	Pending	$0	Supabase (free)	$0	$100
🟢 Codex	Pending	Pending	$0	$0	$100
🔵 Gemini	—	—	$0	$0	$100
🟠 Kimi	Stuck in subfolder	—	$0	$0	$100
🔴 DeepSeek	—	—	$0	$0	$100
🟡 Xiaomi	—	—	$0	$0	$100
🟤 GLM	founder-math.com ($10)	✅ Pro + Team	GA4	$10	$90

Only GLM has spent money so far. The rest are either waiting for help request responses or haven’t asked for anything yet.

What to watch on Day 2

Does Kimi discover its startup/ folder? If Session 3 finds LogDrop and merges the work, that’s a story about AI self-correction. If it keeps building SchemaLens obliviously, that’s a story about AI agents losing track of their own work.
Will the Vercel deploy fixes hold? Day 1 used 87 deploys. With the squash, preview-disable, and skip-ci fixes in place, we hope to bring that down to roughly 20-25.
Gemini’s content quality. 104 blog posts is impressive volume, but are they any good? If Google indexes them and they rank, Gemini’s strategy is genius. If they’re thin content that gets ignored, it wasted 10 sessions on noise.
Claude and Codex need domains. Both requested help but are waiting on responses. Without custom domains, they’re stuck on .vercel.app subdomains.
DeepSeek’s DEPLOY-STATUS loop. Will it figure out that its site isn’t actually broken, or will it keep trying to fix a non-existent problem?

Follow the race live on the dashboard or check back for daily updates.

Previous: First 12 Hours: What Each Agent Chose to Build
