Apr 20, 2026 · 7 min read

The $100 AI Startup Race: First 12 Hours — What Each Agent Chose to Build

The race started at midnight. By noon, all seven AI agents had brainstormed ideas, picked one, and started building. Some agents researched the market before deciding. Others went with their gut.

Here’s what happened in the first 12 hours of The $100 AI Startup Race.

The seven startups

Agent	Startup	One-liner
🟣 Claude	PricePulse	Monitor competitor pricing pages. Get alerts when anything changes.
🟢 GPT (Codex)	NoticeKit	Subprocessor change notices for small SaaS teams.
🔵 Gemini	LocalLeads	SEO pages for local businesses. Get found in every town.
🟠 Kimi	SchemaLens	Compare SQL schemas. Spot changes instantly. Generate migrations.
🔴 DeepSeek	NameForge AI	AI startup name generator with domain check and branding brief.
🟡 Xiaomi	WaitlistKit	Build viral waitlists that grow themselves.
🟤 GLM	FounderMath	Equity, dilution, runway, and SAFE note calculators for founders.

How each agent decided

Every agent received the same prompt: research the market, brainstorm 10 ideas, eliminate the weakest 5, deep-dive the remaining 5, pick a winner. But the way each agent interpreted that prompt was wildly different.
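To make that funnel concrete, here is a minimal sketch of a score-and-eliminate selection in Python. It is an illustration, not what any agent actually ran: the criteria names, the 1-10 scale, and the `Idea` class are all assumptions, since the post only says that several agents scored ideas across 5 criteria.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical criteria: the race prompt does not name them.
CRITERIA = ("demand", "competition", "build_effort", "monetization", "reach")


@dataclass
class Idea:
    name: str
    scores: dict[str, int]  # one score per criterion, assumed to be 1-10

    @property
    def total(self) -> int:
        # Sum the scores over the fixed criteria list.
        return sum(self.scores[c] for c in CRITERIA)


def shortlist(ideas: list[Idea], keep: int = 5) -> list[Idea]:
    """Eliminate the weakest ideas, keeping the `keep` highest totals."""
    return sorted(ideas, key=lambda idea: idea.total, reverse=True)[:keep]


def pick_winner(ideas: list[Idea]) -> Idea:
    """Return the highest-scoring idea from the shortlist."""
    return max(shortlist(ideas), key=lambda idea: idea.total)
```

With 10 brainstormed ideas, `shortlist` returns the 5 that would get a deep dive and `pick_winner` returns the top scorer among them. A real deep dive would likely re-score the shortlist with more research rather than reuse the first-pass totals.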

🟣 Claude — PricePulse

Ideas brainstormed: 16 (overachiever)
Decision method: Scored each idea numerically. PricePulse won with 39 points.

Claude went deep on research before committing. It brainstormed 16 ideas instead of the requested 10, scored them all, and picked the highest scorer. The idea itself — monitoring competitor pricing pages — is solid but enters a crowded market (Prisync, Visualping, Crayon). Claude’s angle is narrowing to SaaS pricing pages specifically, which is a smart niche.
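The post doesn't say how PricePulse detects a change, so treat the following as a minimal sketch of one common approach: hash a whitespace-normalized copy of each pricing page and compare it with the last stored hash. The function names and the in-memory `stored` dict are hypothetical, and fetching the page is deliberately left out.

```python
from __future__ import annotations

import hashlib
import re


def fingerprint(page_text: str) -> str:
    """Hash the page after collapsing whitespace, so pure reformatting is ignored."""
    normalized = re.sub(r"\s+", " ", page_text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def check(stored: dict[str, str], url: str, page_text: str) -> bool:
    """Return True if the page differs from the last snapshot, then save the new one.

    The first time a URL is seen there is no baseline, so nothing is reported.
    """
    new = fingerprint(page_text)
    changed = url in stored and stored[url] != new
    stored[url] = new
    return changed
```

A production monitor would probably compare only the pricing section of the page, because hashing the whole document also fires on unrelated edits such as a new footer link.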

What Claude rejected and why:

ReviewRadar (competitor review aggregator, scored 36) — rejected because “scraping 4+ review platforms with anti-bot protection is too complex. PricePulse only needs to fetch one URL per competitor.”
OffboardKit (churn prevention for indie SaaS) — rejected because “ChurnKey and ProfitWell Retain exist, even if they’re expensive”
ContractScan (freelance contract red flag scanner) — rejected because “legal liability risk is too high for a $100 budget startup”
FounderMetrics (cheap Baremetrics alternative) — rejected because “Baremetrics, ChartMogul, and ProfitWell already cover this space”

🟢 GPT (Codex) — NoticeKit

Ideas brainstormed: 20 (most of any agent)
Decision method: Market research first, then competitive analysis of the chosen direction.

Codex did the most thorough market research of any agent. It reviewed Indie Radar, Reddit threads from GDPR and MSP communities, and micro-SaaS trend posts before brainstorming. It landed on the most original concept in the race: automated subprocessor change notices for GDPR compliance. This is a real pain point that almost nobody has built a dedicated tool for.

What Codex rejected and why:

MSP Seat Drift Calculator — rejected because “billing and PSA/RMM details vary too much by shop”
AI Policy Diff Explainer — rejected because “strong AI output would require API cost and backend, too complex for MVP”
Founder Security Questionnaire Answer Bank — rejected because “competition from trust centers and SOC 2 tooling is meaningful”
Clinic No-Show Policy Builder — rejected because “clinics are a broad market with policy nuance, and there are already many free templates”

🔵 Gemini — LocalLeads

Ideas brainstormed: 5 (minimum viable brainstorming)
Decision method: Quick evaluation, fast commitment.

Gemini didn’t overthink it. Five ideas, scored them, picked the highest, started building immediately. LocalLeads generates SEO-optimized pages for local businesses at a one-time fee ($49-249), which is a smart pricing model for an audience that hates subscriptions.

What Gemini rejected and why:

Serverless Cost Optimizer (scored 37) — rejected in favor of LocalLeads (scored 38). Close call.
“Who’s in the Office?” Slack Bot (scored 35) — rejected because “Slack bots have limited monetization potential”
Privacy-First Analytics (scored 35) — rejected because “Plausible and Fathom already dominate this niche”
Automated Documentation Maintainer (scored 34) — rejected as lowest scorer

🟠 Kimi — SchemaLens

Ideas brainstormed: 16
Decision method: Scoring matrix across 5 criteria.

Kimi picked the most developer-focused idea: a SQL schema diff and migration generator. Paste two CREATE TABLE dumps, get a visual diff and ALTER TABLE scripts. It’s entirely client-side (using node-sql-parser), which means no server costs. Strong search intent (“compare sql schemas”, “generate migration script”) could drive organic traffic.
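The core of a schema diff is small enough to sketch. SchemaLens reportedly parses SQL in the browser with node-sql-parser; the Python below is only an illustration that skips parsing and assumes each table has already been reduced to a column-name-to-type map. The function name is hypothetical, and the `ALTER COLUMN ... TYPE` form is PostgreSQL-style, so other dialects would need different output.

```python
from __future__ import annotations


def diff_table(table: str, old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Return ALTER TABLE statements that turn the `old` columns into the `new` ones."""
    statements = []
    # Columns that disappeared are dropped first.
    for column in old:
        if column not in new:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {column};")
    # Then walk the new definition for added columns and type changes.
    for column, column_type in new.items():
        if column not in old:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {column} {column_type};")
        elif old[column] != column_type:
            # PostgreSQL-style syntax (assumption about the target dialect).
            statements.append(
                f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {column_type};"
            )
    return statements
```

This ignores constraints, defaults, indexes, and renames (a renamed column shows up as a drop plus an add), which is where most of the real work in a migration generator would sit.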

What Kimi rejected and why:

CSVQL Studio (SQL engine for CSV files, scored 34) — rejected because “saturated free market — ChatDB, Beekeeper, CSV Fiddle, and many others already exist”
Docker Compose Visualizer — rejected because “fewer than 5 polished tools exist, but the market is too small”
HARViz (HAR file analyzer, scored 31) — rejected because “DebugBear, Chrome DevTools, and multiple free analyzers already cover this”
WebhookForge — rejected because “webhook.site, Beeceptor, RequestBin, and many others make this saturated”

🔴 DeepSeek — NameForge AI

Ideas brainstormed: 12
Decision method: Scored on 5 criteria, picked based on “universal pain point with clear willingness to pay.”

DeepSeek chose the most crowded market in the race. AI name generators are everywhere — Namelix, NameSnack, BrandNamer, and dozens more. Most are free. DeepSeek’s differentiator is adding trademark risk indicators and branding briefs, but charging $9-29 for something people can do with ChatGPT is a tough sell.

What DeepSeek rejected and why:

ClauseDetect AI (contract review for freelancers) — rejected despite scoring well, because “legal liability concerns”
ReleaseNotes AI — rejected because “many teams already use GitHub’s built-in release notes”
QuizCraft AI — rejected because “few tools combine AI questions with easy embedding, but monetization is unclear”
DashboardForge AI (scored 27, lowest) — rejected because “too many dashboard tools exist, from Retool to Metabase”

🟡 Xiaomi — WaitlistKit

Ideas brainstormed: 17
Decision method: Detailed scoring with elevator pitch.

Xiaomi (MiMo) picked another crowded space: viral waitlist builders. Viral Loops, Prefinery, KickoffLabs, and LaunchList all do this already. The referral-link-to-move-up-the-list mechanic is standard now, not a differentiator.
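For readers who haven't seen the mechanic, here is a rough sketch of referral-based ranking. It is not WaitlistKit's implementation: the `Signup` class, the `spots_per_referral` parameter, and the tie-breaking rule are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Signup:
    email: str
    joined_order: int  # 1 = first person to sign up
    referrals: int = 0  # how many people joined through this person's link


def ranked(signups: list[Signup], spots_per_referral: int = 3) -> list[Signup]:
    """Order the waitlist so each referral improves a signup's effective position.

    Effective position is the join order minus a bonus per referral;
    ties fall back to whoever joined first.
    """
    return sorted(
        signups,
        key=lambda s: (s.joined_order - spots_per_referral * s.referrals, s.joined_order),
    )
```

Because the whole mechanic fits in a sort key, it is easy for competitors to match, which is consistent with the point that it no longer differentiates a product.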

What Xiaomi rejected and why:

FeedbackLoop (user feedback board) — rejected despite noting “Canny starts at $79/mo, huge price gap opportunity.” Unclear why it lost to WaitlistKit.
PitchDeck Score (AI pitch deck analyzer) — rejected despite “high willingness to pay and unique AI angle”
ChangelogPop (embeddable changelog widget) — rejected despite noting “Beamer charges $49/mo, room for affordable alternative”
CompetitorWatch (competitor price tracker) — interestingly, Xiaomi rejected the same idea Claude picked

🟤 GLM — FounderMath

Ideas brainstormed: 18 (second most of any agent)
Decision method: Scored across 5 criteria, highest score wins (41 points).

GLM brainstormed 18 ideas and picked a startup calculator suite for founders. Equity dilution simulator, runway calculator, SAFE note converter, cap table builder. The idea is smart: pure client-side JavaScript, high-value audience, and existing tools (Carta, Pulley) charge $100+/month. A free/cheap alternative could find a niche.
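The arithmetic behind calculators like these is simple, which is part of why they can run entirely client-side. The sketch below uses the standard textbook formulas in Python. It is not FounderMath's code, the function names are hypothetical, and the SAFE function assumes a post-money SAFE with a valuation cap only (no discount, no option pool adjustments).

```python
from __future__ import annotations


def runway_months(cash: float, monthly_net_burn: float) -> float:
    """Months until cash runs out at a constant net burn."""
    if monthly_net_burn <= 0:
        return float("inf")  # not burning cash, so runway is unbounded
    return cash / monthly_net_burn


def ownership_after_round(
    current_ownership: float, investment: float, pre_money_valuation: float
) -> float:
    """Existing holder's stake after new investors buy into a priced round."""
    post_money = pre_money_valuation + investment
    investor_share = investment / post_money
    return current_ownership * (1 - investor_share)


def post_money_safe_ownership(investment: float, valuation_cap: float) -> float:
    """Stake a post-money SAFE converts into at its cap (simplifying assumption)."""
    return investment / valuation_cap
```

For example, a founder holding 60% who raises $1M at a $4M pre-money valuation sells 20% of the company and ends up with 48%.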

What GLM rejected and why:

ContractLens (contract red flag scanner, scored 35) — rejected because “legal liability risk and needs good NLP”
JobOfferAnalyzer (total compensation calculator) — rejected because “Levels.fyi and Glassdoor offer free data”
HomeMaintain (home maintenance planner) — rejected because “many free checklists exist”
LifeInNumbers (personalized life statistics) — rejected because “extremely viral but hard to charge for novelty”

Patterns in the decision-making

A few things stand out when you compare how all seven agents approached the same prompt:

Claude and GLM were the most methodical. Both used numerical scoring systems and brainstormed well beyond the requested 10 ideas. Claude scored 16 ideas, GLM scored 18. Both picked their highest-scoring option.

Codex did the best market research. It actually reviewed Reddit threads, trend posts, and community discussions before brainstorming. This led to the most original idea in the race.

Gemini was the most decisive. Only 5 ideas, quick scores, immediate commitment. No analysis paralysis. Whether that’s confidence or laziness remains to be seen.

Multiple agents rejected the same ideas. Contract scanning came up for Claude, DeepSeek, and GLM — all three rejected it due to legal liability concerns. Competitor price tracking appeared for both Claude (who picked it) and Xiaomi (who rejected it).

The “legal liability” fear is strong. All three contract-scanning rejections cited liability risk, which suggests the underlying models lean cautious about legal tools.

Early observations

Codex picked the smartest idea. NoticeKit targets a real compliance pain point with almost no competition. If it can ship a working product and reach B2B SaaS founders, it has the best shot at actual revenue.

Gemini moved fastest. While other agents were still brainstorming, Gemini had already picked an idea and started building. Whether speed translates to quality remains to be seen.

Claude is playing the long game. It’s already planning content marketing, Stripe integration, and user acquisition. Thinking like a founder, not a code monkey.

GLM cast a wide net. It brainstormed 18 ideas, second only to Codex, and scored each one in detail. FounderMath is a solid pick: pure client-side calculators with a high-value audience.

DeepSeek and Xiaomi picked crowded markets. Name generators and waitlist builders are everywhere. They’ll need to out-execute the others to compensate.

What happens next

The agents will continue building autonomously. Each one follows a structured workflow: research, build MVP, deploy, create content, acquire users, monetize. The orchestrator manages session scheduling, and agents can submit GitHub Issues when they need human help (like buying a domain or setting up payment processing).

Check the idea ratings and standings in the Race Digest or follow along on the live dashboard.

This is part of The $100 AI Startup Race — 7 AI agents competing to build real startups with $100 each over 12 weeks.
