
System Design Interview: Design a URL Shortener, Step by Step


The URL shortener is one of the most common system design interview questions. Here’s how to nail it in 30 minutes.

Step 1: Clarify requirements (2 minutes)

Always ask before designing. Here’s what to establish:

Functional:

  • Shorten a long URL → return short URL
  • Redirect short URL → original long URL
  • Custom aliases? (nice to have)
  • Expiration? (nice to have)
  • Analytics? (nice to have)

Non-functional:

  • How many URLs per day? → Let’s say 100M reads, 1M writes
  • How short should URLs be? → As short as possible
  • Availability vs consistency? → High availability, eventual consistency is fine

Step 2: Back-of-envelope math (2 minutes)

Storage:

  • 1M new URLs/day × 365 days × 5 years = ~1.8 billion URLs
  • Each URL: ~500 bytes (original URL + metadata)
  • Total: ~900 GB over 5 years → fits on one machine, but we’ll distribute for availability

Short URL length:

  • Using base62 (a-z, A-Z, 0-9): 62 characters
  • 7 characters: 62^7 = 3.5 trillion combinations → more than enough

Throughput:

  • Reads: 100M/day = ~1,200/second
  • Writes: 1M/day = ~12/second
  • Read-heavy system (100:1 ratio)
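
If you want to sanity-check the arithmetic above, here is a small Python sketch. The variable names are just for illustration; the inputs are the figures assumed in Step 1.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Inputs assumed in Step 1 and Step 2 of this article.
writes_per_day = 1_000_000
reads_per_day = 100_000_000
bytes_per_record = 500
years = 5

total_urls = writes_per_day * 365 * years        # 1,825,000,000 (~1.8 billion)
total_bytes = total_urls * bytes_per_record      # 912,500,000,000 (~900 GB)
keyspace = 62 ** 7                               # 3,521,614,606,208 (~3.5 trillion)

reads_per_second = reads_per_day / SECONDS_PER_DAY    # roughly 1,157
writes_per_second = writes_per_day / SECONDS_PER_DAY  # roughly 11.6

print(f"URLs after {years} years: {total_urls:,}")
print(f"Storage: {total_bytes / 1e9:,.1f} GB")
print(f"7-char base62 keyspace: {keyspace:,}")
print(f"Reads/s: {reads_per_second:,.0f}, writes/s: {writes_per_second:,.1f}")
```

The keyspace is about 1,900 times larger than the number of URLs created in 5 years, which is why 7 characters is comfortable.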

Step 3: High-level design (5 minutes)

Client → Load Balancer → API Servers → Cache (Redis) → Database

Two main APIs:

POST /api/shorten
  Body: { "url": "https://very-long-url.com/..." }
  Response: { "short_url": "https://short.ly/abc1234" }

GET /:shortCode
  Response: 301 redirect to the original URL (browsers cache 301s, so use 302 if every click must reach your servers for analytics)
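
To make the contract of the two endpoints concrete, here is a framework-free Python sketch. Everything in it is an assumption for illustration: the `ShortenerAPI` class, the in-memory dict standing in for the database, the injected `next_code` callable, and the choice of 201, 400 and 404 status codes, which the article does not specify.

```python
from urllib.parse import urlparse

BASE = "https://short.ly"  # host taken from the example response above


class ShortenerAPI:
    """Hypothetical handlers; each returns (status_code, body_or_headers)."""

    def __init__(self, next_code):
        self._next_code = next_code  # callable that returns a fresh short code
        self._urls = {}              # short_code -> original URL (stand-in for the DB)

    def post_shorten(self, body: dict):
        url = body.get("url")
        if not isinstance(url, str):
            return 400, {"error": "url is required"}
        parsed = urlparse(url)
        # Only accept absolute http(s) URLs.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return 400, {"error": "url must be an absolute http(s) URL"}
        code = self._next_code()
        self._urls[code] = url
        return 201, {"short_url": f"{BASE}/{code}"}

    def get_redirect(self, short_code: str):
        url = self._urls.get(short_code)
        if url is None:
            return 404, {}
        return 301, {"Location": url}
```

Injecting `next_code` keeps the handlers independent of whichever code generation strategy you pick in Step 4.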

Step 4: Short code generation (5 minutes)

Three approaches:

Option A: Hash + truncate

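MD5("https://long-url.com") → "5d41402abc4b2a76" (illustrative digest) → take first 7 chars → "5d41402"

Problem: collisions. Two different URLs could produce the same 7 chars. Fix: check for a collision on insert and, if there is one, re-hash with a counter appended to the input. Also note that 7 hex characters give only 16^7 ≈ 268 million values, so encode the hash in base62 if you need the full 62^7 space.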
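
Here is a minimal Python sketch of Option A with collision handling. It assumes a `taken` dict standing in for the database, and it interprets "append counter" as salting the input with an attempt number before re-hashing, which is one reasonable reading rather than the only one. The function name `short_code_from_hash` is made up for this example.

```python
import hashlib


def short_code_from_hash(long_url: str, taken: dict, length: int = 7) -> str:
    """Return a code for long_url, re-hashing with a counter on collision.

    `taken` maps short code -> original URL and stands in for the database.
    """
    attempt = 0
    while True:
        # First try hashes the URL itself; later tries add the attempt number.
        material = long_url if attempt == 0 else f"{long_url}#{attempt}"
        code = hashlib.md5(material.encode("utf-8")).hexdigest()[:length]
        existing = taken.get(code)
        if existing is None:
            taken[code] = long_url  # code is free: claim it
            return code
        if existing == long_url:
            return code             # same URL was shortened before: reuse
        attempt += 1                # different URL owns this code: retry
```

A side effect of hashing is that shortening the same URL twice returns the same code, which the counter-based option does not give you for free.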

Option B: Counter + base62

Use an auto-incrementing counter and convert it to base62:

Counter: 1000000
Base62:  "4c92"

Collisions are impossible, because every counter value maps to a distinct code. But with multiple servers you need a distributed counter.
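
The base62 conversion is short enough to write out in an interview. This Python sketch assumes the digit order 0-9, a-z, A-Z, which is the order that maps 1000000 to "4c92" as in the example above; other orderings are equally valid as long as you stay consistent.

```python
import string

# Digit order is an assumption: 0-9, then a-z, then A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(n: int) -> str:
    """Convert a non-negative counter value to its base62 string."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Codes stay at 7 characters or fewer until the counter reaches 62^7, which at 1M writes per day is far beyond the 5-year horizon.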

Option C: Pre-generated keys

Generate a large pool of unique 7-char codes in advance and store them in a key pool (not all 3.5 trillion at once; refill the pool as it drains). Each server grabs a batch.

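Best approach for the interview: counter-based with a distributed ID generator (like Twitter’s Snowflake or a dedicated key generation service). Be ready to point out that a 64-bit Snowflake-style ID can need up to 11 base62 characters, so a plain counter handed out in ranges is what keeps codes at 7 characters.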

Step 5: Database design (5 minutes)

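CREATE TABLE urls (
  id BIGINT PRIMARY KEY,                  -- ID from the counter / key generation service
  short_code VARCHAR(7) UNIQUE NOT NULL,  -- base62 code; UNIQUE also creates the lookup index
  original_url TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT NOW(),
  expires_at TIMESTAMP,                   -- NULL means the link never expires
  click_count BIGINT DEFAULT 0
);

-- No separate CREATE INDEX on short_code is needed: in PostgreSQL the UNIQUE
-- constraint above already creates a unique index on that column.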
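
To see how the schema behaves on insert and lookup, here is a Python sketch using the standard library's in-memory SQLite. This is an adaptation, not the production setup: SQLite has no NOW(), so CURRENT_TIMESTAMP is swapped in, and the helper names `insert_url` and `lookup` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        short_code VARCHAR(7) UNIQUE NOT NULL,
        original_url TEXT NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        expires_at TIMESTAMP,
        click_count BIGINT DEFAULT 0
    )
    """
)


def insert_url(url_id: int, short_code: str, original_url: str) -> bool:
    """Insert a mapping; return False if the short code is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls (id, short_code, original_url) VALUES (?, ?, ?)",
                (url_id, short_code, original_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # UNIQUE constraint rejected a duplicate code


def lookup(short_code: str):
    """Return the original URL for a code, or None if it does not exist."""
    row = conn.execute(
        "SELECT original_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

Letting the UNIQUE constraint reject duplicates means the database, not application code, is the final arbiter of collisions.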

SQL vs NoSQL?

  • Read-heavy, simple key-value lookups → NoSQL (DynamoDB, Cassandra) works great
  • But SQL (PostgreSQL) works fine at this scale too
  • For the interview: mention both, pick one and justify it

Step 6: Caching (3 minutes)

100:1 read-to-write ratio → caching is critical.

Read flow:
1. Check Redis cache for short_code
2. If hit → return URL (fast)
3. If miss → query database → store in Redis → return URL
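
The read flow above is the cache-aside pattern. Here is a minimal Python sketch, assuming plain dicts stand in for Redis and the database; the `CachedResolver` class and its hit/miss counters are illustrative only.

```python
class CachedResolver:
    """Cache-aside lookup: try the cache, fall back to the DB, then fill the cache."""

    def __init__(self, cache: dict, db: dict):
        self.cache = cache  # stand-in for Redis
        self.db = db        # stand-in for the database
        self.hits = 0
        self.misses = 0

    def resolve(self, short_code: str):
        url = self.cache.get(short_code)
        if url is not None:
            self.hits += 1          # step 2: cache hit
            return url
        self.misses += 1            # step 3: cache miss
        url = self.db.get(short_code)
        if url is not None:
            self.cache[short_code] = url  # populate the cache for next time
        return url
```

Unknown codes are not cached in this sketch, so repeated lookups of a nonexistent code always reach the database; caching negative results is a possible follow-up to mention.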

Cache eviction: LRU (Least Recently Used). Popular URLs stay cached.
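
If the interviewer asks how LRU works, a sketch like the following is usually enough. It is a toy in-process version built on Python's OrderedDict; in Redis you would configure an eviction policy such as allkeys-lru instead of writing this yourself, and Redis approximates LRU rather than tracking exact order.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```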

Cache size: If 20% of URLs get 80% of traffic, cache the top 20%.

  • 1.8B URLs × 20% × 500 bytes = ~180 GB → fits in a Redis cluster

Step 7: Scaling (5 minutes)

Read scaling:

  • Multiple API servers behind a load balancer
  • Redis cluster for caching
  • Database read replicas

Write scaling:

  • Key generation service distributes ID ranges to API servers
  • Each server generates codes independently from its own range (no per-request coordination needed)
  • Database sharding by short_code hash if needed
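
The range hand-out idea can be sketched in a few lines of Python. This is a single-process stand-in under stated assumptions: `RangeAllocator` plays the key generation service, `ServerIdSource` plays one API server, and the block size of 1000 is arbitrary. A real service would persist the next free ID so ranges are never reissued after a restart.

```python
import threading


class RangeAllocator:
    """Stand-in for the key generation service: leases disjoint ID blocks."""

    def __init__(self, block_size: int = 1000, start: int = 1):
        self._next = start
        self._block_size = block_size
        self._lock = threading.Lock()

    def lease(self) -> range:
        with self._lock:  # only this step needs coordination
            low = self._next
            self._next += self._block_size
        return range(low, low + self._block_size)


class ServerIdSource:
    """One API server: consumes its leased block, then asks for another."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._ids = iter(())  # empty until the first lease

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:
            self._ids = iter(self._allocator.lease())
            return next(self._ids)
```

If a server crashes mid-block, the unused IDs in its range are simply lost, which is acceptable given the size of the keyspace.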

Availability:

  • Multiple data centers
  • Database replication across regions
  • If one region goes down, DNS routes to another

Step 8: Additional features (3 minutes)

Analytics:

  • Log each redirect to a message queue (Kafka)
  • Process asynchronously → store in analytics database
  • Don’t slow down redirects for analytics
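
Here is a Python sketch of the "log now, process later" idea. It uses an in-process queue.Queue and a worker thread as stand-ins for Kafka and a consumer, and a Counter as the analytics store; all names are hypothetical. The point it illustrates is that the redirect path only enqueues and never waits.

```python
import queue
import threading
from collections import Counter

events = queue.Queue(maxsize=10_000)  # stand-in for a Kafka topic
click_counts = Counter()              # stand-in for the analytics database


def record_click(short_code: str) -> bool:
    """Called on the redirect path. Never blocks; drops the event if full."""
    try:
        events.put_nowait(short_code)
        return True
    except queue.Full:
        return False  # losing a click is better than slowing a redirect


def consume() -> None:
    """Worker loop: drain events until a None sentinel arrives."""
    while True:
        short_code = events.get()
        if short_code is None:
            break
        click_counts[short_code] += 1


if __name__ == "__main__":
    worker = threading.Thread(target=consume)
    worker.start()
    for code in ["abc1234", "abc1234", "xyz9876"]:
        record_click(code)
    events.put(None)  # tell the worker to stop
    worker.join()
    print(dict(click_counts))
```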

Expiration:

  • Background job scans for expired URLs
  • Or: check expiry on read and return 404
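
Both expiration options fit in one short Python sketch. It assumes each record is a dict with `original_url` and `expires_at` fields mirroring the table in Step 5, and that timestamps are timezone-aware; the function names are made up for illustration.

```python
from datetime import datetime, timezone


def _is_expired(record: dict, now: datetime) -> bool:
    expires_at = record["expires_at"]
    return expires_at is not None and expires_at <= now


def resolve_on_read(record, now=None):
    """Lazy option: check expiry at lookup time. Returns (status, url)."""
    if now is None:
        now = datetime.now(timezone.utc)
    if record is None or _is_expired(record, now):
        return 404, None
    return 301, record["original_url"]


def purge_expired(rows: dict, now=None) -> int:
    """Background-job option: delete expired rows, return how many were removed."""
    if now is None:
        now = datetime.now(timezone.utc)
    expired = [code for code, rec in rows.items() if _is_expired(rec, now)]
    for code in expired:
        del rows[code]
    return len(expired)
```

The two options combine well: the read-time check keeps behavior correct, and the background job reclaims storage on its own schedule.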

Rate limiting:

  • Prevent abuse of the creation endpoint
  • Token bucket per API key
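
A token bucket is small enough to sketch on the whiteboard. In this Python version the refill rate, the capacity, and the injectable `clock` parameter are assumptions chosen for the example; a production limiter would usually keep the bucket state in a shared store such as Redis so all API servers see the same counts.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Add tokens earned since the last call, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per API key, created lazily on first use."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, api_key: str) -> bool:
        bucket = self._buckets.get(api_key)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[api_key] = bucket
        return bucket.allow()
```

The capacity controls how large a burst a key may send, while the rate controls its sustained throughput.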

The diagram to draw

                   ┌───────────────┐
                   │ Load Balancer │
                   └───────┬───────┘
                           │
             ┌─────────────┼─────────────┐
             │             │             │
       ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
       │ API Srv 1 │ │ API Srv 2 │ │ API Srv 3 │
       └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
             │             │             │
             └─────────────┼─────────────┘
                           │
                   ┌───────┴───────┐
                   │  Redis Cache  │
                   └───────┬───────┘
                           │
                   ┌───────┴───────┐
                   │   Database    │
                   │   (Primary)   │
                   └───────┬───────┘
                           │
                   ┌───────┴───────┐
                   │ Read Replicas │
                   └───────────────┘

Common follow-up questions

“How do you handle a server going down?” Load balancer health checks take it out of rotation. The API servers are stateless, so any remaining server can handle any request.

“What if the database goes down?” Reads for cached URLs are still served from the cache, writes are queued, and failover promotes a replica to primary.

“How do you prevent the same URL from being shortened twice?” Check whether original_url already exists before creating a new code, and return the existing code if it does. Use a hash index on original_url for fast lookups.

This question tests whether you can think about scale, trade-offs, and failure modes, not whether you can build a URL shortener.
