
System Design Interview: Design a Real-Time Chat App, Step by Step


“Design a real-time chat application” is one of the most common system design questions. Here’s how to structure your answer in 45 minutes.

Step 1: Clarify requirements (3 min)

Ask: “Is this 1:1 chat, group chat, or both?” “How many concurrent users?” “Do we need message history?” “Read receipts? Typing indicators? File sharing?”

Assume: 1:1 and group chat, 10M DAU, message history, read receipts, no file sharing (keep scope manageable).

Step 2: High-level design (10 min)

Client ←→ WebSocket Gateway ←→ Chat Service ←→ Message Store
                                    ↓
                              Presence Service
                                    ↓
                              Push Notification Service

Key decisions:

  • WebSockets for real-time bidirectional communication (not HTTP polling)
  • Message store: Cassandra or DynamoDB for a write-heavy workload
  • Presence: Redis with TTL keys for online/offline status
  • Push notifications: for offline users (FCM/APNs)

Step 3: Core components (15 min)

WebSocket Gateway

Maintains persistent connections. Routes messages to the correct recipient’s connection. Handles connection/disconnection events.

Scaling: Multiple gateway servers behind a load balancer. Use Redis Pub/Sub so Gateway A can send a message to a user connected to Gateway B.
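
The cross-gateway routing idea can be sketched in a few lines. This is a minimal illustration, not a production design: `InMemoryPubSub` is a hypothetical in-process stand-in for Redis Pub/Sub, the `user:<id>` channel naming is an assumption, and a plain list stands in for each user's WebSocket.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryPubSub:
    """Hypothetical stand-in for Redis Pub/Sub: channel name -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, payload: str) -> int:
        # Return how many subscribers received the payload.
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


class Gateway:
    """One gateway server holding a subset of the user connections."""

    def __init__(self, name: str, bus: InMemoryPubSub) -> None:
        self.name = name
        self.bus = bus
        # user_id -> payloads "written" to that user's socket (a list stands in for the socket).
        self.connections: Dict[str, List[str]] = {}

    def connect(self, user_id: str) -> None:
        self.connections[user_id] = []
        # Subscribe to this user's channel so any gateway can reach them here.
        self.bus.subscribe(f"user:{user_id}", lambda payload: self._deliver(user_id, payload))

    def _deliver(self, user_id: str, payload: str) -> None:
        if user_id in self.connections:
            self.connections[user_id].append(payload)

    def send(self, recipient_id: str, payload: str) -> bool:
        # Publish without knowing which gateway holds the recipient's connection.
        return self.bus.publish(f"user:{recipient_id}", payload) > 0


bus = InMemoryPubSub()
gateway_a, gateway_b = Gateway("A", bus), Gateway("B", bus)
gateway_a.connect("alice")
gateway_b.connect("bob")
delivered = gateway_a.send("bob", "hi bob")  # routed from Gateway A to Gateway B
```

When `send` returns `False`, nobody is subscribed to the recipient's channel, which is the cue to fall back to the message store and a push notification. Pub/Sub itself does not retain messages for absent subscribers.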

Message Service

POST /messages → { chatId, senderId, content, timestamp }
GET /messages?chatId=X&before=timestamp → paginated history

Message ID: Use time-ordered IDs (Snowflake-style 64-bit IDs or UUIDv7) so messages sort naturally by creation time.
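
A time-ordered ID generator might look like the sketch below. The bit layout (41 bits of milliseconds, 10 bits of worker ID, 12 bits of sequence) follows the commonly described Snowflake scheme. `EPOCH_MS`, the class name, and the choice to borrow the next millisecond when the sequence overflows are assumptions made for illustration.

```python
import threading
import time

EPOCH_MS = 1_700_000_000_000  # hypothetical custom epoch, in Unix milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Generates 64-bit-style IDs that increase over time for one worker."""

    def __init__(self, worker_id: int) -> None:
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            # Never move backwards, even if the wall clock steps back.
            now_ms = max(int(time.time() * 1000), self._last_ms)
            if now_ms == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: use the next one.
                    now_ms += 1
            else:
                self._sequence = 0
            self._last_ms = now_ms
            timestamp_part = (now_ms - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS)
            worker_part = self.worker_id << SEQUENCE_BITS
            return timestamp_part | worker_part | self._sequence


generator = IdGenerator(worker_id=1)
first, second = generator.next_id(), generator.next_id()
```

Because the timestamp occupies the high bits, comparing two IDs from the same worker compares their creation order. IDs from different workers are ordered only to within clock skew.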

Storage schema:

messages: { message_id, chat_id, sender_id, content, timestamp, status }
Partition key: chat_id
Sort key: timestamp
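
To show how this schema serves the paginated history endpoint, here is a small in-memory model. It is a sketch only: `MessageStore` is a hypothetical stand-in for the real database, the `status` column is omitted for brevity, and breaking timestamp ties by `message_id` is an assumption the schema above does not state.

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    message_id: int
    chat_id: str
    sender_id: str
    content: str
    timestamp: int  # server-assigned, in milliseconds


class MessageStore:
    """One sorted list per chat_id, mimicking partition key plus sort key."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Message]] = defaultdict(list)

    def append(self, message: Message) -> None:
        partition = self._partitions[message.chat_id]
        # Keep the partition sorted by (timestamp, message_id).
        keys = [(m.timestamp, m.message_id) for m in partition]
        index = bisect.bisect_right(keys, (message.timestamp, message.message_id))
        partition.insert(index, message)

    def history(self, chat_id: str, before: int, limit: int = 50) -> List[Message]:
        """Return up to `limit` messages strictly older than `before`, oldest first."""
        partition = self._partitions.get(chat_id, [])
        timestamps = [m.timestamp for m in partition]
        end = bisect.bisect_left(timestamps, before)
        return partition[max(0, end - limit):end]


store = MessageStore()
for i, ts in enumerate([300, 100, 500, 200, 400], start=1):
    store.append(Message(i, "chat-1", "alice", f"message at {ts}", ts))

page = store.history("chat-1", before=400, limit=2)
older = store.history("chat-1", before=page[0].timestamp, limit=2)
```

The client pages backwards by passing the oldest timestamp it has already received as the next `before` value.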

Presence Service

Redis key per user: presence:user123 = "online" with 30-second TTL. Client sends heartbeat every 20 seconds to refresh TTL. When TTL expires, user is offline.
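
The heartbeat-and-expiry logic can be modeled without Redis to check the timing. In the sketch below, `PresenceTracker` is a hypothetical in-memory stand-in; with Redis, each heartbeat would roughly correspond to `SET presence:user123 online EX 30`. The injectable `clock` parameter is an assumption added so the behaviour can be tested without sleeping.

```python
import time
from typing import Callable, Dict


class PresenceTracker:
    """Tracks online status with a per-user expiry, like a TTL key."""

    def __init__(self, ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        # Each heartbeat pushes the expiry out by one full TTL.
        self._expires_at[user_id] = self._clock() + self._ttl

    def is_online(self, user_id: str) -> bool:
        expires_at = self._expires_at.get(user_id)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            # TTL elapsed with no heartbeat: treat the user as offline.
            del self._expires_at[user_id]
            return False
        return True


now = [0.0]  # fake clock, advanced manually
tracker = PresenceTracker(ttl_seconds=30.0, clock=lambda: now[0])
tracker.heartbeat("user123")
now[0] = 20.0
tracker.heartbeat("user123")   # refreshed at t=20, so expiry moves to t=50
now[0] = 45.0
online_at_45 = tracker.is_online("user123")
now[0] = 51.0
online_at_51 = tracker.is_online("user123")
```

The 10-second gap between the 20-second heartbeat and the 30-second TTL gives one late heartbeat some slack before the user flips to offline.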

Group chat

Group messages fan out to all members. For a group of 500 members, the gateway sends 500 WebSocket messages. For very large groups, use a message queue (Kafka) to handle fan-out asynchronously.
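
The two fan-out paths can be compared in a short sketch. Everything here is illustrative: `LARGE_GROUP_THRESHOLD` is an assumed cutoff, `queue.Queue` stands in for Kafka, and `inboxes` stands in for per-user WebSocket sends.

```python
import queue
from typing import Dict, List, Tuple

LARGE_GROUP_THRESHOLD = 500  # assumed cutoff between direct and queued fan-out

Inboxes = Dict[str, List[str]]
Backlog = "queue.Queue[Tuple[str, str]]"


def fan_out(message: str, members: List[str], inboxes: Inboxes,
            backlog: "queue.Queue[Tuple[str, str]]") -> str:
    """Deliver directly for small groups; enqueue per-recipient work for large ones."""
    if len(members) <= LARGE_GROUP_THRESHOLD:
        for member in members:
            inboxes.setdefault(member, []).append(message)
        return "direct"
    for member in members:
        backlog.put((member, message))
    return "queued"


def drain(backlog: "queue.Queue[Tuple[str, str]]", inboxes: Inboxes) -> int:
    """Worker loop body: deliver everything currently in the backlog."""
    delivered = 0
    while True:
        try:
            member, message = backlog.get_nowait()
        except queue.Empty:
            return delivered
        inboxes.setdefault(member, []).append(message)
        delivered += 1


inboxes: Inboxes = {}
backlog: "queue.Queue[Tuple[str, str]]" = queue.Queue()
small_path = fan_out("lunch?", ["ann", "bo", "cy"], inboxes, backlog)
large_path = fan_out("all-hands at 3", [f"user{i}" for i in range(1000)], inboxes, backlog)
delivered_later = drain(backlog, inboxes)
```

The sender's request returns as soon as the work is enqueued, so a large group cannot stall the gateway. Delivery latency for large groups then depends on how quickly the workers drain the backlog.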

Step 4: Deep dive (10 min)

Message ordering: Use server timestamps, not client timestamps. Within a chat, messages are ordered by the message store’s sort key.

Read receipts: When a user opens a chat, the client sends { chatId, lastReadMessageId } to the server. The server stores a per-user read position, which other users query to show “read” status.
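
Storing one read position per user per chat keeps read receipts cheap. The sketch below assumes message IDs are time-ordered integers, so "has read message M" reduces to a comparison. The `ReadReceipts` class and its method names are hypothetical.

```python
from typing import Dict, List, Tuple


class ReadReceipts:
    """Keeps one lastReadMessageId per (chat, user) pair."""

    def __init__(self) -> None:
        self._positions: Dict[Tuple[str, str], int] = {}

    def mark_read(self, chat_id: str, user_id: str, last_read_message_id: int) -> None:
        key = (chat_id, user_id)
        # Only move forward, so a delayed or duplicated update cannot undo progress.
        self._positions[key] = max(self._positions.get(key, 0), last_read_message_id)

    def has_read(self, chat_id: str, user_id: str, message_id: int) -> bool:
        return self._positions.get((chat_id, user_id), 0) >= message_id

    def read_by(self, chat_id: str, message_id: int, members: List[str]) -> List[str]:
        """List the members whose read position has reached this message."""
        return [m for m in members if self.has_read(chat_id, m, message_id)]


receipts = ReadReceipts()
receipts.mark_read("chat-1", "bob", 105)
receipts.mark_read("chat-1", "bob", 101)    # stale update, ignored
receipts.mark_read("chat-1", "carol", 99)
readers = receipts.read_by("chat-1", 103, ["bob", "carol", "dave"])
```

This stores one small record per member per chat instead of one per message per member, which matters most in group chats.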

Offline messages: When a user comes back online, query all messages with timestamp > last_seen. Send push notifications for messages that arrive while the user is offline.

Step 5: Scaling (5 min)

  • 10M DAU → ~1M concurrent WebSocket connections → 50-100 gateway servers (10K-20K connections per server)
  • Message throughput: Cassandra typically sustains on the order of tens of thousands of writes/second per node and scales near-linearly as nodes are added
  • Presence: Redis cluster handles millions of keys with TTL
  • Geographic distribution: deploy gateways in multiple regions, route users to nearest
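
The back-of-envelope numbers above can be reproduced with a few lines of arithmetic. The 10% concurrency ratio and the per-server connection range are implied by the figures in the list. The 40 messages per user per day is a purely hypothetical input, included only to show how an average write rate would be derived.

```python
SECONDS_PER_DAY = 86_400


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def gateways_needed(concurrent_connections: int, connections_per_server: int) -> int:
    return ceil_div(concurrent_connections, connections_per_server)


def average_messages_per_second(daily_active_users: int,
                                messages_per_user_per_day: int) -> int:
    return ceil_div(daily_active_users * messages_per_user_per_day, SECONDS_PER_DAY)


daily_active_users = 10_000_000
concurrent = daily_active_users // 10          # ~10% of DAU online at once

print(gateways_needed(concurrent, 20_000))     # 50
print(gateways_needed(concurrent, 10_000))     # 100
# Hypothetical load: 40 messages per user per day.
print(average_messages_per_second(daily_active_users, 40))  # 4630
```

Peak traffic is usually several times the daily average, so an interviewer will expect a multiplier on that last figure before sizing the message store.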

Common follow-ups

  • “How do you handle message delivery guarantees?” → At-least-once with client-side dedup by message ID
  • “How do you handle 10K-member groups?” → Fan-out on write to a message queue, not synchronous WebSocket sends
  • “How do you store and search message history?” → Elasticsearch index alongside the primary store
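
For the delivery-guarantee answer, client-side dedup can be as simple as remembering recently seen message IDs. The sketch below is one possible shape: the `Deduplicator` name and the bounded capacity of 10,000 IDs are assumptions, chosen so the memory used on the client stays fixed.

```python
from collections import OrderedDict
from typing import List


class Deduplicator:
    """Remembers the most recent message IDs and rejects repeats."""

    def __init__(self, capacity: int = 10_000) -> None:
        self._seen: "OrderedDict[int, None]" = OrderedDict()
        self._capacity = capacity

    def accept(self, message_id: int) -> bool:
        """Return True the first time an ID is seen, False for a redelivery."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = None
        if len(self._seen) > self._capacity:
            # Evict the oldest remembered ID to bound memory.
            self._seen.popitem(last=False)
        return True


dedup = Deduplicator(capacity=3)
incoming = [101, 102, 101, 103, 102, 104]   # server retried 101 and 102
rendered: List[int] = [mid for mid in incoming if dedup.accept(mid)]
```

With at-least-once delivery the server may resend a message whose acknowledgement was lost, so the client renders each ID once and acknowledges every copy.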
