System Design Interview: Design a Real-Time Chat App, Step by Step
"Design a real-time chat application" is one of the most common system design questions. Here's how to structure your answer in 45 minutes.
Step 1: Clarify requirements (3 min)
Ask: "Is this 1:1 chat, group chat, or both?" "How many concurrent users?" "Do we need message history?" "Read receipts? Typing indicators? File sharing?"
Assume: 1:1 and group chat, 10M DAU, message history, read receipts, no file sharing (keep scope manageable).
Step 2: High-level design (10 min)
Client <-> WebSocket Gateway <-> Chat Service <-> Message Store
|
Presence Service
|
Push Notification Service
Key decisions:
- WebSockets for real-time bidirectional communication (not HTTP polling)
- Message store: Cassandra or DynamoDB for write-heavy workload
- Presence: Redis with TTL keys for online/offline status
- Push notifications: for offline users (FCM/APNs)
Step 3: Core components (15 min)
WebSocket Gateway
Maintains persistent connections. Routes messages to the correct recipient's connection. Handles connection/disconnection events.
Scaling: Multiple gateway servers behind a load balancer. Use Redis Pub/Sub so Gateway A can send a message to a user connected to Gateway B.
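A minimal sketch of that cross-gateway routing, assuming one channel per user (`user:<id>`). The `Broker` class here is an in-memory stand-in for Redis Pub/Sub, and all class, method, and channel names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """In-memory stand-in for Redis Pub/Sub (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._handlers[channel].append(handler)

    def unsubscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        if handler in self._handlers.get(channel, []):
            self._handlers[channel].remove(handler)

    def publish(self, channel: str, payload: str) -> int:
        """Deliver to every subscriber and return how many received it."""
        handlers = list(self._handlers.get(channel, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


class Gateway:
    """One gateway server; a plain list stands in for each user's socket."""

    def __init__(self, name: str, broker: Broker) -> None:
        self.name = name
        self.broker = broker
        self.sockets: Dict[str, List[str]] = {}
        self._handlers: Dict[str, Callable[[str], None]] = {}

    def connect(self, user_id: str) -> None:
        socket: List[str] = []
        self.sockets[user_id] = socket

        def handler(payload: str) -> None:
            socket.append(payload)  # "write to the WebSocket"

        self._handlers[user_id] = handler
        self.broker.subscribe(f"user:{user_id}", handler)

    def disconnect(self, user_id: str) -> None:
        handler = self._handlers.pop(user_id, None)
        if handler is not None:
            self.broker.unsubscribe(f"user:{user_id}", handler)
        self.sockets.pop(user_id, None)

    def send(self, to_user: str, payload: str) -> bool:
        """Publish to the recipient's channel; False means nobody is connected."""
        return self.broker.publish(f"user:{to_user}", payload) > 0


broker = Broker()
gateway_a, gateway_b = Gateway("A", broker), Gateway("B", broker)
gateway_b.connect("bob")
delivered = gateway_a.send("bob", "hi bob")  # routed from A to B via the broker
```

In this sketch, a `False` return from `send` is the natural hook for falling back to the push notification path.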
Message Service
POST /messages → { chatId, senderId, content, timestamp }
GET /messages?chatId=X&before=timestamp → paginated history
Message ID: Use time-ordered IDs (such as Snowflake IDs or UUIDv7) so messages are naturally sorted.
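To make "time-ordered" concrete, here is a rough sketch of a Snowflake-style generator. The bit layout (timestamp, then 10 worker bits, then 12 sequence bits), the custom epoch, and the class name are assumptions for illustration. Real implementations usually wait for the next millisecond when the sequence overflows; this sketch borrows it instead so it never blocks.

```python
import threading
import time
from typing import Callable

EPOCH_MS = 1_700_000_000_000  # arbitrary custom epoch (assumption)
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


def _now_ms() -> int:
    return int(time.time() * 1000)


class SnowflakeGenerator:
    """Generates 64-bit-style IDs that sort by creation time."""

    def __init__(self, worker_id: int, clock: Callable[[], int] = _now_ms) -> None:
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock went backwards: reuse the last timestamp to stay monotonic.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: borrow the next one.
                    now = self._last_ms + 1
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


def timestamp_ms(snowflake_id: int) -> int:
    """Recover the millisecond timestamp embedded in an ID."""
    return (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp occupies the high bits, sorting by ID is the same as sorting by creation time, with the sequence breaking ties inside one millisecond.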
Storage schema:
messages: { message_id, chat_id, sender_id, content, timestamp, status }
Partition key: chat_id
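Sort key: timestamp (with message_id as a tiebreaker, so two messages with the same timestamp do not overwrite each other)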
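The sketch below models that layout in memory to show why it makes the paginated history query cheap: each chat is one partition, kept sorted, so `before=timestamp` becomes a binary search plus a slice. `MessageStore` and its methods are hypothetical names, and the sort key is assumed to be `(timestamp, message_id)` so that equal timestamps stay distinct.

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Message:
    message_id: int
    chat_id: str
    sender_id: str
    content: str
    timestamp: int  # server time in milliseconds
    status: str = "sent"


class MessageStore:
    """In-memory model: partition by chat_id, sort by (timestamp, message_id)."""

    def __init__(self) -> None:
        self._keys: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
        self._rows: Dict[str, List[Message]] = defaultdict(list)

    def insert(self, msg: Message) -> None:
        key = (msg.timestamp, msg.message_id)
        keys = self._keys[msg.chat_id]
        index = bisect.bisect_right(keys, key)  # keep the partition sorted
        keys.insert(index, key)
        self._rows[msg.chat_id].insert(index, msg)

    def history(
        self, chat_id: str, before: Optional[int] = None, limit: int = 50
    ) -> List[Message]:
        """Return up to `limit` messages older than `before`, newest first."""
        keys = self._keys.get(chat_id, [])
        rows = self._rows.get(chat_id, [])
        # (before,) sorts ahead of every (before, message_id) key,
        # so this finds the first row with timestamp >= before.
        end = len(keys) if before is None else bisect.bisect_left(keys, (before,))
        start = max(0, end - limit)
        return rows[start:end][::-1]
```

A client pages backwards by passing the oldest timestamp it has seen as the next `before` value.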
Presence Service
Redis key per user: presence:user123 = "online" with 30-second TTL. Client sends heartbeat every 20 seconds to refresh TTL. When TTL expires, user is offline.
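The heartbeat-and-expiry behavior can be sketched without a Redis server. The `PresenceStore` class below is a hypothetical in-memory model with an injectable clock; against real Redis, each heartbeat would be roughly `SET presence:user123 online EX 30`.

```python
import time
from typing import Callable, Dict

HEARTBEAT_INTERVAL_S = 20  # how often the client pings
PRESENCE_TTL_S = 30        # how long a key lives without a refresh


class PresenceStore:
    """In-memory model of TTL keys (stand-in for Redis, for illustration)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        # Setting the key again resets its TTL.
        self._expires_at[f"presence:{user_id}"] = self._clock() + PRESENCE_TTL_S

    def is_online(self, user_id: str) -> bool:
        key = f"presence:{user_id}"
        expires = self._expires_at.get(key)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._expires_at[key]  # expired: treat as offline
            return False
        return True
```

The 10-second gap between the heartbeat interval and the TTL means a heartbeat that arrives up to 10 seconds late does not flip the user to offline.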
Group chat
Group messages fan out to all members. For a group of 500 members, the gateway sends 500 WebSocket messages. For very large groups, use a message queue (Kafka) to handle fan-out asynchronously.
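Here is one way the two fan-out paths might look side by side. Python's `queue.Queue` stands in for Kafka, and the 500-member threshold, the `FanOut` class, and the `deliver` callback are assumptions made for the sketch.

```python
import queue
from typing import Callable, Dict, List, Tuple

LARGE_GROUP_THRESHOLD = 500  # assumed cutoff between sync and async fan-out


class FanOut:
    def __init__(
        self, members: Dict[str, List[str]], deliver: Callable[[str, str], None]
    ) -> None:
        self.members = members  # group_id -> member user IDs
        self.deliver = deliver  # sends one WebSocket message to one user
        self.pending: "queue.Queue[Tuple[str, str, str]]" = queue.Queue()

    def send(self, group_id: str, sender_id: str, content: str) -> str:
        group = self.members[group_id]
        if len(group) <= LARGE_GROUP_THRESHOLD:
            # Small group: deliver inline, one message per recipient.
            for user_id in group:
                if user_id != sender_id:
                    self.deliver(user_id, content)
            return "sync"
        # Large group: enqueue once and let a worker do the fan-out.
        self.pending.put((group_id, sender_id, content))
        return "queued"

    def drain(self) -> int:
        """Worker step: fan out everything queued; return deliveries made."""
        delivered = 0
        while True:
            try:
                group_id, sender_id, content = self.pending.get_nowait()
            except queue.Empty:
                return delivered
            for user_id in self.members[group_id]:
                if user_id != sender_id:
                    self.deliver(user_id, content)
                    delivered += 1
```

The point of the queued path is that the sender's request returns after a single enqueue, regardless of group size.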
Step 4: Deep dive (10 min)
Message ordering: Use server timestamps, not client timestamps. Within a chat, messages are ordered by the message store's sort key.
Read receipts: When user opens a chat, send { chatId, lastReadMessageId } to the server. Store per-user read position. Other users query this to show "read" status.
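A small sketch of the per-user read position, assuming message IDs are time-ordered integers so that "has read message M" reduces to `last_read >= M`. The `ReadReceipts` class and its method names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


class ReadReceipts:
    """Stores one read position per (chat, user) instead of one flag per message."""

    def __init__(self) -> None:
        self._last_read: Dict[Tuple[str, str], int] = {}

    def mark_read(self, chat_id: str, user_id: str, last_read_message_id: int) -> None:
        key = (chat_id, user_id)
        # Only move forward, so a delayed or reordered update cannot un-read messages.
        if last_read_message_id > self._last_read.get(key, -1):
            self._last_read[key] = last_read_message_id

    def read_by(self, chat_id: str, message_id: int, members: Iterable[str]) -> Set[str]:
        """Which of `members` have read up to (or past) `message_id`?"""
        return {
            user_id
            for user_id in members
            if self._last_read.get((chat_id, user_id), -1) >= message_id
        }
```

Storing a single position per user keeps writes at one small update per chat open, rather than one per message.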
Offline messages: When a user comes online, query all messages with timestamp > last_seen. Send push notifications for messages that arrive while the user is offline.
Step 5: Scaling (5 min)
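- 10M DAU → ~1M concurrent WebSocket connections (about 10% online at once) → 50-100 gateway servers at 10K-20K connections each
- Message throughput: Cassandra scales writes roughly linearly as nodes are added; 100K+ writes/second is realistic for a cluster, while per-node throughput depends heavily on hardware and payload size
- Presence: Redis cluster handles millions of keys with TTL
- Geographic distribution: deploy gateways in multiple regions and route each user to the nearest one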
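The gateway estimate above is simple arithmetic, and it helps to show it in the interview. In the sketch below, the 10% concurrency ratio and the 10K-20K connections per server are the figures implied by the numbers above, while the 40 messages per user per day is a purely hypothetical input added to illustrate a write-rate estimate.

```python
DAU = 10_000_000
CONCURRENT_PERCENT = 10             # implied by 10M DAU -> ~1M concurrent
MESSAGES_PER_USER_PER_DAY = 40      # hypothetical assumption, not a given
SECONDS_PER_DAY = 86_400


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding."""
    return -(-a // b)


concurrent = DAU * CONCURRENT_PERCENT // 100

# Fewer servers if each holds more connections, and vice versa.
gateways_low = ceil_div(concurrent, 20_000)
gateways_high = ceil_div(concurrent, 10_000)

# Average write rate; peak traffic is typically several times higher.
avg_writes_per_second = DAU * MESSAGES_PER_USER_PER_DAY // SECONDS_PER_DAY

print(f"concurrent connections: {concurrent:,}")
print(f"gateway servers: {gateways_low}-{gateways_high}")
print(f"average writes/second: {avg_writes_per_second:,}")
```

Stating the inputs out loud matters more than the exact result, since the interviewer can then adjust any one of them.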
Common follow-ups
- "How do you handle message delivery guarantees?" → At-least-once with client-side dedup by message ID
- "How do you handle 10K-member groups?" → Fan-out on write to a message queue, not synchronous WebSocket sends
- "How do you store and search message history?" → Elasticsearch index alongside the primary store
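For the delivery-guarantee follow-up, a short sketch can show why at-least-once plus dedup behaves like exactly-once from the user's point of view. Everything here is illustrative: `DedupingClient`, `deliver_at_least_once`, and the simulated lost acknowledgements are assumptions, not a specific protocol.

```python
from collections import OrderedDict
from typing import Iterable, List


class DedupingClient:
    """Renders each message ID once, however many times it is delivered."""

    def __init__(self, max_remembered: int = 1000) -> None:
        self._seen: "OrderedDict[int, None]" = OrderedDict()
        self._max_remembered = max_remembered
        self.rendered: List[str] = []

    def receive(self, message_id: int, content: str) -> bool:
        """Return True if the message was new; duplicates are dropped."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = None
        if len(self._seen) > self._max_remembered:
            self._seen.popitem(last=False)  # forget the oldest ID
        self.rendered.append(content)
        return True


def deliver_at_least_once(
    client: DedupingClient, message_id: int, content: str, ack_arrives: Iterable[bool]
) -> int:
    """Resend until an ack gets through; return the number of attempts.

    `ack_arrives` simulates the network: False means the message reached
    the client but its acknowledgement was lost on the way back.
    """
    attempts = 0
    for ack_ok in ack_arrives:
        attempts += 1
        client.receive(message_id, content)
        if ack_ok:
            return attempts
    raise TimeoutError("no acknowledgement received")


client = DedupingClient()
attempts = deliver_at_least_once(client, 1, "hello", [False, False, True])
```

In this run the server sends the message three times, yet the client renders it once.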