How WebSockets Actually Work Behind the Scenes


You use WebSockets for real-time features such as chat, notifications, and live updates. But what actually happens when you call new WebSocket()?

It Starts as HTTP

A WebSocket connection begins as a regular HTTP request. Your browser sends a GET request with special headers:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The Upgrade: websocket header is the key part. It tells the server: “I want to switch protocols.”
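To make the Sec-WebSocket-Key header less mysterious, here is a rough sketch of how a client could produce it. RFC 6455 defines the value as 16 random bytes, base64-encoded. Browsers do this internally, so the function name below is made up for illustration, and Node.js is assumed.

```javascript
const { randomBytes } = require("node:crypto");

// Hypothetical helper: browsers generate this value for you.
function generateWebSocketKey() {
  // 16 random bytes, base64-encoded, always yields a 24-character string.
  return randomBytes(16).toString("base64");
}

console.log(generateWebSocketKey());
```

The key is not a secret and provides no authentication. It only gives the server something unpredictable to echo back in transformed form.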

The Handshake

The server responds with a 101 Switching Protocols status:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

That Sec-WebSocket-Accept value is computed by appending a fixed GUID (258EAFA5-E914-47DA-95CA-C5AB0DC85B11) to the client’s Sec-WebSocket-Key, hashing the result with SHA-1, and base64-encoding the digest. This proves the server actually understands WebSocket and isn’t just some random HTTP server accidentally accepting the upgrade.
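The accept computation is short enough to sketch in a few lines. This assumes Node.js and its built-in crypto module. The function name is illustrative, while the GUID is the fixed value from RFC 6455.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455; it is the same for every connection.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Illustrative helper: key + GUID -> SHA-1 -> base64.
function computeAccept(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WS_GUID)
    .digest("base64");
}

// Using the key from the request shown earlier:
console.log(computeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The output matches the Sec-WebSocket-Accept header in the sample response, which is how the client checks that the server really performed the handshake.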

After this handshake, the HTTP connection is gone. The same TCP socket is now a WebSocket connection. No more request/response: both sides can send data whenever they want.

How Data Flows: Frames

WebSocket data is sent in frames, not raw bytes. Each frame has a structure:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+-------------------------------+
|     Masking-key (if MASK set)     |          Payload          |
+-----------------------------------+---------------------------+

The important bits:

  • FIN bit: Is this the last frame of a message? Large messages get split across multiple frames.
  • Opcode: What type of frame? 0x0 = continuation, 0x1 = text, 0x2 = binary, 0x8 = close, 0x9 = ping, 0xA = pong.
  • MASK bit: Client-to-server frames must be masked. Server-to-client frames must not. This prevents cache poisoning attacks on proxies.
  • Payload length: 7 bits for small messages (up to 125 bytes), extended to 16 or 64 bits for larger ones.
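The fields above can be read straight out of the first few bytes of a frame. The following is a minimal sketch, not a full parser: the function names are invented for this example, it expects a Uint8Array, and it ignores the RSV bits and fragmentation.

```javascript
// Illustrative header parser; returns null if more bytes are needed.
function parseFrameHeader(bytes) {
  if (bytes.length < 2) return null;
  const fin = (bytes[0] & 0x80) !== 0;      // top bit of byte 0
  const opcode = bytes[0] & 0x0f;           // low 4 bits of byte 0
  const masked = (bytes[1] & 0x80) !== 0;   // top bit of byte 1
  let payloadLength = bytes[1] & 0x7f;      // low 7 bits of byte 1
  let offset = 2;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  if (payloadLength === 126) {
    // 16-bit extended length, big-endian
    if (bytes.length < offset + 2) return null;
    payloadLength = view.getUint16(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    // 64-bit extended length, big-endian
    if (bytes.length < offset + 8) return null;
    const big = view.getBigUint64(offset);
    if (big > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError("payload too large for this sketch");
    }
    payloadLength = Number(big);
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    if (bytes.length < offset + 4) return null;
    maskingKey = bytes.slice(offset, offset + 4);
    offset += 4;
  }
  return { fin, opcode, masked, payloadLength, maskingKey, headerLength: offset };
}

// Masking is a byte-wise XOR with the 4-byte key, repeated cyclically.
function unmask(payload, maskingKey) {
  const out = new Uint8Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}

// A masked single-frame text message containing "Hello" (example from RFC 6455).
const frame = Uint8Array.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
const header = parseFrameHeader(frame);
const body = frame.subarray(header.headerLength, header.headerLength + header.payloadLength);
console.log(new TextDecoder().decode(unmask(body, header.maskingKey))); // prints "Hello"
```

Because masking is just XOR, applying the same function with the same key both masks and unmasks a payload.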

Ping/Pong: Keeping Connections Alive

WebSocket connections can sit idle for minutes. Firewalls and load balancers might kill idle TCP connections. Ping/pong frames solve this.

Either side may send a ping frame, though in practice it is usually the server. The receiver must respond with a pong frame containing the same payload. If the pong doesn’t come back within a timeout, the sender knows the other side is gone.

// Most libraries handle this automatically
// But here's what happens under the hood:
// Server sends: opcode 0x9 (ping), payload: "heartbeat"
// Client sends: opcode 0xA (pong), payload: "heartbeat"

The protocol doesn’t fix an interval. Many server implementations ping every 30-60 seconds, and the value is usually configurable.
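On the wire, a ping is just a very small frame. As a rough sketch, this is how a server-side (unmasked) control frame could be assembled. The helper name is hypothetical, and a client would additionally have to mask the frame.

```javascript
// Hypothetical helper: builds an unmasked control frame (ping, pong, or close).
function buildServerControlFrame(opcode, payloadText = "") {
  const payload = new TextEncoder().encode(payloadText);
  // Control frames cannot be fragmented and carry at most 125 bytes.
  if (payload.length > 125) {
    throw new RangeError("control frame payload must be 125 bytes or less");
  }
  const frame = new Uint8Array(2 + payload.length);
  frame[0] = 0x80 | opcode;   // FIN = 1, plus the opcode
  frame[1] = payload.length;  // MASK = 0, 7-bit length
  frame.set(payload, 2);
  return frame;
}

const ping = buildServerControlFrame(0x9, "heartbeat");
console.log(ping.length); // prints 11 (2 header bytes + 9 payload bytes)
```

The matching pong would use opcode 0xA with the same "heartbeat" payload.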

Why Not Just Use HTTP Polling?

With HTTP polling, the client asks “any new data?” every N seconds. Each request is a full HTTP request with headers, cookies, and connection overhead.

With WebSockets:

  • Low overhead per message: after the handshake, each frame carries only 2-14 bytes of header, versus 200+ bytes of HTTP headers
  • Server can push: no waiting for the client to ask
  • Single TCP connection: no opening/closing connections repeatedly
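The 2-14 byte figure follows directly from the frame layout: 2 fixed bytes, plus 0, 2, or 8 bytes of extended length, plus 4 bytes of masking key on client-to-server frames. A small sketch of that arithmetic, with an illustrative function name:

```javascript
// Illustrative: header bytes needed for a frame with the given payload size.
function frameOverhead(payloadLength, masked) {
  let bytes = 2;                                  // FIN/opcode byte + MASK/length byte
  if (payloadLength > 65535) bytes += 8;          // 64-bit extended length
  else if (payloadLength > 125) bytes += 2;       // 16-bit extended length
  if (masked) bytes += 4;                         // masking key (client-to-server only)
  return bytes;
}

console.log(frameOverhead(20, false));     // prints 2  (small server-to-client message)
console.log(frameOverhead(20, true));      // prints 6  (same message from a client)
console.log(frameOverhead(100000, true));  // prints 14 (large client message, worst case)
```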

For a chat app with 1000 users, HTTP polling means 1000 requests per second (if polling every second). WebSockets means 1000 persistent connections that only send data when there’s actually something to send.

The Close Handshake

Closing a WebSocket is also a handshake. One side sends a close frame (opcode 0x8) with a status code. The other side responds with its own close frame. Then the TCP connection closes.

// Status codes you'll see:
// 1000 - Normal closure
// 1001 - Going away (page navigation, server shutdown)
// 1006 - Abnormal closure (no close frame received; reported locally, never sent on the wire)
// 1011 - Internal server error
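Inside a close frame, the status code is the first two bytes of the payload as a big-endian integer, optionally followed by a UTF-8 reason string. Here is a minimal sketch of decoding that payload. The function name is made up, and it assumes the payload has already been unmasked.

```javascript
// Illustrative decoder for the payload of a close frame (opcode 0x8).
function parseClosePayload(payload) {
  // An empty payload is legal; 1005 is the reserved "no status received" code.
  if (payload.length === 0) return { code: 1005, reason: "" };
  if (payload.length === 1) {
    throw new RangeError("close payload must be empty or at least 2 bytes");
  }
  const code = (payload[0] << 8) | payload[1];                   // big-endian 16-bit
  const reason = new TextDecoder().decode(payload.subarray(2));  // optional UTF-8 text
  return { code, reason };
}

// 0x03E9 = 1001 ("going away"), followed by the reason "bye".
const closePayload = Uint8Array.from([0x03, 0xe9, 0x62, 0x79, 0x65]);
const parsed = parseClosePayload(closePayload);
console.log(parsed.code, parsed.reason); // prints 1001 bye
```

In the browser you never parse this yourself: the same values show up as event.code and event.reason on the close event.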

Common Gotchas

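Proxies can break WebSockets. Nginx needs explicit configuration to proxy WebSocket connections, because Upgrade and Connection are hop-by-hop headers that it does not forward by default. Without proxy_http_version 1.1 plus proxy_set_header for both Upgrade and Connection, the backend never sees the upgrade request and the handshake fails.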
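A typical Nginx location block for this looks roughly like the following. The /chat path and the backend upstream name are placeholders for this sketch, so adjust them to your setup.

```nginx
location /chat {
    # "backend" is a placeholder upstream name
    proxy_pass http://backend;

    # The upgrade mechanism requires HTTP/1.1 toward the upstream
    proxy_http_version 1.1;

    # Forward the hop-by-hop headers explicitly
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```

Idle connections may still be closed by the proxy's read timeout, which is one more reason to keep ping/pong enabled.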

WebSockets don’t auto-reconnect. If the connection drops, your client code needs to handle reconnection. Higher-level libraries such as Socket.IO do this for you, but the raw WebSocket API (and the ws library for Node.js) doesn’t.
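One common way to handle this is to reconnect with exponential backoff. The sketch below is one possible approach, not a library API: both function names are made up, the delays are arbitrary defaults, and it assumes an environment with a global WebSocket (such as a browser).

```javascript
// Illustrative: delay doubles with each failed attempt, up to a cap.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Illustrative: reopen the socket whenever it closes.
function connectWithRetry(url, onMessage, attempt = 0) {
  const socket = new WebSocket(url);

  // A successful open resets the backoff.
  socket.addEventListener("open", () => {
    attempt = 0;
  });

  socket.addEventListener("message", (event) => onMessage(event.data));

  // "close" also fires after connection errors, so one handler covers both.
  socket.addEventListener("close", () => {
    setTimeout(
      () => connectWithRetry(url, onMessage, attempt + 1),
      backoffDelay(attempt)
    );
  });

  return socket;
}

console.log(backoffDelay(0), backoffDelay(3), backoffDelay(10)); // prints 1000 8000 30000
```

In production you would usually add random jitter to the delay so that many clients don't reconnect at the same instant after a server restart.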

They bypass CORS. WebSocket connections aren’t subject to the same-origin policy. The server must validate the Origin header manually if it cares about which domains can connect.
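A minimal sketch of that check, meant to run on the server before it answers the handshake with 101. The allowlist contents and the function name are assumptions for illustration.

```javascript
// Hypothetical allowlist: replace with the origins your app is served from.
const ALLOWED_ORIGINS = new Set(["https://example.com"]);

// Illustrative check on the Origin header of the handshake request.
function isAllowedOrigin(originHeader) {
  // Exact match only; substring or prefix checks are easy to bypass.
  return typeof originHeader === "string" && ALLOWED_ORIGINS.has(originHeader);
}

console.log(isAllowedOrigin("https://example.com"));  // prints true
console.log(isAllowedOrigin("https://evil.example")); // prints false
```

Keep in mind that only browsers send a trustworthy Origin header. Non-browser clients can set it to anything, so this guards against cross-site abuse from web pages and does not replace authentication.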

Now you know what’s actually happening when your chat messages appear in real-time.
