Why WebSocket?
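Before WebSocket (RFC 6455, 2011), real-time web applications relied on workarounds: long-polling (HTTP requests held open until data arrives) or Server-Sent Events (one-directional server push). Long-polling pays for a full HTTP request/response cycle on every message, and Server-Sent Events still need a separate HTTP request for anything the client wants to send.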
WebSocket provides a single, persistent, full-duplex channel over TCP, initiated via HTTP. Once the connection is established:
- Either party can send data at any time
- No HTTP headers on each message (the frame header is 2–10 bytes from server to client, and 6–14 bytes from client to server because of the 4-byte masking key)
- No per-message connection setup, so delivery latency is dominated by network transit time
The Opening Handshake
WebSocket starts as an HTTP/1.1 request with an Upgrade header. The server validates the request and responds with 101 Switching Protocols, after which the TCP connection is handed off to the WebSocket protocol.
Client request:
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Extensions: permessage-deflate
Server response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat
Sec-WebSocket-Accept calculation (RFC 6455 Section 1.3):
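import base64, hashlib
# GUID fixed by RFC 6455; it is the same for every connection
MAGIC = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'
# Value of the client's Sec-WebSocket-Key header
key = 'dGhlIHNhbXBsZSBub25jZQ=='
# Accept = base64(SHA-1(key + MAGIC))
accept = base64.b64encode(
    hashlib.sha1((key + MAGIC).encode()).digest()
).decode()
# 's3pPLMBiTxaQ9kYGzzhZRbK+xOo='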
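This handshake proves that the server actually speaks WebSocket: a plain HTTP server or caching intermediary that does not implement the protocol will not produce the correct Sec-WebSocket-Accept value, so the client can reject the response. The magic string is public, so the check guards against accidental or misdirected upgrades, not against a malicious server.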
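The server side of the handshake can be sketched as follows. This is a minimal illustration, not a complete implementation: the function names and the assumption that `headers` arrives as a dict with lower-cased header names are hypothetical, and subprotocol or extension negotiation is left out.

```python
import base64
import hashlib
from typing import Dict

MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # GUID fixed by RFC 6455


def compute_accept(key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + MAGIC).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake_response(headers: Dict[str, str]) -> str:
    """Return a 101 response for a valid upgrade request, else raise ValueError.

    Assumption: `headers` maps lower-cased header names to their values.
    """
    if headers.get("upgrade", "").lower() != "websocket":
        raise ValueError("missing 'Upgrade: websocket'")
    tokens = [t.strip().lower() for t in headers.get("connection", "").split(",")]
    if "upgrade" not in tokens:
        raise ValueError("missing 'Connection: Upgrade'")
    if headers.get("sec-websocket-version") != "13":
        raise ValueError("unsupported WebSocket version")
    key = headers.get("sec-websocket-key", "")
    try:
        # The key must be a base64-encoded 16-byte nonce.
        key_ok = len(base64.b64decode(key, validate=True)) == 16
    except ValueError:  # binascii.Error is a subclass of ValueError
        key_ok = False
    if not key_ok:
        raise ValueError("malformed Sec-WebSocket-Key")
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {compute_accept(key)}\r\n"
        "\r\n"
    )
```

A real server would typically answer a failed check with an HTTP 400 (or 426 for a version mismatch) instead of raising.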
Frame Format
After the handshake, all data is sent as WebSocket frames:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |       (if 16 or 64 bit)       |
|N|V|V|V|       |S|             |                               |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+-------------------------------+
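To make the layout concrete, here is a hedged sketch of a header parser. The `FrameHeader` type and `parse_frame_header` name are illustrative, not from any library. It relies on two details from RFC 6455 Section 5.2: a 7-bit length of 126 or 127 means a 16-bit or 64-bit big-endian length follows, and a 4-byte masking key follows the length fields when the MASK bit is set. The RSV bits are ignored here.

```python
import struct
from typing import NamedTuple, Optional


class FrameHeader(NamedTuple):
    fin: bool
    opcode: int
    masked: bool
    payload_len: int
    masking_key: Optional[bytes]
    header_len: int  # bytes consumed by the header itself


def parse_frame_header(buf: bytes) -> Optional[FrameHeader]:
    """Parse a frame header at the start of buf; return None if more bytes are needed."""
    if len(buf) < 2:
        return None
    fin = bool(buf[0] & 0x80)      # top bit of the first byte
    opcode = buf[0] & 0x0F         # low 4 bits of the first byte
    masked = bool(buf[1] & 0x80)   # top bit of the second byte
    length = buf[1] & 0x7F         # 7-bit payload length
    offset = 2
    if length == 126:              # 16-bit extended length
        if len(buf) < offset + 2:
            return None
        (length,) = struct.unpack_from(">H", buf, offset)
        offset += 2
    elif length == 127:            # 64-bit extended length
        if len(buf) < offset + 8:
            return None
        (length,) = struct.unpack_from(">Q", buf, offset)
        offset += 8
    key = None
    if masked:                     # 4-byte masking key follows the length
        if len(buf) < offset + 4:
            return None
        key = buf[offset:offset + 4]
        offset += 4
    return FrameHeader(fin, opcode, masked, length, key, offset)
```

Returning `None` for an incomplete header lets a caller keep buffering bytes from the socket and retry.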
Opcodes:
| Frame type | Opcode | Meaning |
|---|---|---|
| Continuation | 0x0 | Fragment continuation |
| Text | 0x1 | UTF-8 text data |
| Binary | 0x2 | Binary data |
| Close | 0x8 | Close the connection |
| Ping | 0x9 | Keepalive ping |
| Pong | 0xA | Ping response |
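In code, the table maps naturally onto an enumeration. The sketch below is illustrative (the `Opcode`, `is_control`, and `classify` names are assumptions); it uses the fact that control opcodes occupy 0x8–0xF, so they can be recognised by the high bit of the 4-bit opcode field.

```python
from enum import IntEnum


class Opcode(IntEnum):
    CONTINUATION = 0x0
    TEXT = 0x1
    BINARY = 0x2
    CLOSE = 0x8
    PING = 0x9
    PONG = 0xA


def is_control(opcode: int) -> bool:
    """Control opcodes (0x8-0xF) have the high bit of the 4-bit field set."""
    return bool(opcode & 0x8)


def classify(opcode: int) -> str:
    """Return the opcode's name, or 'RESERVED' for values not in the table."""
    try:
        return Opcode(opcode).name
    except ValueError:
        return "RESERVED"
```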
Control Frames: Close, Ping, Pong
Control frames (opcodes 0x8–0xF) are never fragmented and have payloads of at most 125 bytes.
Close (0x8): May contain a 2-byte status code followed by an optional UTF-8 reason string. A graceful shutdown completes only once both parties have sent a Close frame. Common close codes:
| Code | Meaning |
|---|---|
| 1000 | Normal Closure |
| 1001 | Going Away (server restart, navigation) |
| 1002 | Protocol Error |
| 1008 | Policy Violation |
| 1011 | Internal Server Error |
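A Close payload is simply the status code as a 2-byte big-endian integer followed by the UTF-8 reason. The helpers below are a minimal sketch with hypothetical names; they do not validate that the code falls in a range the RFC permits on the wire.

```python
import struct
from typing import Optional, Tuple


def build_close_payload(code: int = 1000, reason: str = "") -> bytes:
    """Encode a Close payload: 2-byte big-endian code plus UTF-8 reason."""
    payload = struct.pack(">H", code) + reason.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("Close payload exceeds the 125-byte control frame limit")
    return payload


def parse_close_payload(payload: bytes) -> Tuple[Optional[int], str]:
    """Decode a Close payload; an empty payload means no code was given."""
    if not payload:
        return None, ""
    if len(payload) == 1:
        raise ValueError("a 1-byte Close payload is malformed")
    (code,) = struct.unpack_from(">H", payload)
    return code, payload[2:].decode("utf-8")
```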
Ping (0x9) / Pong (0xA): Either party may send a Ping at any time. The receiver must respond with a Pong containing the same payload data. A Pong may also be sent unsolicited as a one-way heartbeat, in which case no response is expected. A Ping that goes unanswered within a timeout indicates a dead connection.
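Dead-connection detection can be sketched as a small bookkeeping class that matches Pongs to outstanding Pings by payload. The `PingTracker` name, the 8-byte random payload, and the default timeout are all assumptions made for illustration; the class does no I/O, so the caller is responsible for actually sending and receiving frames.

```python
import os
import time
from typing import Dict, Optional


class PingTracker:
    """Tracks outstanding Pings so that unanswered ones can be detected."""

    def __init__(self, timeout: float = 10.0) -> None:
        self.timeout = timeout
        self._pending: Dict[bytes, float] = {}  # payload -> send time

    def new_ping(self, now: Optional[float] = None) -> bytes:
        """Register a Ping and return the payload to send with it."""
        payload = os.urandom(8)  # random payload lets us match the Pong
        self._pending[payload] = time.monotonic() if now is None else now
        return payload

    def on_pong(self, payload: bytes, now: Optional[float] = None) -> Optional[float]:
        """Return the round-trip time for a matching Pong, or None if unsolicited."""
        sent = self._pending.pop(payload, None)
        if sent is None:
            return None
        return (time.monotonic() if now is None else now) - sent

    def is_dead(self, now: Optional[float] = None) -> bool:
        """True if any Ping has been outstanding longer than the timeout."""
        current = time.monotonic() if now is None else now
        return any(current - sent > self.timeout for sent in self._pending.values())
```

The optional `now` parameters exist only to make the logic testable without real clocks.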
Data Framing and Masking
RFC 6455 mandates that all frames sent from client to server must be masked using a 4-byte masking key. Server-to-client frames are never masked.
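# Masking algorithm (XOR with cycling key)
def mask(payload: bytes, key: bytes) -> bytes:
    # XOR each payload byte with the key byte at the same index modulo 4.
    # Applying the function twice with the same key restores the input,
    # so the same routine also unmasks.
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))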
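Masking prevents cache-poisoning attacks on transparent proxies that might misinterpret attacker-controlled WebSocket payloads as HTTP traffic. The client must choose a fresh, unpredictable key for every frame. Masking is a protocol security requirement, not encryption: the key travels in the clear in the frame header.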
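Putting the header layout and masking together, a client-side frame encoder might look like the sketch below. `encode_client_frame` is a hypothetical helper, and the optional `key` argument exists only so the output can be checked against the masked "Hello" example in RFC 6455 Section 5.7; real clients should always let the key be generated randomly.

```python
import os
import struct
from typing import Optional


def mask(payload: bytes, key: bytes) -> bytes:
    """XOR the payload with the 4-byte key, cycling through the key."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


def encode_client_frame(
    payload: bytes,
    opcode: int = 0x1,
    fin: bool = True,
    key: Optional[bytes] = None,
) -> bytes:
    """Build one masked client-to-server frame."""
    if key is None:
        key = os.urandom(4)  # fresh, unpredictable key per frame
    first = (0x80 if fin else 0x00) | (opcode & 0x0F)
    n = len(payload)
    if n <= 125:
        header = struct.pack(">BB", first, 0x80 | n)        # MASK bit + 7-bit length
    elif n <= 0xFFFF:
        header = struct.pack(">BBH", first, 0x80 | 126, n)  # 16-bit extended length
    else:
        header = struct.pack(">BBQ", first, 0x80 | 127, n)  # 64-bit extended length
    return header + key + mask(payload, key)
```

With the key `37 fa 21 3d`, encoding `b"Hello"` yields the bytes `81 85 37 fa 21 3d 7f 9f 4d 51 58`, matching the RFC's example.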
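Large messages may be split into fragments: the first frame carries the data opcode (0x1 or 0x2) with FIN=0, each subsequent frame uses opcode 0x0 (continuation), and the final frame sets FIN=1. Control frames may be interleaved between fragments, but, absent an extension that says otherwise, a second data message cannot start until the first is complete.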
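The fragmentation rules can be illustrated with a pair of helpers that work on simplified `(fin, opcode, payload)` tuples instead of wire bytes. The names and the tuple representation are assumptions for this sketch; control frames are merely skipped during reassembly, whereas a real endpoint would handle them as they arrive.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Frame = Tuple[bool, int, bytes]  # (fin, opcode, payload)


def fragment(payload: bytes, opcode: int, chunk_size: int) -> Iterator[Frame]:
    """Split one message into frames of at most chunk_size payload bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    if not chunks:
        chunks = [b""]  # an empty message is still one frame
    last = len(chunks) - 1
    for i, chunk in enumerate(chunks):
        # Only the first frame carries the real opcode; the rest are continuations.
        yield (i == last, opcode if i == 0 else 0x0, chunk)


def reassemble(frames: Iterable[Frame]) -> Tuple[int, bytes]:
    """Rebuild one message, skipping interleaved control frames."""
    opcode: Optional[int] = None
    parts: List[bytes] = []
    for fin, op, data in frames:
        if op & 0x8:
            continue  # control frame; handled elsewhere in a real endpoint
        if opcode is None:
            if op == 0x0:
                raise ValueError("continuation frame without a starting frame")
            opcode = op
        elif op != 0x0:
            raise ValueError("new data frame before the previous message finished")
        parts.append(data)
        if fin:
            return opcode, b"".join(parts)
    raise ValueError("message incomplete: no frame with FIN=1")
```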
Closing the Connection
RFC 6455 Section 7 defines the closing handshake, an exchange of Close frames followed by TCP teardown:
- The initiating party sends a Close frame (opcode 0x8)
- The receiver sends a Close frame in response
- Both parties close the underlying TCP connection
The initiating side must not send further data frames after sending Close, but must keep reading until it receives the peer's Close frame. If the peer does not respond within a timeout, the initiator may close the TCP connection unilaterally.
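The closing rules amount to a small state machine. The sketch below is one possible way to model it; the state names and the `CloseHandshake` class are illustrative and not taken from the RFC or any library.

```python
from enum import Enum, auto


class State(Enum):
    OPEN = auto()
    CLOSE_SENT = auto()      # we sent Close and are waiting for the peer's Close
    CLOSE_RECEIVED = auto()  # the peer sent Close and we still owe a reply
    CLOSED = auto()          # both Close frames exchanged, or timed out


class CloseHandshake:
    def __init__(self) -> None:
        self.state = State.OPEN

    def can_send_data(self) -> bool:
        """Data frames are only allowed before any Close frame is exchanged."""
        return self.state is State.OPEN

    def send_close(self) -> None:
        if self.state is State.OPEN:
            self.state = State.CLOSE_SENT
        elif self.state is State.CLOSE_RECEIVED:
            self.state = State.CLOSED  # our reply completes the handshake
        else:
            raise RuntimeError("Close frame already sent")

    def receive_close(self) -> None:
        if self.state is State.OPEN:
            self.state = State.CLOSE_RECEIVED
        elif self.state is State.CLOSE_SENT:
            self.state = State.CLOSED  # the peer's reply completes the handshake

    def on_timeout(self) -> None:
        """The peer never answered: give up and drop the TCP connection."""
        self.state = State.CLOSED
```

Once the state reaches `CLOSED`, the caller would close the underlying TCP socket.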
Security Considerations
- Use wss:// (WebSocket Secure): plain ws:// exposes all data to eavesdropping and injection. TLS is mandatory for production.
- Validate the Origin header during the handshake to prevent cross-site WebSocket hijacking (CSWSH). Reject unexpected origins.
- Authenticate before upgrade: pass tokens via URL parameters or the initial HTTP headers during the handshake. WebSocket frames do not carry HTTP cookies after upgrade.
- Rate-limit messages server-side — the absence of HTTP overhead makes WebSocket a potential DoS vector if a client sends thousands of frames per second.
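Two of these checks are easy to sketch in isolation: an Origin allowlist and a per-connection token bucket for rate limiting. Everything named here is hypothetical, including the `ALLOWED_ORIGINS` value and the policy of rejecting requests with no Origin header (non-browser clients often omit it, so a real deployment may choose differently). The token bucket is one common rate-limiting technique, not something the RFC prescribes.

```python
import time
from typing import Optional

ALLOWED_ORIGINS = {"https://example.com"}  # hypothetical allowlist


def is_allowed_origin(origin: Optional[str]) -> bool:
    """Accept only exact matches; a missing Origin is rejected (assumed policy)."""
    return origin is not None and origin in ALLOWED_ORIGINS


class TokenBucket:
    """Allows `rate` messages per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, now: Optional[float] = None) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; call once per incoming message."""
        current = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (current - self.last) * self.rate)
        self.last = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server might keep one `TokenBucket` per connection and close with code 1008 (Policy Violation) when `allow()` returns `False` repeatedly.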