Why Rate Limiting Matters
Rate limiting protects your API from abuse, prevents resource exhaustion, and ensures fair access for all clients. Without it, a single misbehaving client can take down your entire service.
The 429 Too Many Requests Response
When a client exceeds the rate limit, the server responds with:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1672531260
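The Retry-After header tells the client how long to wait, either as a number of seconds or as an HTTP date after which it may retry. The X-RateLimit-* headers are a widely used convention rather than a formal standard, so their exact names and semantics vary between APIs.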
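Because Retry-After can arrive in either form, a client has to handle both. The following is a minimal sketch using only the standard library; the function name `retry_after_seconds` and its `now` parameter are illustrative assumptions, not part of any particular HTTP client.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Return the number of seconds to wait for a Retry-After header value."""
    value = value.strip()
    if value.isdigit():
        # Delay form, e.g. "Retry-After: 60"
        return float(value)
    # Date form, e.g. "Retry-After: Sun, 01 Jan 2023 00:01:00 GMT"
    retry_at = parsedate_to_datetime(value)
    if retry_at.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past.
    return max(0.0, (retry_at - now).total_seconds())
```

A malformed header value raises an exception here; a real client would probably catch that and fall back to its own backoff schedule.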
Server-Side Algorithms
Fixed Window
Count requests per fixed time window (e.g., 100 requests per minute). Simple, but it allows bursts at window boundaries: a client can send 100 requests in the last second of one window and 100 more in the first second of the next, for 200 requests in about 2 seconds.
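The boundary problem is easier to see in code. This is an in-memory sketch, not a production limiter; the class name `FixedWindowLimiter` and the injectable `clock` parameter are assumptions made for illustration and testing.

```python
import time


class FixedWindowLimiter:
    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = {}  # key -> (window index, request count)

    def allow(self, key):
        # Every timestamp maps to exactly one window index.
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self.counts.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            return False
        self.counts[key] = (window, count + 1)
        return True
```

With a limit of 100 per 60 seconds, 100 calls at t=59 and 100 more at t=60 are all allowed, because the two timestamps fall into different windows.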
Sliding Window Log
Store the timestamp of each request and count the timestamps that fall within a rolling window. More accurate than a fixed window, but memory grows with traffic (one entry per request).
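A sliding window log might look like the sketch below, which keeps one deque of timestamps per client in memory. The class name `SlidingWindowLog` and the `clock` parameter are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.logs = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        log = self.logs[key]
        # Drop timestamps that have fallen out of the rolling window.
        while log and log[0] <= now - self.window_seconds:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

Only accepted requests are logged here, so a client that keeps hammering the API while limited does not extend its own penalty. Logging rejected requests as well is a stricter variant.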
Sliding Window Counter
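Keeps counts for the current and previous fixed windows, then estimates the rolling count by weighting the previous window's count by the fraction of it that still overlaps the rolling window. Good balance of accuracy and memory usage, and a common choice in production systems.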
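The weighted calculation can be sketched as follows. The estimate assumes requests in the previous window were spread evenly, which is an approximation. The class name `SlidingWindowCounter` and the `clock` parameter are illustrative assumptions.

```python
import time


class SlidingWindowCounter:
    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # key -> (window index, current count, previous count)

    def allow(self, key):
        now = self.clock()
        window = int(now // self.window_seconds)
        stored, current, previous = self.state.get(key, (window, 0, 0))
        if stored != window:
            # Roll forward. The old count only matters if it belongs to the
            # immediately preceding window; anything older is discarded.
            previous = current if stored == window - 1 else 0
            current = 0
        # Fraction of the current window that has elapsed, in [0, 1).
        elapsed = (now % self.window_seconds) / self.window_seconds
        # Weight the previous window by how much of it is still in view.
        estimated = previous * (1.0 - elapsed) + current
        allowed = estimated < self.limit
        if allowed:
            current += 1
        self.state[key] = (window, current, previous)
        return allowed
```

Compared with a plain fixed window, a client that used its full quota at the end of one window gets nothing at the very start of the next, and regains capacity gradually as the previous window slides out of view.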
Token Bucket
Tokens are added at a fixed rate. Each request consumes one token. Allows controlled bursts (up to the bucket capacity) while enforcing a long-term average rate.
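A token bucket does not need a background timer; tokens can be refilled lazily whenever a request arrives. The sketch below takes that approach. The class name `TokenBucket`, the `cost` parameter, and the choice to start with a full bucket are assumptions.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill based on the time elapsed since the last call, capped at capacity.
        refill = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.updated = now
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True
```

With `rate=1` and `capacity=5`, a client can burst 5 requests at once, then is held to roughly one request per second until it pauses long enough for the bucket to refill.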
Leaky Bucket
Requests enter a queue that drains at a constant rate. Smooths out bursts entirely. Used when you need a perfectly steady request rate.
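One way to sketch a leaky bucket without a real queue is to hand each request the time slot at which it may be processed, with slots spaced at the drain interval. The class name `LeakyBucket`, the method name `enqueue`, and the interpretation of `capacity` as a bound on queueing delay are assumptions for this example.

```python
import time


class LeakyBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.interval = 1.0 / rate  # seconds between drained requests
        self.max_wait = capacity * self.interval  # longest allowed queueing delay
        self.clock = clock
        self.next_slot = clock()  # time at which the next request may drain

    def enqueue(self):
        """Return the delay before the request drains, or None if the queue is full."""
        now = self.clock()
        slot = max(now, self.next_slot)
        delay = slot - now
        if delay > self.max_wait:
            return None  # queue full: reject instead of waiting longer
        self.next_slot = slot + self.interval
        return delay
```

The caller is expected to wait for the returned delay before processing the request, which yields a steady output rate regardless of how bursty the input is.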
Rate Limit Scopes
- Per API key — Most common for public APIs
- Per IP address — Fallback when no auth is present
- Per user — For authenticated endpoints
- Per endpoint — Different limits for different operations
- Global — Overall system protection
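Scopes are often combined, with a request checked against several counters at once. The sketch below shows one possible way to derive counter keys for a request; the function name `rate_limit_keys` and the key format are hypothetical, not a convention of any specific framework.

```python
def rate_limit_keys(endpoint, ip, api_key=None, user_id=None):
    """Return the counter keys a single request should be checked against."""
    keys = ["global"]  # overall system protection
    if api_key:
        keys.append(f"key:{api_key}")
    if user_id:
        keys.append(f"user:{user_id}")
    if not api_key and not user_id:
        # Fall back to the client IP only when the request is unauthenticated.
        keys.append(f"ip:{ip}")
    # Per-endpoint limit, tracked separately for each client identity.
    client = api_key or user_id or ip
    keys.append(f"endpoint:{endpoint}:{client}")
    return keys
```

Each key would then be passed to a limiter with its own limit, and the request rejected if any of them is exceeded.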
Client-Side Best Practices
- Always check for 429 and respect Retry-After
- Implement exponential backoff — wait 1s, 2s, 4s, 8s...
- Add jitter — randomize retry timing to avoid thundering herd
- Track rate limit headers — stop before hitting the limit
- Queue requests — use a client-side rate limiter to stay within bounds
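Several of these practices fit into one small retry helper. The sketch below is not tied to any HTTP library: `send` is a hypothetical callable returning a `(status_code, headers)` pair, and the code assumes Retry-After arrives in its seconds form. The jitter strategy shown, a random delay between zero and the exponential ceiling, is one common choice among several.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None, rng=random.random):
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    if retry_after is not None:
        # The server's instruction takes priority over our own schedule.
        return float(retry_after)
    # Exponential backoff: 1s, 2s, 4s, 8s... capped at `cap`.
    ceiling = min(cap, base * 2 ** attempt)
    # Jitter: pick a random delay up to the ceiling so clients do not retry in lockstep.
    return rng() * ceiling


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        if attempt + 1 < max_attempts:
            sleep(backoff_delay(attempt, retry_after=headers.get("Retry-After")))
    return status
```

The `sleep` and `rng` parameters exist only so the timing can be replaced in tests.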
Implementation with Redis
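# Fixed window counter (sketch; assumes the redis-py client and a reachable server).
# One counter per key that expires with the window is a fixed window, not a sliding one.
import redis

r = redis.Redis()

def is_rate_limited(key, limit, window_seconds):
    # INCR is atomic and creates the key with value 1 if it does not exist,
    # which avoids the read-then-write race of GET followed by INCR.
    current = r.incr(key)
    if current == 1:
        # Set the expiry only on the first request, so later requests
        # do not keep extending the window.
        r.expire(key, window_seconds)
    return current > limit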