How to Implement and Handle Rate Limiting (429)

Everything about rate limiting — server-side implementation strategies, client-side handling, and the 429 Too Many Requests response.

Why Rate Limiting Matters

Rate limiting protects your API from abuse, prevents resource exhaustion, and ensures fair access for all clients. Without it, a single misbehaving client can take down your entire service.

The 429 Too Many Requests Response

When a client exceeds the rate limit, the server responds with:

HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1672531260

The Retry-After header tells the client how long to wait, either as delta-seconds or as an HTTP-date. The X-RateLimit-* headers are a widely used convention rather than part of the HTTP standard.
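Since Retry-After can carry either form, a client needs to handle both. A minimal sketch (the helper name is illustrative, not a standard API):

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def retry_after_seconds(value: str) -> float:
    """Parse a Retry-After value: either delta-seconds or an HTTP-date."""
    try:
        return float(value)            # e.g. "60"
    except ValueError:
        dt = parsedate_to_datetime(value)   # e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        return max(0.0, (dt - datetime.now(timezone.utc)).total_seconds())
```

A date in the past yields 0.0, meaning the client may retry immediately.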

Server-Side Algorithms

Fixed Window

Count requests per fixed time window (e.g., 100 requests per minute). Simple but allows burst at window boundaries — a client can send 200 requests in 2 seconds by hitting the boundary between two windows.
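An in-memory sketch of the idea (single process; production systems would back this with a shared store):

```python
class FixedWindow:
    """Naive fixed-window counter: `limit` requests per `window` seconds."""
    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.bucket = 0        # index of the current window
        self.count = 0

    def allow(self, now):
        bucket = int(now // self.window)
        if bucket != self.bucket:
            self.bucket, self.count = bucket, 0   # new window: reset
        if self.count >= self.limit:
            return False
        self.count += 1
        return True
```

Note the boundary problem: a client can spend the full limit at the end of one window and again at the start of the next, so up to 2x the limit passes in a short burst.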

Sliding Window Log

Store timestamps of each request. Count requests in a rolling window. More accurate but uses more memory (one entry per request).
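A sketch of the log approach, keeping one timestamp per accepted request:

```python
from collections import deque

class SlidingWindowLog:
    """Count exact requests inside a rolling window of `window` seconds."""
    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.log = deque()     # timestamps of accepted requests

    def allow(self, now):
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()             # drop entries that rolled out
        if len(self.log) >= self.limit:
            return False
        self.log.append(now)
        return True
```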

Sliding Window Counter

Combines fixed window counts with a weighted calculation that approximates a rolling window. A good balance of accuracy and memory usage, and a common choice in production systems.
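A sketch of the common formulation, where the previous window's count is weighted by how much of it still overlaps the rolling window:

```python
class SlidingWindowCounter:
    """Approximate a rolling window from two fixed-window counts."""
    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.bucket, self.curr, self.prev = 0, 0, 0

    def allow(self, now):
        bucket = int(now // self.window)
        if bucket != self.bucket:
            # carry at most one window back; anything older drops to zero
            self.prev = self.curr if bucket == self.bucket + 1 else 0
            self.bucket, self.curr = bucket, 0
        # fraction of the previous window still inside the rolling window
        weight = 1.0 - (now % self.window) / self.window
        if self.prev * weight + self.curr >= self.limit:
            return False
        self.curr += 1
        return True
```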

Token Bucket

Tokens are added at a fixed rate. Each request consumes one token. Allows controlled bursts (up to the bucket capacity) while enforcing a long-term average rate.
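A sketch of the standard formulation, refilling lazily on each check rather than on a timer:

```python
class TokenBucket:
    """Refill `rate` tokens per second; allow bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # refill for the elapsed time, capped at the bucket capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```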

Leaky Bucket

Requests enter a queue that drains at a constant rate. Smooths out bursts entirely. Used when you need a perfectly steady request rate.
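A sketch of the "leaky bucket as meter" variant, which tracks queue depth rather than holding actual requests:

```python
class LeakyBucket:
    """Queue depth leaks at `rate` per second; reject when it would overflow."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.level = 0.0       # current queue depth
        self.last = 0.0

    def allow(self, now):
        # drain whatever leaked out since the last check
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True
```

The queueing variant additionally holds the accepted requests and releases them at the drain rate, which is what produces the perfectly steady output.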

Rate Limit Scopes

  • Per API key — Most common for public APIs
  • Per IP address — Fallback when no auth is present
  • Per user — For authenticated endpoints
  • Per endpoint — Different limits for different operations
  • Global — Overall system protection
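In practice each scope becomes part of the counter key in the backing store. One hypothetical naming scheme (the `rl:` prefix and layout are illustrative, not a standard):

```python
def rate_limit_key(scope, identifier, endpoint=None):
    """Build a counter key like 'rl:api_key:abc123:POST /orders'."""
    parts = ["rl", scope, identifier]
    if endpoint:
        parts.append(endpoint)     # per-endpoint limits get their own counter
    return ":".join(parts)
```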

Client-Side Best Practices

  • Always check for 429 and respect Retry-After
  • Implement exponential backoff — wait 1s, 2s, 4s, 8s...
  • Add jitter — randomize retry timing to avoid thundering herd
  • Track rate limit headers — stop before hitting the limit
  • Queue requests — use a client-side rate limiter to stay within bounds
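The backoff-with-jitter and Retry-After rules above can be combined in one small helper (a sketch; the "full jitter" strategy shown is one common choice among several):

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None):
    """Seconds to wait before retry number `attempt` (0-based).

    Honors the server's Retry-After when present; otherwise picks a
    random delay in [0, min(cap, base * 2**attempt)] ("full jitter").
    """
    if retry_after is not None:
        return float(retry_after)      # the server knows best
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Randomizing over the whole interval, rather than adding a small offset, is what prevents a thundering herd of synchronized retries.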

Implementation with Redis

# Fixed window counter (pseudocode)
def is_rate_limited(key, limit, window_seconds):
    count = redis.incr(key)          # atomic: avoids a get-then-set race
    if count == 1:
        # set the TTL only when the window starts; refreshing it on
        # every request would keep the window open indefinitely
        redis.expire(key, window_seconds)
    return count > limit
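The pseudocode assumes a Redis client is in scope. The same logic can be exercised end to end with a minimal in-memory stand-in for the two commands it uses (a sketch for illustration, not a Redis client):

```python
class FakeRedis:
    """Tiny in-memory stand-in for INCR and EXPIRE, with explicit clocks."""
    def __init__(self):
        self.data = {}     # key -> (count, expires_at)

    def incr(self, key, now):
        count, expires = self.data.get(key, (0, float("inf")))
        if now >= expires:                 # key expired: start a new window
            count, expires = 0, float("inf")
        self.data[key] = (count + 1, expires)
        return count + 1

    def expire(self, key, seconds, now):
        count, _ = self.data[key]
        self.data[key] = (count, now + seconds)

def is_rate_limited(r, key, limit, window_seconds, now):
    count = r.incr(key, now)
    if count == 1:
        r.expire(key, window_seconds, now)
    return count > limit
```

With a real Redis deployment, the INCR-then-EXPIRE pair is usually wrapped in a pipeline or a server-side script so both commands execute atomically.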
