What Triggers 429?
429 Too Many Requests is returned when a client exceeds the rate limit set by the server. Rate limits exist to protect server resources, ensure fair usage among clients, and prevent abuse.
Rate limits are usually measured in:
- Requests per second — hard ceiling for burst protection
- Requests per minute/hour/day — sustained throughput quotas
- Concurrent connections — simultaneous open connections
- Points per window — weighted limits where expensive operations cost more points than cheap ones
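The points-per-window model can be sketched as a small counter: each operation spends some number of points from a budget that refills when the window rolls over. The class name, costs, and reset semantics below are illustrative, not any particular API's:

```python
import time

class PointsWindow:
    """Illustrative points-per-window limiter: operations spend points
    against a budget that refills when the window rolls over."""

    def __init__(self, points_per_window: int, window_seconds: float) -> None:
        self.budget = points_per_window
        self.window = window_seconds
        self.remaining = points_per_window
        self.window_start = time.monotonic()

    def try_spend(self, cost: int) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.remaining = self.budget      # window rolled over: refill budget
            self.window_start = now
        if cost > self.remaining:
            return False                      # would exceed quota: caller should back off
        self.remaining -= cost
        return True
```

Under this model a heavy search query might cost 10 points while a simple lookup costs 1, so the server can price operations by their actual expense.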
Reading Rate Limit Headers
Rate limit headers were not standardized until recently (the IETF httpapi working group's `RateLimit` header fields draft), but most APIs already report rate-limit state in response headers:

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710000000
Retry-After: 47
```
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Total requests allowed in the window |
| `X-RateLimit-Remaining` | Requests left in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `Retry-After` | Seconds (or date) until the client may retry |
`Retry-After` is the most important header; always respect it. Ignoring it and retrying immediately will extend your ban window on many APIs.
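Because `Retry-After` may carry either delta-seconds (as above) or an HTTP-date (RFC 9110 allows both), clients should handle both forms. A minimal sketch of a normalizer; the helper name is hypothetical and error handling is omitted:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value: str) -> float:
    """Normalize a Retry-After value to seconds the client should wait."""
    if value.isdigit():
        return float(value)                    # delta-seconds form, e.g. "47"
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    dt = parsedate_to_datetime(value)
    return max(0.0, (dt - datetime.now(timezone.utc)).total_seconds())
```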
Common Causes
Unthrottled loops: a for loop making API calls with no delay between iterations is the single most common cause of 429 errors.
```python
# Bad — fires 10,000 requests with no throttle
for user_id in user_ids:
    fetch_user(user_id)
```
Retry storms: multiple threads or processes retrying simultaneously after a 429 amplifies the problem.
Missing backoff: retrying immediately after a 429 instead of waiting the Retry-After interval.
Token sharing: multiple services sharing one API key. Rate limits are per-key, so a noisy service consumes quota for all others.
Client-Side Solutions
1. Respect `Retry-After`:
```python
import time

import requests

def api_call_with_backoff(url: str) -> dict:
    for attempt in range(5):
        response = requests.get(url)
        if response.status_code == 429:
            # Assumes the delta-seconds form; Retry-After may also be an HTTP-date.
            retry_after = int(response.headers.get('Retry-After', 60))
            time.sleep(retry_after)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError('Rate limit exceeded after 5 attempts')
```
2. Token bucket / leaky bucket throttling:
```python
import time

class RateLimiter:
    """Paces calls so that at most `calls_per_second` are issued."""

    def __init__(self, calls_per_second: float) -> None:
        self.min_interval = 1.0 / calls_per_second
        self.last_call = 0.0

    def wait(self) -> None:
        # Sleep just long enough to keep the minimum interval between calls.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Usage: limiter = RateLimiter(calls_per_second=10), then
# call limiter.wait() before each request in the loop.
```
3. Exponential backoff with jitter — when Retry-After is absent:
```python
import random

def backoff(attempt: int) -> float:
    """Exponential delay capped at 60 s, plus up to 1 s of random jitter."""
    return min(2 ** attempt + random.uniform(0, 1), 60)
```
Jitter prevents the *thundering herd* problem when many clients retry simultaneously.
Server-Side Configuration
Nginx rate limiting:
```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/m;

location /api/ {
    limit_req zone=api burst=20 nodelay;
    limit_req_status 429;
}
```
Always set `limit_req_status 429`: the default is 503, which misleads clients into thinking the server is failing rather than telling them they are being rate-limited.
Testing Rate Limits
```sh
# Fire 50 concurrent requests
seq 50 | xargs -P 50 -I{} curl -s -o /dev/null -w '%{http_code}\n' \
  https://api.example.com/endpoint

# Using Apache Bench
ab -n 200 -c 50 https://api.example.com/endpoint
```
Rate Limit Bypass Prevention
Clients may attempt to bypass limits by rotating IP addresses or API keys. Defenses include:
- Rate limiting by user ID, not just by IP
- Anomaly detection to flag abnormal request patterns
- Requiring authenticated access for high-quota endpoints