What Triggers 429?
429 Too Many Requests is returned when a client exceeds the rate limit set by the server. Rate limits exist to protect server resources, ensure fair usage among clients, and prevent abuse.
Rate limits are usually measured in:
- Requests per second — hard ceiling for burst protection
- Requests per minute/hour/day — sustained throughput quotas
- Concurrent connections — simultaneous open connections
- Points per window — weighted limits where expensive operations cost more points than cheap ones
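The points-per-window model can be sketched as a small counter: each operation spends some number of points from a budget that refills when the window rolls over. The class name, costs, and reset semantics below are illustrative, not any particular API's:

```python
import time

class PointsWindow:
    """Illustrative points-per-window limiter: operations spend points
    against a budget that refills when the window rolls over."""

    def __init__(self, points_per_window: int, window_seconds: float) -> None:
        self.budget = points_per_window
        self.window = window_seconds
        self.remaining = points_per_window
        self.window_start = time.monotonic()

    def try_spend(self, cost: int) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.remaining = self.budget      # window rolled over: refill budget
            self.window_start = now
        if cost > self.remaining:
            return False                      # would exceed quota: caller should back off
        self.remaining -= cost
        return True
```

Under this model a heavy search query might cost 10 points while a simple lookup costs 1, so the server can price operations by their actual expense.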
Reading Rate Limit Headers
Rate limit headers were not standardized until recently (the IETF httpapi working group's `RateLimit` header fields draft), but most APIs already report rate-limit state in response headers:

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710000000
Retry-After: 47
```
| Header | Meaning |
|---|---|
| `X-RateLimit-Limit` | Total requests allowed in the window |
| `X-RateLimit-Remaining` | Requests left in the current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
| `Retry-After` | Seconds (or date) until the client may retry |
`Retry-After` is the most important header; always respect it. Ignoring it and retrying immediately will extend your ban window on many APIs.
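Because `Retry-After` may carry either delta-seconds (as above) or an HTTP-date (RFC 9110 allows both), clients should handle both forms. A minimal sketch of a normalizer; the helper name is hypothetical and error handling is omitted:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value: str) -> float:
    """Normalize a Retry-After value to seconds the client should wait."""
    if value.isdigit():
        return float(value)                    # delta-seconds form, e.g. "47"
    # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    dt = parsedate_to_datetime(value)
    return max(0.0, (dt - datetime.now(timezone.utc)).total_seconds())
```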
Common Causes
Unthrottled loops: a for loop making API calls with no delay between iterations is the single most common cause of 429 errors.
```python
# Bad — fires 10,000 requests with no throttle
for user_id in user_ids:
    fetch_user(user_id)
```
Retry storms: multiple threads or processes retrying simultaneously after a 429 amplifies the problem.
Missing backoff: retrying immediately after a 429 instead of waiting the Retry-After interval.
Token sharing: multiple services sharing one API key. Rate limits are per-key, so a noisy service consumes quota for all others.
Client-Side Solutions
1. Respect `Retry-After`:
```python
import time

import requests

def api_call_with_backoff(url: str) -> dict:
    for attempt in range(5):
        response = requests.get(url)
        if response.status_code == 429:
            # Assumes the delta-seconds form; Retry-After may also be an HTTP-date.
            retry_after = int(response.headers.get('Retry-After', 60))
            time.sleep(retry_after)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError('Rate limit exceeded after 5 attempts')
```
2. Token bucket / leaky bucket throttling:
```python
import time

class RateLimiter:
    """Paces calls so that at most `calls_per_second` are issued."""

    def __init__(self, calls_per_second: float) -> None:
        self.min_interval = 1.0 / calls_per_second
        self.last_call = 0.0

    def wait(self) -> None:
        # Sleep just long enough to keep the minimum interval between calls.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Usage: limiter = RateLimiter(calls_per_second=10), then
# call limiter.wait() before each request in the loop.
```
3. Exponential backoff with jitter — when Retry-After is absent:
```python
import random

def backoff(attempt: int) -> float:
    """Exponential delay capped at 60 s, plus up to 1 s of random jitter."""
    return min(2 ** attempt + random.uniform(0, 1), 60)
```
Jitter prevents the *thundering herd* problem when many clients retry simultaneously.
Server-Side Configuration
Nginx rate limiting:
```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/m;

location /api/ {
    limit_req zone=api burst=20 nodelay;
    limit_req_status 429;
}
```
Always set `limit_req_status 429`: the default is 503, which misleads clients into thinking the server is failing rather than telling them they are being rate-limited.
Testing Rate Limits
```sh
# Fire 50 concurrent requests
seq 50 | xargs -P 50 -I{} curl -s -o /dev/null -w '%{http_code}\n' \
  https://api.example.com/endpoint

# Using Apache Bench
ab -n 200 -c 50 https://api.example.com/endpoint
```
Rate Limit Bypass Prevention
Clients may attempt to bypass limits by rotating IP addresses or API keys. Defenses include:
- Rate limiting by user ID, not just by IP
- Anomaly detection to flag abnormal request patterns
- Requiring authenticated access for high-quota endpoints