Why Cache API Responses?
Caching is one of the highest-leverage performance optimizations for HTTP APIs. A response served from cache costs the origin no CPU and no database queries, and a nearby CDN edge node can often deliver it in single-digit milliseconds. Even modest cache hit rates reduce origin load and improve tail latency.
Beyond raw speed, caching has strategic value:
- Cost reduction — fewer compute cycles and database reads
- Resilience — stale cache can serve traffic during origin outages
- Scalability — cache absorbs traffic spikes without horizontal scaling
Cache-Control Directives Deep Dive
The Cache-Control response header is the primary mechanism for instructing caches. It supersedes the older Expires header (when both are present, max-age takes precedence) and gives far more expressive control.
Freshness Directives
| Directive | Meaning |
|---|---|
| `max-age=N` | Response is fresh for N seconds after the origin generated it |
| `s-maxage=N` | Shared cache (CDN) TTL, overrides `max-age` for CDNs |
| `no-store` | Never store this response anywhere |
| `no-cache` | Store but revalidate with origin before serving |
| `private` | Only the user's browser may cache (not CDNs) |
| `public` | Any cache (browser, CDN, proxy) may store |
# Public API endpoint: CDN caches for 5 min, browser for 1 min
Cache-Control: public, s-maxage=300, max-age=60
# Authenticated endpoint: browser only, no CDN caching
Cache-Control: private, max-age=120
# Never cache (e.g. OTP endpoints)
Cache-Control: no-store
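To make the directive table concrete, here is a minimal Python sketch of how a cache might turn a Cache-Control value into a freshness lifetime. The function names `parse_cache_control` and `freshness_lifetime` are hypothetical, and the parsing is deliberately simplified: it splits on commas, so quoted arguments that contain commas are not handled, and a response with no explicit lifetime is treated as "always revalidate" rather than given a heuristic lifetime as real caches may do.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> Optional[int]:
    """Seconds the response may be reused without revalidation.

    Returns None when this kind of cache must not store the response.
    `shared=True` models a CDN or proxy, `shared=False` a browser.
    """
    d = parse_cache_control(header)
    if "no-store" in d:
        return None
    if shared and "private" in d:
        return None
    if "no-cache" in d:
        return 0  # storable, but must revalidate before every reuse
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0  # assumption of this sketch: no explicit lifetime means revalidate


print(freshness_lifetime("public, s-maxage=300, max-age=60", shared=True))
print(freshness_lifetime("public, s-maxage=300, max-age=60", shared=False))
```

For the three example headers above, this yields 300 seconds at the CDN and 60 in the browser for the public endpoint, no shared-cache storage for the private one, and no storage at all for no-store.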
Revalidation Directives
no-cache is the most misunderstood directive: it does NOT mean "do not cache". It means: *store the response, but revalidate it with the origin before every reuse*. The origin can respond 304 Not Modified to confirm the cached copy is still valid, which saves bandwidth but still costs a round trip.
ETags for Conditional Requests
An ETag is an opaque identifier for a specific version of a resource. The server generates it (usually a hash of the content or a version timestamp) and sends it in the response:
HTTP/1.1 200 OK
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
Cache-Control: no-cache
On subsequent requests, the client sends the ETag back in If-None-Match:
GET /api/products/42 HTTP/1.1
If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"
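If the resource has not changed, the server responds with 304 Not Modified: no body, minimal bandwidth. ETags are especially useful for large resources that change infrequently. They also save computation when the server can derive the ETag (for example, from a version number) without rebuilding the response body.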
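The exchange above can be sketched in Python, independent of any web framework. The helper names `make_etag` and `respond` are hypothetical, and hashing the body with SHA-256 is just one reasonable way to produce an ETag. If-None-Match is compared weakly, so a `W/` prefix on the client's value is ignored.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Return a strong ETag: the quoted SHA-256 hex digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); answer 304 when the client's copy matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # The header may carry several comma-separated ETags, or "*".
        for candidate in if_none_match.split(","):
            candidate = candidate.strip()
            if candidate.startswith("W/"):
                candidate = candidate[2:]  # weak comparison ignores the W/ prefix
            if candidate == "*" or candidate == etag:
                return 304, headers, b""
    return 200, headers, body


status, headers, _ = respond(b'{"id": 42, "name": "Widget"}')
revalidated, _, payload = respond(b'{"id": 42, "name": "Widget"}', headers["ETag"])
print(status, revalidated, len(payload))
```

Note that this sketch still builds the body in order to hash it, so it saves bandwidth only. Deriving the ETag from a stored version number or update timestamp would also let the server skip building the body.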
Strong vs Weak ETags:
- Strong ("abc123"): byte-for-byte identical
- Weak (W/"abc123"): semantically equivalent but possibly different bytes
Use strong ETags for range requests; weak ETags are fine for standard cache revalidation.
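The difference shows up in how two ETags are compared. The following Python sketch follows the strong and weak comparison rules as described in the HTTP specification; the function names are hypothetical.

```python
from typing import Tuple


def _split(etag: str) -> Tuple[bool, str]:
    """Return (is_weak, opaque_tag) for an ETag such as W/"abc123"."""
    weak = etag.startswith("W/")
    return weak, etag[2:] if weak else etag


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and the opaque tags are equal."""
    weak_a, tag_a = _split(a)
    weak_b, tag_b = _split(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: the opaque tags are equal, weakness is ignored."""
    return _split(a)[1] == _split(b)[1]


print(strong_match('"abc123"', '"abc123"'))    # both strong, same tag
print(strong_match('W/"abc123"', '"abc123"'))  # one side is weak
print(weak_match('W/"abc123"', '"abc123"'))    # weakness ignored
```

Range requests need the strong comparison because byte offsets are only meaningful when the bytes are identical.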
Stale-While-Revalidate
The stale-while-revalidate extension lets the cache serve a stale response immediately while fetching a fresh copy in the background:
Cache-Control: max-age=60, stale-while-revalidate=600
This means: *serve fresh for 60 seconds, then serve stale for up to 600 more seconds while revalidating asynchronously*. The user always gets a fast response; freshness catches up in the background. This pattern is ideal for content that is frequently read but tolerates brief staleness (e.g., product catalog, configuration endpoints).
Combined with stale-if-error=86400, you can continue serving stale responses for up to a day during origin outages.
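As a rough illustration of how a cache could combine these windows, the Python sketch below classifies a cached response by its age. The function name `decide` and its return labels are hypothetical; real caches apply additional rules (for example must-revalidate) that are omitted here.

```python
def decide(age: int, max_age: int, stale_while_revalidate: int = 0,
           stale_if_error: int = 0, origin_failed: bool = False) -> str:
    """Classify what a cache may do with a stored response of the given age (seconds)."""
    if age < max_age:
        return "serve_fresh"
    staleness = age - max_age  # seconds since the response became stale
    if staleness < stale_while_revalidate:
        return "serve_stale_revalidate_in_background"
    if origin_failed and staleness < stale_if_error:
        return "serve_stale_origin_failed"
    return "fetch_from_origin"


# Cache-Control: max-age=60, stale-while-revalidate=600, stale-if-error=86400
for age in (30, 300, 700):
    print(age, decide(age, 60, 600, 86400))
print(700, decide(700, 60, 600, 86400, origin_failed=True))
```

With these values, a response that is 30 seconds old is fresh, one that is 300 seconds old is served stale while a background refresh runs, and one that is 700 seconds old is refetched unless the origin is failing.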
CDN vs Application Caching
| Layer | Where | Best For |
|---|---|---|
| Browser cache | User's device | Repeat visits, assets |
| CDN / edge cache | ISP-adjacent PoP | Public API responses, static assets |
| Reverse proxy (Nginx/Varnish) | Your data center | Reducing app server load |
| Application cache (Redis) | In-memory store beside your app servers | Expensive DB queries, sessions |
CDN caching applies to public responses and respects s-maxage. Application-level caching (Redis, Memcached) is your backstop for private or no-store responses that cannot be cached by intermediaries.
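Application-level caching usually follows the cache-aside pattern: check the cache, and on a miss load from the database and store the result with a TTL. The sketch below uses an in-process dictionary as a stand-in for Redis or Memcached so that it runs without external services; the class `TTLCache`, the key format, and the loader function are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for an application cache such as Redis."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, ttl: float, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader() and cache its result for ttl seconds."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._entries[key] = (now + ttl, value)
        return value


def load_product_from_db() -> Dict[str, Any]:
    # Placeholder for an expensive database query.
    return {"id": 42, "name": "Widget"}


cache = TTLCache()
product = cache.get_or_load("product:42", ttl=30, loader=load_product_from_db)
print(product)
```

Because the key is chosen by the application, it can include the user or tenant ID, which is what makes this layer suitable for responses that shared HTTP caches must not store.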
Cache Invalidation
An old joke holds that the two hardest problems in computer science are naming things and cache invalidation. In practice, three strategies cover most HTTP API cases:
1. TTL-based expiry: set a short max-age and let entries expire naturally. Simple, but clients can see data that is up to max-age seconds out of date.
2. Cache-busting (URL versioning) — Embed a version or content hash in the URL (/api/v2/..., /static/app.a3f4c.js). The old URL simply stops being requested. Ideal for immutable assets:
Cache-Control: public, max-age=31536000, immutable
3. Surrogate-Key / Cache-Tag purging: CDNs such as Fastly (Surrogate-Key header) and Cloudflare (Cache-Tag header) support tagging cached objects with logical keys. A single API call purges all objects with a given tag. With a space-separated Surrogate-Key, for example:
Surrogate-Key: product-42 category-electronics
When product 42 is updated, purge the product-42 tag instantly — no need to know individual cached URLs.
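To show why tag-based purging does not require knowing individual URLs, here is a toy Python model of a cache that indexes objects by tag. It is not any CDN's actual API: the class `TaggedCache` and its methods are hypothetical, and real CDNs expose purging through their own HTTP endpoints.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """Toy cache that maps each tag to the set of URLs stored under it."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key: str) -> None:
        """Cache body under url and index it by each space-separated tag."""
        self._objects[url] = body
        for tag in surrogate_key.split():
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)

    def purge_tag(self, tag: str) -> int:
        """Evict every object carrying the tag; return how many were removed."""
        purged = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if self._objects.pop(url, None) is not None:
                purged += 1
        return purged


cdn = TaggedCache()
cdn.store("/api/products/42", b"{...}", "product-42 category-electronics")
cdn.store("/api/categories/electronics", b"[...]", "category-electronics product-42")
cdn.store("/api/products/7", b"{...}", "product-7 category-electronics")
print(cdn.purge_tag("product-42"))  # evicts the two objects that mention product 42
```

The product page and the category listing are both evicted by one purge, while the unrelated product stays cached.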
Measuring Cache Hit Rates
Cloudflare exposes CF-Cache-Status in response headers:
| Value | Meaning |
|---|---|
| `HIT` | Served from CDN cache |
| `MISS` | Fetched from origin, stored |
| `EXPIRED` | Stale, refetched from origin |
| `BYPASS` | Cache bypassed (e.g., `no-store`) |
| `DYNAMIC` | Not eligible for caching |
Target a hit rate above 80% for public endpoints. If you see mostly MISS or DYNAMIC, audit your Cache-Control headers — authenticated requests, cookies, and query strings often prevent caching without explicit configuration.
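Nginx can expose similar information when using proxy_cache: the $upstream_cache_status variable holds values such as HIT, MISS, EXPIRED, and BYPASS, and is commonly surfaced as a response header with add_header X-Cache-Status $upstream_cache_status. Track these metrics in your observability platform alongside TTFB to confirm caching impact.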
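A hit rate can be computed from logged status values with a few lines of Python. This is a sketch under assumptions: the function name `hit_rate` is hypothetical, the statuses are assumed to have been extracted from logs already, and excluding BYPASS and DYNAMIC from the denominator is one possible definition (set `cacheable_only=False` to count every request).

```python
from collections import Counter
from typing import Iterable


def hit_rate(statuses: Iterable[str], cacheable_only: bool = True) -> float:
    """Fraction of requests served from cache, given CF-Cache-Status values."""
    counts = Counter(s.upper() for s in statuses)
    if cacheable_only:
        # Ignore requests the CDN never tried to cache.
        for excluded in ("BYPASS", "DYNAMIC"):
            counts.pop(excluded, None)
    total = sum(counts.values())
    return counts["HIT"] / total if total else 0.0


sample = ["HIT"] * 8 + ["MISS", "EXPIRED"] + ["DYNAMIC"] * 10
print(hit_rate(sample))                        # 8 hits out of 10 cacheable requests
print(hit_rate(sample, cacheable_only=False))  # 8 hits out of 20 total requests
```

Reporting both numbers is useful: a large gap between them suggests that many requests are not eligible for caching at all, which points back to the Cache-Control audit described above.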