HTTP Caching Strategy for APIs

A practical guide to HTTP caching for API developers — from Cache-Control directives and ETags to CDN strategies and cache invalidation patterns.

Why Cache API Responses?

Caching is often the highest-leverage performance optimization for HTTP APIs. A response served from cache costs the origin no CPU time and no database queries, and it can be delivered in single-digit milliseconds from a nearby CDN edge node. Even modest cache hit rates noticeably reduce origin load and improve tail latency.

Beyond raw speed, caching has strategic value:

  • Cost reduction — fewer compute cycles and database reads
  • Resilience — stale cache can serve traffic during origin outages
  • Scalability — cache absorbs traffic spikes without horizontal scaling

Cache-Control Directives Deep Dive

The Cache-Control response header is the authoritative mechanism for instructing caches. It supersedes the older Expires header (when both are present, max-age takes precedence) and gives far more expressive control.

Freshness Directives

| Directive | Meaning |
|---|---|
| `max-age=N` | Response is fresh for N seconds after it was generated |
| `s-maxage=N` | TTL for shared caches (CDNs, proxies); overrides `max-age` there |
| `no-store` | Never store this response anywhere |
| `no-cache` | Store, but revalidate with origin before serving |
| `private` | Only a private cache such as the user's browser may store it (not CDNs) |
| `public` | Any cache (browser, CDN, proxy) may store it |

# Public API endpoint: CDN caches for 5 min, browser for 1 min
Cache-Control: public, s-maxage=300, max-age=60

# Authenticated endpoint: browser only, no CDN caching
Cache-Control: private, max-age=120

# Never cache (e.g. OTP endpoints)
Cache-Control: no-store
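
As a rough illustration of how an application might pick among these directives, the sketch below assembles the three header values shown above. The function name `cache_control` and its parameters are hypothetical and not part of any framework.

```python
from typing import Optional


def cache_control(
    *,
    public: bool,
    max_age: Optional[int] = None,
    s_maxage: Optional[int] = None,
    no_store: bool = False,
) -> str:
    """Build a Cache-Control header value (hypothetical helper)."""
    if no_store:
        # no-store makes every other directive irrelevant.
        return "no-store"
    parts = ["public" if public else "private"]
    # s-maxage only affects shared caches, so skip it for private responses.
    if public and s_maxage is not None:
        parts.append(f"s-maxage={s_maxage}")
    if max_age is not None:
        parts.append(f"max-age={max_age}")
    return ", ".join(parts)


print(cache_control(public=True, s_maxage=300, max_age=60))
print(cache_control(public=False, max_age=120))
print(cache_control(public=False, no_store=True))
```

Centralizing the decision in one helper keeps endpoint handlers from hand-writing header strings, which is where mismatched `private` and `s-maxage` combinations tend to creep in.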

Revalidation Directives

no-cache is the most misunderstood directive — it does NOT mean "do not cache". It means: *store the response but always revalidate* with the origin before serving it. The origin can respond 304 Not Modified to confirm the cached copy is still fresh, saving bandwidth.

ETags for Conditional Requests

An ETag is an opaque identifier for a specific version of a resource. The server generates it (usually a hash of the content or a version timestamp) and sends it in the response:

HTTP/1.1 200 OK
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
Cache-Control: no-cache

On subsequent requests, the client sends the ETag back in If-None-Match:

GET /api/products/42 HTTP/1.1
If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

If the resource has not changed, the server responds with 304 Not Modified — no body, minimal bandwidth. ETags are especially powerful for resources that are expensive to compute but change infrequently.
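
The exchange above can be sketched on the server side as follows. This is a minimal, framework-free illustration: `make_etag` and `conditional_get` are hypothetical names, the ETag is assumed to be a SHA-256 hash of the body, and the comma split on `If-None-Match` is a simplification.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (assumed hashing scheme)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def _opaque(tag: str) -> str:
    """Drop an optional weak prefix; If-None-Match uses weak comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            # The client's copy is current: send headers only, no body.
            return 304, headers, b""
    return 200, headers, body


status, headers, payload = conditional_get(b'{"id": 42}', None)
print(status, len(payload))
status, _, payload = conditional_get(b'{"id": 42}', headers["ETag"])
print(status, len(payload))
```

Note that hashing the body still requires producing it. To save computation as well as bandwidth, the ETag would need to come from something cheaper, such as a stored version number.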

Strong vs Weak ETags:

  • Strong ("abc123") — byte-for-byte identical
  • Weak (W/"abc123") — semantically equivalent but possibly different bytes

Use strong ETags for range requests; weak ETags are fine for standard cache revalidation.
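
The difference between the two comparison modes is easiest to see in code. The sketch below follows the strong and weak comparison rules as defined in the HTTP semantics specification; the function names are illustrative.

```python
from typing import Tuple


def _parse(etag: str) -> Tuple[bool, str]:
    """Split an ETag into (is_weak, opaque_tag)."""
    weak = etag.startswith("W/")
    return weak, (etag[2:] if weak else etag)


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and the opaque tags are equal."""
    weak_a, tag_a = _parse(a)
    weak_b, tag_b = _parse(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque tags are equal, weakness is ignored."""
    return _parse(a)[1] == _parse(b)[1]


for left, right in [('"abc123"', '"abc123"'), ('W/"abc123"', '"abc123"')]:
    print(left, right, strong_match(left, right), weak_match(left, right))
```

Range requests rely on strong comparison because stitching together byte ranges from two versions that are only "semantically equivalent" would corrupt the result.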

Stale-While-Revalidate

The stale-while-revalidate extension lets the cache serve a stale response immediately while fetching a fresh copy in the background:

Cache-Control: max-age=60, stale-while-revalidate=600

This means: *serve fresh for 60 seconds, then serve stale for up to 600 more seconds while revalidating asynchronously*. The user always gets a fast response; freshness catches up in the background. This pattern is ideal for content that is frequently read but tolerates brief staleness (e.g., product catalog, configuration endpoints).

Combined with stale-if-error=86400, you can continue serving stale responses for up to a day during origin outages.
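
To make the time windows concrete, here is a simplified model of the decision a cache makes for a stored response. It is a sketch under stated assumptions: the function name and the returned labels are invented for illustration, and real caches handle more cases (for example, a failed background revalidation).

```python
def cache_decision(
    age: int,
    max_age: int,
    stale_while_revalidate: int = 0,
    stale_if_error: int = 0,
    origin_ok: bool = True,
) -> str:
    """Classify how a cache may treat a stored response of the given age (seconds)."""
    if age < max_age:
        return "fresh"
    staleness = age - max_age
    if staleness < stale_while_revalidate:
        # Serve the stale copy now and refresh it asynchronously.
        return "serve-stale-and-revalidate"
    if not origin_ok and staleness < stale_if_error:
        # The origin is failing, so a stale copy beats an error.
        return "serve-stale-on-error"
    return "fetch-from-origin"


# Values from the text: max-age=60, stale-while-revalidate=600, stale-if-error=86400
for age, ok in [(30, True), (300, True), (700, True), (700, False)]:
    print(age, ok, cache_decision(age, 60, 600, 86400, origin_ok=ok))
```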

CDN vs Application Caching

| Layer | Where | Best For |
|---|---|---|
| Browser cache | User's device | Repeat visits, assets |
| CDN / edge cache | ISP-adjacent PoP | Public API responses, static assets |
| Reverse proxy (Nginx/Varnish) | Your data center | Reducing app server load |
| Application cache (Redis) | In-process / sidecar | Expensive DB queries, sessions |

CDN caching applies to public responses and respects s-maxage. Application-level caching (Redis, Memcached) is your backstop for private or no-store responses that cannot be cached by intermediaries.
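
The application layer typically follows a get-or-compute pattern. The sketch below uses an in-process dictionary as a stand-in for Redis or Memcached so that it runs without external services; the `TTLCache` class and the key format are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process TTL cache (illustrative stand-in for Redis/Memcached)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still within its TTL
        value = compute()  # e.g. an expensive database query
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(ttl_seconds=30)
product = cache.get_or_compute("product:42", lambda: {"id": 42, "name": "example"})
print(product)
```

Unlike an HTTP cache, this layer sits behind your authorization checks, which is why it can safely hold data that must never reach a shared intermediary.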

Cache Invalidation

The two hardest problems in computer science are, as the saying goes, naming things and cache invalidation. In practice you have three strategies, two from HTTP itself and one provided by CDNs:

1. TTL-based expiry — Set a short max-age and let stale entries expire naturally. This is the simplest option, but clients may see outdated data for up to max-age seconds after a change.

2. Cache-busting (URL versioning) — Embed a version or content hash in the URL (/api/v2/..., /static/app.a3f4c.js). The old URL simply stops being requested. Ideal for immutable assets:

Cache-Control: public, max-age=31536000, immutable

3. Surrogate-Key / Cache-Tag purging — CDNs like Cloudflare and Fastly support tagging cached objects with logical keys. A single API call purges all objects with a given tag:

Surrogate-Key: product-42 category-electronics

When product 42 is updated, purge the product-42 tag instantly — no need to know individual cached URLs.
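
The tag-based purge in strategy 3 can be modeled with a small in-memory index. This is a conceptual sketch only: `TaggedCache` is a hypothetical class, and it does not call any real CDN API, whose endpoints and authentication vary by provider.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """In-memory model of a cache that indexes objects by surrogate key."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key: str) -> None:
        """Cache a response; surrogate_key is the space-separated header value."""
        self._objects[url] = body
        for tag in surrogate_key.split():
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)

    def purge_tag(self, tag: str) -> int:
        """Evict every object carrying the tag; return how many were removed."""
        removed = 0
        for url in self._urls_by_tag.pop(tag, set()):
            # A URL may already be gone if another of its tags was purged earlier.
            if self._objects.pop(url, None) is not None:
                removed += 1
        return removed


cache = TaggedCache()
cache.store("/api/products/42", b"{}", "product-42 category-electronics")
cache.store("/api/categories/electronics", b"[]", "category-electronics")
print(cache.purge_tag("product-42"), cache.get("/api/products/42"))
```

The origin only needs to know the logical tag that changed; the mapping from tags to cached URLs lives entirely in the cache.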

Measuring Cache Hit Rates

Cloudflare exposes CF-Cache-Status in response headers:

| Value | Meaning |
|---|---|
| `HIT` | Served from CDN cache |
| `MISS` | Fetched from origin, stored |
| `EXPIRED` | Stale, refetched from origin |
| `BYPASS` | Cache bypassed (e.g., `no-store`) |
| `DYNAMIC` | Not eligible for caching |

Target a hit rate above 80% for public endpoints. If you see mostly MISS or DYNAMIC, audit your Cache-Control headers — authenticated requests, cookies, and query strings often prevent caching without explicit configuration.

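Nginx can expose similar information when using proxy_cache: a common convention is to add an X-Cache-Status response header populated from the $upstream_cache_status variable. Track these metrics in your observability platform alongside TTFB to confirm caching impact.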
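
A hit rate can be computed from sampled status values, for example ones collected from access logs. The sketch below is one possible definition: excluding `DYNAMIC` and `BYPASS` from the denominator is an assumption, chosen so that the figure reflects only responses that were eligible for caching.

```python
from collections import Counter
from typing import Iterable


def cache_hit_ratio(statuses: Iterable[str], eligible_only: bool = True) -> float:
    """Fraction of responses served from cache, given cache status values."""
    counts = Counter(s.strip().upper() for s in statuses)
    # Assumption: DYNAMIC and BYPASS were never cacheable, so they are excluded.
    excluded = {"DYNAMIC", "BYPASS"} if eligible_only else set()
    total = sum(n for status, n in counts.items() if status not in excluded)
    return counts["HIT"] / total if total else 0.0


sample = ["HIT", "HIT", "HIT", "MISS", "EXPIRED", "DYNAMIC"]
print(cache_hit_ratio(sample))
print(cache_hit_ratio(sample, eligible_only=False))
```

Reporting both figures side by side helps separate "our cacheable endpoints are missing" from "most of our traffic is not cacheable at all".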
