Why Connection Pooling Matters
Every new TCP+TLS connection to an HTTPS endpoint has a measurable overhead:
```
DNS resolution:     ~20-100ms
TCP handshake:      ~30-60ms   (1 RTT)
TLS 1.3 handshake:  ~30-60ms   (1 RTT; less with session resumption)
─────────────────────────────
Total cold start:   ~80-220ms before first byte
```
Connection pooling reuses established TCP+TLS connections for multiple HTTP requests, eliminating this overhead for all requests after the first.
HTTP/1.1 pooling: maintains multiple parallel connections per host (typically 6), because each connection can only serve one request at a time.
HTTP/2 pooling: a single connection can multiplex hundreds of concurrent requests via streams. Pool size for HTTP/2 is typically 1-2 connections per host.
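The payoff of reuse is easy to demonstrate with only the standard library: the sketch below starts a local HTTP/1.1 server that counts TCP accepts, then sends five keep-alive requests over a single `http.client` connection. `CountingServer` and `Handler` are illustrative names, not library APIs.

```python
# Stdlib-only sketch: count TCP accepts on a local server while one
# keep-alive connection serves several requests.
import http.server
import http.client
import threading

accepts = []

class CountingServer(http.server.ThreadingHTTPServer):
    def get_request(self):
        sock, addr = super().get_request()
        accepts.append(addr)  # one entry per accepted TCP connection
        return sock, addr

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

server = CountingServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
for _ in range(5):
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()  # drain the body so the connection can be reused
conn.close()
server.shutdown()

print(f"5 requests used {len(accepts)} TCP connection(s)")
```

All five requests ride the one socket, so the server accepts exactly one connection; a client that opened a fresh connection per request would produce five accepts.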
Pool Configuration
Maximum Connections
```python
# Python requests — HTTPAdapter controls pool settings
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=10,  # number of distinct host pools
    pool_maxsize=20,      # max connections per pool (per host)
    max_retries=3,
)
session.mount('https://', adapter)
session.mount('http://', adapter)
```
```go
// Go http.Client — Transport is the connection pool
import (
    "net/http"
    "time"
)

transport := &http.Transport{
    MaxIdleConns:        100, // total idle connections across all hosts
    MaxIdleConnsPerHost: 10,  // idle connections per host
    MaxConnsPerHost:     20,  // hard cap per host
    IdleConnTimeout:     90 * time.Second,
    TLSHandshakeTimeout: 10 * time.Second,
}
client := &http.Client{Transport: transport, Timeout: 30 * time.Second}
```
Idle Timeout
The idle timeout closes connections that have not been used for a specified duration. Set this shorter than the server's keepalive timeout — otherwise the server closes idle connections while your pool still holds references to them:
```python
# AWS ALB default idle timeout: 60 seconds.
# Set the client idle timeout to 55 seconds to stay safely under it.
# requests does not expose an idle timeout directly; use httpx instead:
import httpx

client = httpx.Client(
    limits=httpx.Limits(
        max_keepalive_connections=20,
        max_connections=100,
        keepalive_expiry=55.0,  # close idle connections after 55s
    )
)
```
Connection TTL
Even active connections should be periodically recycled to pick up DNS changes and rotate load balancer targets. Set a maximum connection age (TTL):
```go
// Go: close connections older than 5 minutes regardless of activity.
// http.Transport has no built-in maximum connection age; enforcing one
// requires a custom DialContext that tracks each connection's creation time.
transport := &http.Transport{
    MaxIdleConnsPerHost: 10,
}
```
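The same TTL idea can be sketched in Python without any custom dialing: rebuild the client wholesale once it passes a maximum age. `RecyclingPool` below is a hypothetical helper, not a library API; `factory` is any zero-argument callable returning a client with a `close()` method (for example `requests.Session`).

```python
# Sketch: enforce a connection TTL by replacing the whole client once it
# exceeds max_age seconds. The clock is injectable for testing.
import time

class RecyclingPool:
    def __init__(self, factory, max_age=300.0, clock=time.monotonic):
        self._factory = factory
        self._max_age = max_age
        self._clock = clock
        self._client = factory()
        self._born = clock()

    def client(self):
        """Return the current client, replacing it once it exceeds max_age."""
        if self._clock() - self._born > self._max_age:
            self._client.close()  # drops every pooled connection
            self._client = self._factory()
            self._born = self._clock()
        return self._client

# Hypothetical usage with requests:
#   pool = RecyclingPool(requests.Session, max_age=300)
#   pool.client().get("https://api.example.com/data")
```

Recycling the whole client is coarser than per-connection ageing, but it guarantees every connection re-resolves DNS at least once per TTL window.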
DNS and Pool Refresh
Long-lived connection pools are vulnerable to stale DNS: when a service scales horizontally or fails over to a new IP, your pool still holds connections to the old IPs.
```java
// Java: the JVM's InetAddress cache keeps successful lookups for 30s by
// default, and forever when a security manager is installed. Lower the TTL
// in $JAVA_HOME/conf/security/java.security, or programmatically:
import java.security.Security;

Security.setProperty("networkaddress.cache.ttl", "60");         // 60 seconds
Security.setProperty("networkaddress.cache.negative.ttl", "5"); // failed lookups
```
```go
// Go: DNS resolution happens per-connection (no DNS caching in the stdlib),
// so new connections always see fresh records — safe by default.
// If using a custom resolver, configure TTL alignment:
resolver := &net.Resolver{
    PreferGo: true, // use Go's pure-Go resolver
}
```
Best practices for DNS in connection pools:
- Set the pool's idle timeout at or below your upstream's DNS TTL, so recycled connections re-resolve DNS when they reconnect
- Typical DNS TTLs for load-balanced services are 30-60s; size idle timeouts accordingly
- For Kubernetes services, DNS TTL is typically 30s, so an idle timeout of ~25s keeps connections from outliving records
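Beyond timeouts, the pool can react to record changes directly. The hypothetical `DnsWatcher` below re-resolves a host and fires a callback (for example `session.close`) when the IP set changes; `default_resolve` is an assumption built on `socket.getaddrinfo`, and `check()` would be called from a periodic timer.

```python
# Sketch: flush the pool when the upstream's resolved IPs change.
# The resolve callable is injectable so the logic can be tested offline.
import socket

def default_resolve(host):
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

class DnsWatcher:
    """Calls on_change() whenever the host's resolved address set changes."""
    def __init__(self, host, on_change, resolve=default_resolve):
        self._host = host
        self._on_change = on_change
        self._resolve = resolve
        self._ips = resolve(host)

    def check(self):
        ips = self._resolve(self._host)
        if ips != self._ips:
            self._ips = ips
            self._on_change()  # e.g. session.close() drops stale connections
```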
Connection Leaks
Connection leaks occur when HTTP response bodies are not fully consumed or closed. In pooled clients, the connection cannot be returned to the pool until the response body is drained.
```python
# BAD: streamed response body may never be consumed; the connection
# stays checked out of the pool
response = session.get('https://api.example.com/data', stream=True)
if response.status_code == 200:
    data = response.json()
# If json() raises, the body is never drained and the connection
# cannot be returned to the pool

# GOOD: always use a context manager
with session.get('https://api.example.com/data', stream=True) as response:
    response.raise_for_status()
    data = response.json()
# The context manager closes the response, releasing the connection
# back to the pool even if an exception is raised
```
```go
// Go: the body MUST be drained and closed to return the connection to the pool
resp, err := client.Get("https://api.example.com/data")
if err != nil {
    return err
}
defer func() {
    io.Copy(io.Discard, resp.Body) // drain any unread bytes first
    resp.Body.Close()              // then close
}()
```
Detecting Leaks
Monitor your pool metrics in production:
```python
# httpx enforces pool limits but exposes no public utilization counter
import httpx

client = httpx.Client(limits=httpx.Limits(max_connections=100))
# Track in-flight requests at the application layer and log them periodically.
# If in-flight == max_connections while new requests queue, you have a leak
# or an undersized pool.
```
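A client-agnostic way to get those numbers is to count in-flight requests yourself. `InFlightGauge` below is a hypothetical helper, not a library API: wrap each request in `with gauge.track():` and alert when utilization stays pinned at 1.0.

```python
# Sketch: thread-safe in-flight request counter for leak detection.
import threading
import contextlib

class InFlightGauge:
    def __init__(self, limit):
        self.limit = limit      # should match the pool's max_connections
        self.in_flight = 0
        self.peak = 0
        self._lock = threading.Lock()

    @contextlib.contextmanager
    def track(self):
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        try:
            yield
        finally:
            with self._lock:
                self.in_flight -= 1

    def utilization(self):
        return self.in_flight / self.limit

# Hypothetical usage:
#   gauge = InFlightGauge(limit=100)
#   with gauge.track():
#       session.get("https://api.example.com/data")
```

A leaked connection never exits the `with` block's request path, so `in_flight` (and therefore utilization) ratchets upward over time instead of returning to zero between bursts.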
Language-Specific Defaults
| Language/Library | Default Pool Size | Idle Timeout | Notes |
|---|---|---|---|
| Python requests | 10 per host | Never | Configure HTTPAdapter |
| Python httpx | 100 total | 5s | Production-ready defaults |
| Go http.Client | 100 total, 2/host idle | 90s | Safe DNS behavior |
| Node.js (http) | Infinity | 5s | Set `maxSockets` |
| Java HttpClient | Implementation-dependent | 20min | Override DNS TTL |
Key Takeaways
- Reusing connections avoids the ~80-220ms of DNS, TCP, and TLS setup on every request after the first
- Set idle timeout shorter than the server's keepalive timeout to avoid ECONNRESET errors
- Always drain and close HTTP response bodies — especially on errors — to prevent pool exhaustion
- In Java, explicitly set `networkaddress.cache.ttl` to prevent stale DNS in long-lived pools
- Monitor pool utilization (`connections_in_use / max_connections`) in production