Why Connection Pooling Matters
Every new TCP+TLS connection to an HTTPS endpoint has a measurable overhead:
```
DNS resolution:     ~20-100ms
TCP handshake:      ~30-60ms   (1 RTT)
TLS 1.3 handshake:  ~30-60ms   (1 RTT; less with session resumption)
─────────────────────────────
Total cold start:   ~80-220ms before first byte
```
Connection pooling reuses established TCP+TLS connections for multiple HTTP requests, eliminating this overhead for all requests after the first.
HTTP/1.1 pooling: maintains multiple parallel connections per host (typically 6), because each connection can only serve one request at a time.
HTTP/2 pooling: a single connection can multiplex hundreds of concurrent requests via streams. Pool size for HTTP/2 is typically 1-2 connections per host.
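The payoff of reuse is easy to demonstrate with only the standard library: the sketch below starts a local HTTP/1.1 server that counts TCP accepts, then sends five keep-alive requests over a single `http.client` connection. `CountingServer` and `Handler` are illustrative names, not library APIs.

```python
# Stdlib-only sketch: count TCP accepts on a local server while one
# keep-alive connection serves several requests.
import http.server
import http.client
import threading

accepts = []

class CountingServer(http.server.ThreadingHTTPServer):
    def get_request(self):
        sock, addr = super().get_request()
        accepts.append(addr)  # one entry per accepted TCP connection
        return sock, addr

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

server = CountingServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
for _ in range(5):
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()  # drain the body so the connection can be reused
conn.close()
server.shutdown()

print(f"5 requests used {len(accepts)} TCP connection(s)")
```

All five requests ride the one socket, so the server accepts exactly one connection; a client that opened a fresh connection per request would produce five accepts.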
Pool Configuration
Maximum Connections
```python
# Python requests — HTTPAdapter controls pool settings
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=10,  # number of distinct host pools
    pool_maxsize=20,      # max connections per pool (per host)
    max_retries=3,
)
session.mount('https://', adapter)
session.mount('http://', adapter)
```
```go
// Go http.Client — Transport is the connection pool
import (
    "net/http"
    "time"
)

transport := &http.Transport{
    MaxIdleConns:        100, // total idle connections across all hosts
    MaxIdleConnsPerHost: 10,  // idle connections per host
    MaxConnsPerHost:     20,  // hard cap per host
    IdleConnTimeout:     90 * time.Second,
    TLSHandshakeTimeout: 10 * time.Second,
}
client := &http.Client{Transport: transport, Timeout: 30 * time.Second}
```
Idle Timeout
The idle timeout closes connections that have not been used for a specified duration. Set this shorter than the server's keepalive timeout — otherwise the server closes idle connections while your pool still holds references to them:
```python
# AWS ALB default idle timeout: 60 seconds.
# Set the client idle timeout to 55 seconds to stay safely under it.
# requests does not expose an idle timeout directly; use httpx instead:
import httpx

client = httpx.Client(
    limits=httpx.Limits(
        max_keepalive_connections=20,
        max_connections=100,
        keepalive_expiry=55.0,  # close idle connections after 55s
    )
)
```
Connection TTL
Even active connections should be periodically recycled to pick up DNS changes and rotate load balancer targets. Set a maximum connection age (TTL):
```go
// Go: close connections older than 5 minutes regardless of activity.
// http.Transport has no built-in maximum connection age; enforcing one
// requires a custom DialContext that tracks each connection's creation time.
transport := &http.Transport{
    MaxIdleConnsPerHost: 10,
}
```
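The same TTL idea can be sketched in Python without any custom dialing: rebuild the client wholesale once it passes a maximum age. `RecyclingPool` below is a hypothetical helper, not a library API; `factory` is any zero-argument callable returning a client with a `close()` method (for example `requests.Session`).

```python
# Sketch: enforce a connection TTL by replacing the whole client once it
# exceeds max_age seconds. The clock is injectable for testing.
import time

class RecyclingPool:
    def __init__(self, factory, max_age=300.0, clock=time.monotonic):
        self._factory = factory
        self._max_age = max_age
        self._clock = clock
        self._client = factory()
        self._born = clock()

    def client(self):
        """Return the current client, replacing it once it exceeds max_age."""
        if self._clock() - self._born > self._max_age:
            self._client.close()  # drops every pooled connection
            self._client = self._factory()
            self._born = self._clock()
        return self._client

# Hypothetical usage with requests:
#   pool = RecyclingPool(requests.Session, max_age=300)
#   pool.client().get("https://api.example.com/data")
```

Recycling the whole client is coarser than per-connection ageing, but it guarantees every connection re-resolves DNS at least once per TTL window.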
DNS and Pool Refresh
Long-lived connection pools are vulnerable to stale DNS: when a service scales horizontally or fails over to a new IP, your pool still holds connections to the old IPs.
```java
// Java: the JVM's InetAddress cache keeps successful lookups for 30s by
// default, and forever when a security manager is installed. Lower the TTL
// in $JAVA_HOME/conf/security/java.security, or programmatically:
import java.security.Security;

Security.setProperty("networkaddress.cache.ttl", "60");         // 60 seconds
Security.setProperty("networkaddress.cache.negative.ttl", "5"); // failed lookups
```
```go
// Go: DNS resolution happens per-connection (no DNS caching in the stdlib),
// so new connections always see fresh records — safe by default.
// If using a custom resolver, configure TTL alignment:
resolver := &net.Resolver{
    PreferGo: true, // use Go's pure-Go resolver
}
```
Best practices for DNS in connection pools:
- Set the pool's idle timeout at or below your upstream's DNS TTL, so recycled connections re-resolve DNS when they reconnect
- Typical DNS TTLs for load-balanced services are 30-60s; size idle timeouts accordingly
- For Kubernetes services, DNS TTL is typically 30s, so an idle timeout of ~25s keeps connections from outliving records
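Beyond timeouts, the pool can react to record changes directly. The hypothetical `DnsWatcher` below re-resolves a host and fires a callback (for example `session.close`) when the IP set changes; `default_resolve` is an assumption built on `socket.getaddrinfo`, and `check()` would be called from a periodic timer.

```python
# Sketch: flush the pool when the upstream's resolved IPs change.
# The resolve callable is injectable so the logic can be tested offline.
import socket

def default_resolve(host):
    infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

class DnsWatcher:
    """Calls on_change() whenever the host's resolved address set changes."""
    def __init__(self, host, on_change, resolve=default_resolve):
        self._host = host
        self._on_change = on_change
        self._resolve = resolve
        self._ips = resolve(host)

    def check(self):
        ips = self._resolve(self._host)
        if ips != self._ips:
            self._ips = ips
            self._on_change()  # e.g. session.close() drops stale connections
```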
Connection Leaks
Connection leaks occur when HTTP response bodies are not fully consumed or closed. In pooled clients, the connection cannot be returned to the pool until the response body is drained.
```python
# BAD: streamed response body may never be consumed; the connection
# stays checked out of the pool
response = session.get('https://api.example.com/data', stream=True)
if response.status_code == 200:
    data = response.json()
# If json() raises, the body is never drained and the connection
# cannot be returned to the pool

# GOOD: always use a context manager
with session.get('https://api.example.com/data', stream=True) as response:
    response.raise_for_status()
    data = response.json()
# The context manager closes the response, releasing the connection
# back to the pool even if an exception is raised
```
```go
// Go: the body MUST be drained and closed to return the connection to the pool
resp, err := client.Get("https://api.example.com/data")
if err != nil {
    return err
}
defer func() {
    io.Copy(io.Discard, resp.Body) // drain any unread bytes first
    resp.Body.Close()              // then close
}()
```
Detecting Leaks
Monitor your pool metrics in production:
```python
# httpx enforces pool limits but exposes no public utilization counter
import httpx

client = httpx.Client(limits=httpx.Limits(max_connections=100))
# Track in-flight requests at the application layer and log them periodically.
# If in-flight == max_connections while new requests queue, you have a leak
# or an undersized pool.
```
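A client-agnostic way to get those numbers is to count in-flight requests yourself. `InFlightGauge` below is a hypothetical helper, not a library API: wrap each request in `with gauge.track():` and alert when utilization stays pinned at 1.0.

```python
# Sketch: thread-safe in-flight request counter for leak detection.
import threading
import contextlib

class InFlightGauge:
    def __init__(self, limit):
        self.limit = limit      # should match the pool's max_connections
        self.in_flight = 0
        self.peak = 0
        self._lock = threading.Lock()

    @contextlib.contextmanager
    def track(self):
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        try:
            yield
        finally:
            with self._lock:
                self.in_flight -= 1

    def utilization(self):
        return self.in_flight / self.limit

# Hypothetical usage:
#   gauge = InFlightGauge(limit=100)
#   with gauge.track():
#       session.get("https://api.example.com/data")
```

A leaked connection never exits the `with` block's request path, so `in_flight` (and therefore utilization) ratchets upward over time instead of returning to zero between bursts.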
Language-Specific Defaults
| Language/Library | Default Pool Size | Idle Timeout | Notes |
|---|---|---|---|
| Python requests | 10 per host | Never | Configure HTTPAdapter |
| Python httpx | 100 total | 5s | Production-ready defaults |
| Go http.Client | 100 total, 2/host idle | 90s | Safe DNS behavior |
| Node.js (http) | Infinity | 5s | Set `maxSockets` |
| Java HttpClient | Implementation-dependent | 20min | Override DNS TTL |
Key Takeaways
- Reusing connections avoids the ~80-220ms of DNS, TCP, and TLS setup on every request after the first
- Set idle timeout shorter than the server's keepalive timeout to avoid ECONNRESET errors
- Always drain and close HTTP response bodies — especially on errors — to prevent pool exhaustion
- In Java, explicitly set `networkaddress.cache.ttl` to prevent stale DNS in long-lived pools
- Monitor pool utilization (`connections_in_use / max_connections`) in production