Why Health Checks Matter
Load balancers route traffic to backend instances only when those instances are healthy. A misconfigured or missing health check means the load balancer will happily send requests to a server whose database connection pool is exhausted, whose disk is full, or that is in the middle of shutting down.
Health checks are your primary mechanism for automatic failure isolation — they let the load balancer detect a failed instance and stop sending traffic to it before your users notice.
Health Check Types
TCP Connect Check
The simplest check: can the load balancer open a TCP connection to the backend port?
```
Load Balancer → TCP SYN     → Backend:8000
Backend       → TCP SYN-ACK → (pass)
```
TCP checks verify the process is running and listening, but nothing more. A Django app that has exhausted its database connection pool will still pass a TCP check on port 8000. Use TCP checks only as a last resort when HTTP health endpoints are not possible.
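What the balancer does for a TCP check can be sketched in a few lines of Python (illustrative only — real load balancers implement this natively; the function name and timeout are ours):

```python
import socket

def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the SYN / SYN-ACK handshake for us
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or unreachable
        return False
```

Note that a successful return proves only that *something* accepted the connection on that port, not that the application behind it can serve a request.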
HTTP Health Check
The standard production approach. The load balancer sends an HTTP GET request and expects a specific status code (almost always 200 OK):
```http
GET /healthz HTTP/1.1
Host: 127.0.0.1

HTTP/1.1 200 OK
Content-Type: application/json

{"status": "ok"}
```
AWS ALB configuration:
```json
{
  "HealthCheckProtocol": "HTTP",
  "HealthCheckPath": "/healthz",
  "HealthCheckIntervalSeconds": 10,
  "HealthyThresholdCount": 2,
  "UnhealthyThresholdCount": 3,
  "HealthCheckTimeoutSeconds": 5,
  "Matcher": {"HttpCode": "200"}
}
```
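The interval and threshold fields interact: an instance is marked unhealthy only after `UnhealthyThresholdCount` consecutive failures, and marked healthy again only after `HealthyThresholdCount` consecutive passes. A minimal sketch of that state machine (our own illustration, not ALB's actual implementation):

```python
def update_state(state, streak, passed,
                 healthy_threshold=2, unhealthy_threshold=3):
    """One health-check tick: return (new_state, new_streak).

    `streak` counts consecutive results that disagree with the current
    state; the state only flips once the streak reaches its threshold.
    """
    if (state == "healthy") == passed:
        return state, 0  # result agrees with current state; reset streak
    streak += 1
    if state == "healthy" and streak >= unhealthy_threshold:
        return "unhealthy", 0  # e.g. 3 consecutive failures
    if state == "unhealthy" and streak >= healthy_threshold:
        return "healthy", 0  # e.g. 2 consecutive passes
    return state, streak
```

With the configuration above, a healthy instance survives two failed checks but is removed on the third; this hysteresis prevents a single dropped request from flapping the instance in and out of rotation.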
gRPC Health Check
gRPC services implement the standard [gRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md). The load balancer calls `grpc.health.v1.Health/Check`:
```python
# Python gRPC health service
from grpc_health.v1 import health, health_pb2, health_pb2_grpc

health_servicer = health.HealthServicer()
health_pb2_grpc.add_HealthServicer_to_server(health_servicer, server)

# Mark service as serving
health_servicer.set(
    'mypackage.MyService',
    health_pb2.HealthCheckResponse.SERVING,
)

# Mark as not serving (triggers load balancer failover)
health_servicer.set(
    'mypackage.MyService',
    health_pb2.HealthCheckResponse.NOT_SERVING,
)
```
AWS ALB supports gRPC health checks natively when the target group protocol version is gRPC.
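A hedged sketch of the target-group settings involved (field names follow the ELBv2 `CreateTargetGroup` API; for gRPC target groups the matcher checks the gRPC status code rather than an HTTP code, where `"0"` means OK — verify the exact values against the current AWS documentation):

```json
{
  "Protocol": "HTTP",
  "ProtocolVersion": "GRPC",
  "Port": 50051,
  "HealthCheckPath": "/grpc.health.v1.Health/Check",
  "Matcher": {"GrpcCode": "0"}
}
```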
Shallow vs Deep Health Checks
This is the most important design decision in health check architecture.
Shallow Check — `/healthz` (Process Alive)
Returns 200 if the application process is running and can handle requests. Does not check database connectivity, cache availability, or downstream services.
```python
# Django view — shallow liveness check
from django.http import JsonResponse

def healthz(request):
    """Liveness probe — is this process alive?"""
    return JsonResponse({"status": "ok"})
```
Use shallow checks for liveness probes — the load balancer needs to know whether to restart/replace this instance, not whether its dependencies are healthy.
Deep Check — `/readyz` (Dependencies Ready)
Verifies all dependencies are reachable before marking the instance ready to serve traffic:
```python
# Django view — deep readiness check
from django.core.cache import cache
from django.db import connections
from django.http import JsonResponse

def readyz(request):
    """Readiness probe — are all dependencies available?"""
    checks = {}
    status = 200

    # Database check
    try:
        connections['default'].ensure_connection()
        checks['database'] = 'ok'
    except Exception as e:
        checks['database'] = str(e)
        status = 503

    # Cache check
    try:
        cache.set('health_check', '1', timeout=5)
        assert cache.get('health_check') == '1'
        checks['cache'] = 'ok'
    except Exception as e:
        checks['cache'] = str(e)
        status = 503

    return JsonResponse(
        {"status": "ok" if status == 200 else "degraded", "checks": checks},
        status=status,
    )
```
Kubernetes Probe Separation
Kubernetes distinguishes three probe types — use the right endpoint for each:
```yaml
containers:
  - name: app
    livenessProbe:
      httpGet:
        path: /healthz   # Shallow — restart if dead
        port: 8000
      initialDelaySeconds: 10
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /readyz    # Deep — remove from LB if not ready
        port: 8000
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 2
    startupProbe:
      httpGet:
        path: /healthz   # Allow slow startup before liveness kicks in
        port: 8000
      failureThreshold: 30
      periodSeconds: 10
```
Never point livenessProbe at a deep check. If your database goes down, a liveness probe returning 503 causes Kubernetes to restart all pods simultaneously, making an outage catastrophic instead of partial.
Response Design
Health check responses should be consistent and machine-parseable:
```jsonc
// Healthy — HTTP 200
{
  "status": "ok",
  "version": "v2026.2.14.1",
  "uptime_seconds": 3842,
  "checks": {
    "database": "ok",
    "cache": "ok",
    "queue": "ok"
  }
}
```

```jsonc
// Unhealthy — HTTP 503
{
  "status": "degraded",
  "checks": {
    "database": "connection refused: 127.0.0.1:5432",
    "cache": "ok",
    "queue": "ok"
  }
}
```
Return 200 for healthy and 503 for unhealthy. Some teams use 200 with a degraded body — but load balancers check the status code, not the body, so returning 200 for an unhealthy instance defeats the purpose.
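The mapping from per-dependency results to status code can live in one small pure function that both endpoints share (a sketch; the helper name is ours, not Django's):

```python
def build_health_response(checks: dict) -> tuple:
    """Collapse per-dependency results into (status_code, body).

    Any value other than 'ok' marks the whole instance degraded and
    forces a 503, so the load balancer removes it from rotation.
    """
    healthy = all(v == "ok" for v in checks.values())
    body = {"status": "ok" if healthy else "degraded", "checks": checks}
    return (200 if healthy else 503), body
```

Keeping this logic in one place guarantees the status code and the body never disagree, which is exactly the failure mode described above.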
Graceful Shutdown Signaling
When a deployment or scaling event terminates an instance, the health check is your mechanism to signal the load balancer to stop sending new traffic:
```python
import signal
import threading

from django.http import JsonResponse

# Global flag — set when shutdown begins
_shutting_down = threading.Event()

def handle_sigterm(signum, frame):
    _shutting_down.set()

signal.signal(signal.SIGTERM, handle_sigterm)

def readyz(request):
    if _shutting_down.is_set():
        # Tell load balancer we are draining — stop sending new requests
        return JsonResponse({"status": "shutting_down"}, status=503)
    # ... normal readiness checks
```
The sequence during a graceful shutdown:
- SIGTERM received → set `_shutting_down` flag
- `/readyz` starts returning 503
- Load balancer health check fails → stops routing new requests to this instance
- In-flight requests complete
- Process exits cleanly
AWS ALB's deregistration delay (default 300 seconds) provides an additional buffer: once deregistration begins the instance receives no new requests, but the load balancer keeps existing connections open for the duration of the delay so in-flight requests can complete.
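A back-of-envelope check on how long to keep the process alive after SIGTERM (a sketch that ignores check timeouts and jitter; the numbers mirror the ALB configuration shown earlier):

```python
def worst_case_detection_s(interval_s: int, unhealthy_threshold: int) -> int:
    """Seconds until the LB marks the instance unhealthy after /readyz flips."""
    return interval_s * unhealthy_threshold

# With HealthCheckIntervalSeconds=10 and UnhealthyThresholdCount=3, the
# load balancer may keep sending new requests for up to ~30 seconds after
# SIGTERM, so the process must stay alive at least that long before exiting.
```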