gRPC Error Handling Best Practices

gRPC Status vs HTTP Status

gRPC uses its own status code system, not HTTP codes. gRPC runs over HTTP/2, but the result of each call is reported as a google.rpc.Code value in the grpc-status trailer. The HTTP/2 status (normally 200 OK) only indicates that the transport layer succeeded; the actual RPC result is the gRPC status.

This is a common source of confusion when building gRPC-HTTP transcoding layers or REST-to-gRPC gateways.
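To make the distinction concrete, here is a minimal Python sketch of how a gateway might read the outcome of a finished call. It assumes the HTTP status and the trailers have already been decoded into an int and a dict with lowercase keys; the helper name `rpc_outcome` and that representation are illustrative and not part of any gRPC library. `grpc-status` and `grpc-message` are the trailer names used by the gRPC over HTTP/2 protocol. Real clients map specific non-200 HTTP statuses to specific gRPC codes and percent-decode `grpc-message`; this sketch collapses both for brevity.

```python
from typing import Mapping, Tuple

UNKNOWN = 2  # gRPC code used here whenever no trustworthy status is available


def rpc_outcome(http_status: int, trailers: Mapping[str, str]) -> Tuple[int, str]:
    """Return (grpc_code, message) for a finished call (hypothetical helper)."""
    if http_status != 200:
        # The transport itself failed, so there is no gRPC status to read.
        return UNKNOWN, f"unexpected HTTP status {http_status}"
    # A missing grpc-status trailer is treated as UNKNOWN in this sketch.
    code = int(trailers.get("grpc-status", str(UNKNOWN)))
    return code, trailers.get("grpc-message", "")


# HTTP says 200 OK, yet the RPC failed with NOT_FOUND (5).
print(rpc_outcome(200, {"grpc-status": "5", "grpc-message": "user 42 not found"}))
# prints (5, 'user 42 not found')
```

The point of the example is the first branch versus the second: a 200 response tells the gateway nothing about success until the trailer has been inspected.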

The 17 gRPC Status Codes

Code	Name	When to Use
0	`OK`	Successful
1	`CANCELLED`	Client cancelled the request
2	`UNKNOWN`	Unclassified server-side error
3	`INVALID_ARGUMENT`	Client sent bad data (like 400)
4	`DEADLINE_EXCEEDED`	Timeout expired (like 504)
5	`NOT_FOUND`	Resource not found (like 404)
6	`ALREADY_EXISTS`	Conflict (like 409)
7	`PERMISSION_DENIED`	Authenticated but not allowed (like 403)
8	`RESOURCE_EXHAUSTED`	Rate limited (like 429)
9	`FAILED_PRECONDITION`	System not in required state
10	`ABORTED`	Concurrency conflict (retry)
11	`OUT_OF_RANGE`	Operation attempted past the valid range (e.g. reading past end of file)
12	`UNIMPLEMENTED`	Method not supported (like 501)
13	`INTERNAL`	Server bug (like 500)
14	`UNAVAILABLE`	Server temporarily unavailable (like 503)
15	`DATA_LOSS`	Unrecoverable data corruption
16	`UNAUTHENTICATED`	No valid credentials (like 401)
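
For a REST-to-gRPC gateway, the HTTP hints in the table above can be turned into a lookup. The following is a sketch, not a library API: the `Code` enum and `to_http` helper are names invented here. Codes that carry an HTTP hint in the table use that hint; for the remaining codes the sketch assumes the HTTP mapping given in the comments of google/rpc/code.proto (for example 499 for CANCELLED).

```python
from enum import IntEnum


class Code(IntEnum):
    """The 17 gRPC status codes, numbered as in the table."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


# gRPC code -> HTTP status for responses leaving the gateway.
HTTP_STATUS = {
    Code.OK: 200,
    Code.CANCELLED: 499,            # assumption: from code.proto comments
    Code.UNKNOWN: 500,              # assumption: from code.proto comments
    Code.INVALID_ARGUMENT: 400,
    Code.DEADLINE_EXCEEDED: 504,
    Code.NOT_FOUND: 404,
    Code.ALREADY_EXISTS: 409,
    Code.PERMISSION_DENIED: 403,
    Code.RESOURCE_EXHAUSTED: 429,
    Code.FAILED_PRECONDITION: 400,  # assumption: from code.proto comments
    Code.ABORTED: 409,              # assumption: from code.proto comments
    Code.OUT_OF_RANGE: 400,         # assumption: from code.proto comments
    Code.UNIMPLEMENTED: 501,
    Code.INTERNAL: 500,
    Code.UNAVAILABLE: 503,
    Code.DATA_LOSS: 500,            # assumption: from code.proto comments
    Code.UNAUTHENTICATED: 401,
}


def to_http(code: int) -> int:
    """Translate a gRPC status code to an HTTP status, defaulting to 500."""
    try:
        return HTTP_STATUS[Code(code)]
    except ValueError:
        # Not one of the 17 defined codes.
        return 500
```

Note that the mapping is lossy: several gRPC codes share one HTTP status, so a gateway that needs to round-trip errors should also carry the original gRPC code in the response body or a header.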

Returning status codes in Python and Go:

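# Python (grpcio)
import grpc

# Generated stubs; the module names below are assumptions, use your own.
import user_pb2 as pb2
import user_pb2_grpc as pb2_grpc

class UserService(pb2_grpc.UserServiceServicer):
    def GetUser(self, request, context):
        # db is the application's data-access object (not shown)
        user = db.find(request.user_id)
        if not user:
            # abort() raises, so the RPC ends here with NOT_FOUND
            context.abort(
                grpc.StatusCode.NOT_FOUND,
                f'User {request.user_id} not found',
            )
        return pb2.UserResponse(user=user)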

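// Go (google.golang.org/grpc/status)
import (
    "context"
    "errors"

    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

func (s *UserServer) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
    user, err := s.db.Find(req.UserId)
    if errors.Is(err, ErrUserNotFound) { // ErrUserNotFound: assumed sentinel from the data layer
        return nil, status.Errorf(codes.NotFound, "user %v not found", req.UserId)
    }
    if err != nil {
        // Any other failure is a server-side problem, not a missing user
        return nil, status.Error(codes.Internal, "user lookup failed")
    }
    return user, nil
}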

Rich Error Details (google.rpc.Status)

The base gRPC status carries only a code and a message string. For richer error information, attach structured details via the google.rpc.Status proto:

import grpc
from google.protobuf import any_pb2
from google.rpc import status_pb2, error_details_pb2
from grpc_status import rpc_status

def GetUser(self, request, context):
    if request.user_id <= 0:
        # Build rich error with field violations
        bad_request = error_details_pb2.BadRequest()
        violation = bad_request.field_violations.add()
        violation.field = 'user_id'
        violation.description = 'user_id must be positive'

        # Status.details is a repeated google.protobuf.Any, so pack the detail first
        detail = any_pb2.Any()
        detail.Pack(bad_request)

        rich_status = status_pb2.Status(
            code=grpc.StatusCode.INVALID_ARGUMENT.value[0],
            message='Invalid request',
            details=[detail],
        )
        context.abort_with_status(rpc_status.to_status(rich_status))
    # ... normal handling continues here

Common detail types from google.rpc.error_details:

BadRequest — field-level validation errors
RetryInfo — tells client when to retry
QuotaFailure — which quota was exceeded
ErrorInfo — machine-readable error reason and domain
RequestInfo — request ID for support tracing

Error Propagation in Service Chains

In a microservice chain, propagate gRPC errors faithfully rather than wrapping them in generic INTERNAL errors:

func (s *OrderService) PlaceOrder(ctx context.Context, req *pb.OrderRequest) (*pb.Order, error) {
    // Call upstream inventory service
    _, err := s.inventoryClient.Reserve(ctx, &inventorypb.ReserveRequest{
        ItemId: req.ItemId,
        Qty:    req.Quantity,
    })
    if err != nil {
        // Propagate upstream gRPC status directly — don't wrap in INTERNAL
        return nil, err
    }
    order := s.db.CreateOrder(req)
    return order, nil
}

Deadline Propagation

In Go, gRPC deadlines propagate through context automatically: any call made with the incoming ctx inherits its deadline. Always pass ctx to downstream calls so that the overall request budget is respected:

func (s *Gateway) HandleRequest(ctx context.Context, req *pb.Request) (*pb.Response, error) {
    // ctx carries the original deadline — pass it through
    userRes, err := s.userService.Get(ctx, &userpb.GetRequest{Id: req.UserId})
    if err != nil {
        if status.Code(err) == codes.DeadlineExceeded {
            // Log and propagate — do not restart the deadline
            log.Warn("upstream deadline exceeded")
        }
        return nil, err
    }
    return &pb.Response{User: userRes.User}, nil
}
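
The budget arithmetic behind "do not restart the deadline" can be sketched in Python with the standard library alone. The `Deadline` and `FakeClock` classes below are hypothetical helpers written for this illustration, not gRPC APIs; the idea is that every hop works against one absolute expiry time, so each downstream call receives only the time that is left.

```python
import time


class Deadline:
    """One absolute expiry time shared by every hop of a request (hypothetical helper)."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self) -> bool:
        return self.remaining() == 0.0


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
deadline = Deadline(2.0, clock=clock)       # the caller allows 2 s in total
clock.now += 1.5                            # the first downstream call took 1.5 s
second_call_timeout = deadline.remaining()  # 0.5 s is left, not a fresh 2 s
```

Passing `second_call_timeout` as the timeout of the next downstream call keeps the whole chain inside the original budget; once `expired()` is true there is no point in issuing further calls.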

Retry Policies in gRPC

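Configure retry policies in the service config JSON, which is applied client-side. The service name must be fully qualified, so include the proto package if the service declares one: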

{
  "methodConfig": [{
    "name": [{"service": "UserService"}],
    "retryPolicy": {
      "maxAttempts": 3,
      "initialBackoff": "0.5s",
      "maxBackoff": "5s",
      "backoffMultiplier": 2.0,
      "retryableStatusCodes": ["UNAVAILABLE", "RESOURCE_EXHAUSTED"]
    }
  }]
}

Retry only on transient codes such as UNAVAILABLE and RESOURCE_EXHAUSTED. Never retry INVALID_ARGUMENT, NOT_FOUND, or PERMISSION_DENIED: these are deterministic failures that will not improve on retry.
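
To see what the policy above implies in practice, the following Python sketch parses the same JSON and works out the delay ceilings and the retry decision. It models the arithmetic only: the helper names (`seconds`, `backoff_ceilings`, `should_retry`) are invented for this illustration, and a real gRPC client applies the policy internally and adds random jitter to each delay.

```python
import json

SERVICE_CONFIG = """
{
  "methodConfig": [{
    "name": [{"service": "UserService"}],
    "retryPolicy": {
      "maxAttempts": 3,
      "initialBackoff": "0.5s",
      "maxBackoff": "5s",
      "backoffMultiplier": 2.0,
      "retryableStatusCodes": ["UNAVAILABLE", "RESOURCE_EXHAUSTED"]
    }
  }]
}
"""

RETRY_POLICY = json.loads(SERVICE_CONFIG)["methodConfig"][0]["retryPolicy"]


def seconds(duration: str) -> float:
    """Parse a service-config duration such as "0.5s" into seconds."""
    return float(duration.rstrip("s"))


def backoff_ceilings(policy: dict) -> list:
    """Upper bound of the wait before each retry, before jitter is applied."""
    initial = seconds(policy["initialBackoff"])
    cap = seconds(policy["maxBackoff"])
    multiplier = policy["backoffMultiplier"]
    retries = policy["maxAttempts"] - 1  # the first attempt is not a retry
    return [min(initial * multiplier ** n, cap) for n in range(retries)]


def should_retry(policy: dict, code_name: str, attempts_made: int) -> bool:
    """Retry only listed codes, and only while attempts remain."""
    return (
        code_name in policy["retryableStatusCodes"]
        and attempts_made < policy["maxAttempts"]
    )


print(backoff_ceilings(RETRY_POLICY))  # prints [0.5, 1.0]
```

With maxAttempts set to 3 there are at most two retries, so the 5s maxBackoff is never reached in this configuration; it would only matter with more attempts or a larger multiplier.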