gRPC Streaming vs REST Polling: Real-Time Data Delivery

When to use gRPC server streaming or bidirectional streaming vs REST polling or webhooks for delivering real-time updates in microservice architectures.

The Real-Time Spectrum

There is no single 'real-time' solution — there is a spectrum of approaches, each with different latency, complexity, and infrastructure requirements:

Approach                Latency           Client complexity      Server complexity
Short polling           High (interval)   Low                    Low
Long polling            Medium            Medium                 Medium
SSE                     Low               Low                    Low
WebSocket               Low               Medium                 Medium
gRPC server streaming   Low               Low (generated code)   Medium
gRPC bidirectional      Very low          Medium                 High

Short and Long Polling

Short polling — the client sends a request every N seconds. Simple, but wasteful: most responses contain no new data. At 1-second poll intervals with 100K clients, you serve 100K requests/second of mostly empty responses.

Long polling — the server holds the request open until new data arrives, then responds. The client immediately reconnects. Latency matches the event itself (sub-second), but each 'event' requires a full HTTP round trip:

Client                         Server
  |--- GET /events?after=42 --->|  (server holds open)
  |                             |  ... data arrives ...
  |<--- 200 {event: 43} --------|  (immediately respond)
  |--- GET /events?after=43 --->|  (client immediately reconnects)
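The loop above can be sketched as a minimal long-polling client. The `/events` endpoint, its `after` cursor parameter, and the JSON response shape (a list of events with integer `id` fields) are assumptions for illustration:

```python
import json
import urllib.request

def poll_once(base_url: str, after: int, timeout: float = 30.0) -> list[dict]:
    """Issue one long-poll request; the server holds it open until data arrives."""
    url = f'{base_url}/events?after={after}'
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read())

def next_cursor(events: list[dict], current: int) -> int:
    """Advance the cursor to the highest event id seen so far."""
    return max([e['id'] for e in events], default=current)

def run(base_url: str) -> None:
    cursor = 0
    while True:  # reconnect immediately after each response
        for event in (events := poll_once(base_url, after=cursor)):
            print(event)
        cursor = next_cursor(events, cursor)
```

The client timeout must exceed the server's hold-open window, or the client will abort requests the server is still intentionally holding.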

gRPC Server Streaming

gRPC server streaming is defined in a .proto file with the stream keyword on the response:

syntax = "proto3";

service MarketData {
  // Unary: client sends one request, server sends one response
  rpc GetQuote (QuoteRequest) returns (Quote);

  // Server streaming: client sends one request, server sends many
  rpc SubscribeQuotes (SubscribeRequest) returns (stream Quote);

  // Bidirectional: both sides stream
  rpc Trade (stream TradeOrder) returns (stream TradeConfirmation);
}

message SubscribeRequest { repeated string symbols = 1; }
message Quote {
  string symbol = 1;
  double bid = 2;
  double ask = 3;
  int64 timestamp_ms = 4;
}

The server-side implementation in Python (grpcio):

import time

import grpc
import market_data_pb2
import market_data_pb2_grpc
from collections.abc import Iterator

class MarketDataServicer(market_data_pb2_grpc.MarketDataServicer):
    def SubscribeQuotes(
        self,
        request: market_data_pb2.SubscribeRequest,
        context: grpc.ServicerContext,
    ) -> Iterator[market_data_pb2.Quote]:
        for symbol in request.symbols:
            subscribe_to_feed(symbol)

        try:
            while context.is_active():
                for symbol in request.symbols:
                    quote = get_latest_quote(symbol)
                    yield market_data_pb2.Quote(
                        symbol=symbol,
                        bid=quote.bid,
                        ask=quote.ask,
                        timestamp_ms=quote.timestamp_ms,
                    )
                time.sleep(0.1)  # 10Hz updates
        except grpc.RpcError:
            pass  # Client disconnected

Flow Control and Deadlines

gRPC's HTTP/2 foundation provides built-in flow control. If the client cannot consume messages fast enough, the server-side yield will block — preventing unbounded memory growth from buffered messages.
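This pull-based backpressure can be seen in miniature with a plain Python generator (an analogy, not gRPC itself): the producer only advances when the consumer requests the next value, so a slow consumer naturally throttles the producer instead of forcing it to buffer.

```python
def produce_quotes():
    """Producer: each yield suspends until the consumer requests the next value."""
    n = 0
    while True:
        n += 1
        yield n  # suspended here while the consumer is busy

gen = produce_quotes()
first_three = [next(gen) for _ in range(3)]  # producer ran exactly three times
```

In the gRPC servicer above, the HTTP/2 flow-control window plays the role of the consumer's `next()` call: once the window is exhausted, the servicer's `yield` blocks.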

Deadlines propagate through the call chain. If a client sets a 5-second deadline on a streaming call, cancellation propagates to all downstream RPCs the server makes, enabling clean resource cleanup:

# Client with a 30-second deadline on the stream
with grpc.insecure_channel('localhost:50051') as channel:
    stub = market_data_pb2_grpc.MarketDataStub(channel)
    request = market_data_pb2.SubscribeRequest(symbols=['BTC', 'ETH'])
    try:
        # Stream until the 30-second deadline fires
        for quote in stub.SubscribeQuotes(request, timeout=30):
            print(f'{quote.symbol}: {quote.bid}/{quote.ask}')
    except grpc.RpcError as err:
        if err.code() != grpc.StatusCode.DEADLINE_EXCEEDED:
            raise  # deadline expiry is expected here; anything else is a real error

gRPC Bidirectional Streaming

Bidirectional streaming enables both sides to send messages independently over the same HTTP/2 stream. This is ideal for:

Collaborative editing — clients stream cursor positions and operations; server streams merged state back.

Interactive AI inference — client streams audio chunks; server streams back transcription fragments as they are recognized.

Real-time sync — device telemetry streams up while configuration changes stream down on the same connection.

# Bidirectional streaming client
def generate_orders():
    for order in order_queue:
        yield market_data_pb2.TradeOrder(
            symbol=order.symbol,
            quantity=order.quantity,
            side=order.side,
        )

with grpc.insecure_channel('trading:50051') as channel:
    stub = market_data_pb2_grpc.MarketDataStub(channel)
    for confirmation in stub.Trade(generate_orders()):
        print(f'Order {confirmation.order_id}: {confirmation.status}')

REST Alternatives

Webhooks — Push Without Polling

For service-to-service communication, webhooks invert the connection: instead of the consumer polling the producer, the producer HTTP POSTs to the consumer when events occur. This eliminates polling entirely:

Producer                      Consumer
  |--- POST /webhooks/payment -->|  (event fires)
  |<--- 200 OK ------------------|  (acknowledge)

Webhooks work well for low-frequency, high-importance events (payment confirmed, order shipped). They are a poor fit for high-frequency data (sub-second market data), where every event costs a full HTTP request/response cycle and often a fresh connection.
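On the consumer side, a webhook endpoint is just an HTTP handler that acknowledges quickly. A minimal sketch using the standard library (the payload shape and the `type` field are assumptions for illustration):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_event(payload: dict) -> int:
    """Process one webhook delivery; return the HTTP status to acknowledge with."""
    if 'type' not in payload:
        return 400  # malformed delivery: reject so the producer can alert
    # ... enqueue for async processing; keep this path fast so the
    # producer's delivery worker is not blocked ...
    return 200

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length) or b'{}')
        self.send_response(handle_event(payload))
        self.end_headers()

# To serve: HTTPServer(('', 8080), WebhookHandler).serve_forever()
```

Real deployments typically also verify a signature header from the producer before trusting the payload.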

Conditional GET with ETag/304

For resources that change infrequently, conditional GET eliminates response bodies on cache hits:

GET /api/config HTTP/1.1
If-None-Match: "v42"

HTTP/1.1 304 Not Modified  (no body — config unchanged)

This is not truly real-time, but it reduces bandwidth from polling to near-zero when data is stable.

SSE for Browser Clients

When your consumer is a browser and data only flows server → client, SSE beats gRPC: browsers have native EventSource support, no code generation is required, and the connection survives through HTTP/2 proxies transparently.

gRPC-Web (the browser-compatible gRPC variant) requires a proxy (Envoy or grpc-web server) to translate between browser HTTP/1.1 and gRPC's HTTP/2 framing. This adds operational complexity that SSE avoids.

Architecture Decision Guide

Use gRPC streaming when:

  • Both client and server are non-browser services (microservices, native apps)
  • You need strongly-typed streaming with protobuf schemas
  • Latency must be sub-100ms and data volume is high
  • You need deadline propagation across a call chain
  • Bidirectional streaming is required

Use SSE when:

  • The consumer is a browser
  • Data flows only server → client
  • You want zero client-side code generation
  • Built-in reconnection with Last-Event-ID matters

Use REST polling when:

  • Events are infrequent (once per minute or less)
  • Infrastructure does not support persistent connections (serverless)
  • Simplicity outweighs latency requirements

Use webhooks when:

  • You are integrating with third-party services that push events
  • Events are asynchronous and the consumer controls its own endpoint
  • You need delivery guarantees with retry logic on the producer side
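Producer-side delivery guarantees usually mean retrying failed deliveries with capped exponential backoff before handing the event to a dead-letter queue. A sketch under those assumptions (the attempt count, base delay, and cap are illustrative):

```python
import time
import urllib.error
import urllib.request

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at 60s."""
    return [min(base * (2 ** i), cap) for i in range(attempts)]

def deliver_with_retries(url: str, body: bytes, attempts: int = 5) -> bool:
    """POST one event; retry on failure, give up after the last attempt."""
    for delay in backoff_delays(attempts):
        try:
            req = urllib.request.Request(
                url, data=body, headers={'Content-Type': 'application/json'})
            with urllib.request.urlopen(req, timeout=10) as resp:
                if 200 <= resp.status < 300:
                    return True  # consumer acknowledged
        except urllib.error.URLError:
            pass  # network error or non-2xx: fall through and retry
        time.sleep(delay)
    return False  # exhausted retries: route to a dead-letter queue
```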
