Microservices Communication — REST, gRPC, Events, and Service Mesh Patterns
Microservices communication patterns define how independent services exchange data — synchronously via REST or gRPC, or asynchronously through events and message queues. Choosing the right pattern directly impacts system resilience, latency, scalability, and coupling. This guide covers the full spectrum from direct HTTP calls to service mesh sidecars.
Why Communication Patterns Matter
In a monolith, function calls stay in-process: they are fast and, for practical purposes, reliable. In microservices, a call between services involves network I/O, serialization, and added latency, and it can fail partway through. Netflix handles 2+ billion API edge requests daily through a mix of synchronous and asynchronous communication. A badly chosen pattern creates a “distributed monolith”: services that are as tightly coupled as a monolith but also pay network overhead. At DodaTech, microservices communication patterns orchestrate the backend for Doda Browser cloud sync and DodaZIP collaboration features.
Communication Architecture
graph TB
Client[Client] --> GW[API Gateway]
GW -->|REST| US[User Service]
GW -->|gRPC| OS[Order Service]
GW -->|Async Events| Queue[Message Queue]
Queue --> IS[Inventory Service]
Queue --> PS[Payment Service]
Queue --> NS[Notification Service]
subgraph Mesh[Service Mesh]
US ---|mTLS| SP1[Sidecar Proxy]
OS ---|mTLS| SP2[Sidecar Proxy]
SP1 <-->|mTLS| SP2
end
style GW fill:#e67e22,color:#fff
style Queue fill:#3498db,color:#fff
style Mesh fill:#9b59b6,color:#fff
Synchronous Protocols
REST over HTTP
The simplest approach: services expose HTTP endpoints and call each other using standard HTTP methods. Benefits: simple, well understood, and usable from any language. Drawbacks: the caller blocks until the callee responds, both services must be available at the same time, and latency adds up across call chains.
# Order service calling payment service via REST
import asyncio

import httpx


async def process_order(order: dict) -> dict:
    # An explicit timeout keeps a slow dependency from blocking the caller indefinitely
    async with httpx.AsyncClient(timeout=5.0) as client:
        # Charge the customer first; raise_for_status() turns 4xx/5xx into exceptions
        payment_resp = await client.post(
            "http://payment-service:8001/charge",
            json={"order_id": order["id"], "amount": order["total"]},
        )
        payment_resp.raise_for_status()

        # Reserve stock only after the payment has succeeded
        inventory_resp = await client.post(
            "http://inventory-service:8002/reserve",
            json={"order_id": order["id"], "items": order["items"]},
        )
        inventory_resp.raise_for_status()

    return {"status": "confirmed", "payment_id": payment_resp.json()["id"]}


result = asyncio.run(process_order({"id": "ORD-123", "total": 99.99, "items": ["widget"]}))
print(f"Order result: {result}")

gRPC
High-performance RPC using Protocol Buffers and HTTP/2. Supports streaming, bi-directional communication, and strict typing. Used for internal service-to-service communication where performance matters.
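// order.proto
syntax = "proto3";

service OrderService {
  rpc CreateOrder (CreateOrderRequest) returns (OrderResponse);
  rpc StreamOrderUpdates (OrderFilter) returns (stream OrderEvent);  // server streaming
}
message CreateOrderRequest {
  string user_id = 1;
  repeated Item items = 2;
  double total = 3;
}
message OrderResponse {
  string order_id = 1;
  string status = 2;
}
// Item fields mirror the Python client that follows
message Item {
  string product_id = 1;
  int32 qty = 2;
  double price = 3;
}
// Assumed minimal shapes for illustration; replace with your own fields
message OrderFilter { string user_id = 1; }
message OrderEvent { string order_id = 1; string status = 2; }

# gRPC client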
import grpc
import order_pb2        # generated by protoc from order.proto
import order_pb2_grpc

# Plaintext channel for local development; use grpc.secure_channel with TLS credentials in production
with grpc.insecure_channel("order-service:50051") as channel:
    stub = order_pb2_grpc.OrderServiceStub(channel)
    response = stub.CreateOrder(
        order_pb2.CreateOrderRequest(
            user_id="USR-42",
            items=[order_pb2.Item(product_id="widget", qty=2, price=9.99)],
            total=19.98,
        ),
        timeout=5.0,  # per-call deadline in seconds
    )
    print(f"Order created: {response.order_id}, status: {response.status}")

Asynchronous Communication
Events and messages decouple services. The producer publishes an event without knowing who consumes it. This enables independent scaling, failure isolation, and new subscribers without modifying producers.
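Before bringing in a broker, the decoupling described above can be illustrated in-process. The following is a minimal sketch, not a broker implementation: `EventBus` is a hypothetical helper, delivery is synchronous and in-memory, and nothing is persisted. It only shows that the publishing code stays the same when a new subscriber is added.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Toy in-process stand-in for a message broker (hypothetical helper)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to every subscriber and return how many were notified."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
received: list[str] = []

bus.subscribe("order_events", lambda e: received.append(f"inventory:{e['order_id']}"))
bus.subscribe("order_events", lambda e: received.append(f"payment:{e['order_id']}"))
bus.publish("order_events", {"type": "OrderPlaced", "order_id": "ORD-123"})

# A new consumer is added later; the publish call itself does not change.
bus.subscribe("order_events", lambda e: received.append(f"notification:{e['order_id']}"))
bus.publish("order_events", {"type": "OrderPlaced", "order_id": "ORD-124"})

print(received)
```

A real broker adds what this sketch leaves out: durable storage, asynchronous delivery, and consumers running in separate processes.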
# Event-driven communication with Kafka (kafka-python client)
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["kafka:9092"],
    # Serialize each event dict to UTF-8 encoded JSON bytes
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish the event; the producer has no knowledge of its consumers
producer.send("order_events", {
    "type": "OrderPlaced",
    "order_id": "ORD-123",
    "user_id": "USR-42",
    "total": 99.99,
})
producer.flush()  # block until buffered messages have been sent
print("OrderPlaced event published: inventory, payment, and notification will react independently")

API Gateway Patterns
An API gateway is a single entry point for all clients. It handles routing, authentication, rate limiting, and response aggregation.
Gateway routing — clients call the gateway, which routes to internal services. The gateway knows the service topology; clients don’t.
Gateway aggregation — the gateway calls multiple services and combines responses. Useful for dashboards that need data from user + order + recommendation services.
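Gateway routing, as described above, can be sketched as a prefix lookup. This is an illustrative sketch only: the path prefixes in `ROUTES` and the helper `resolve_upstream` are assumptions, and a production gateway would add authentication, rate limiting, and health-aware load balancing on top.

```python
from typing import Optional

# Hypothetical routing table: public path prefix -> internal service base URL
ROUTES = {
    "/users": "http://user-service",
    "/orders": "http://order-service",
    "/recs": "http://recommendation-service",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the internal URL for a client path, or None if no route matches."""
    # Check longer prefixes first so that more specific routes win.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix] + path
    return None


print(resolve_upstream("/users/42"))
print(resolve_upstream("/unknown"))
```

Clients only ever see the public paths; the mapping to internal hosts can change without any client being redeployed.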
# FastAPI gateway with aggregation
import asyncio

import httpx
from fastapi import FastAPI

app = FastAPI()


@app.get("/dashboard/{user_id}")
async def get_dashboard(user_id: str):
    async with httpx.AsyncClient(timeout=5.0) as client:
        # Fan out to the three services concurrently; gather awaits all responses
        user, orders, recs = await asyncio.gather(
            client.get(f"http://user-service/users/{user_id}"),
            client.get("http://order-service/orders", params={"user_id": user_id}),
            client.get("http://recommendation-service/recs", params={"user_id": user_id}),
        )
    return {
        "user": user.json(),
        "recent_orders": orders.json(),
        "recommendations": recs.json(),
    }

Circuit Breaker with Retry and Exponential Backoff
When a downstream service fails, the circuit breaker trips and subsequent calls fail fast. After a cooldown period, the breaker allows a probe request. If it succeeds, the circuit closes.
import asyncio
import random
import time


class CircuitBreaker:
    def __init__(self, threshold: int = 5, recovery: float = 30.0):
        self.threshold = threshold  # consecutive failures before the circuit opens
        self.recovery = recovery    # cooldown in seconds before a probe is allowed
        self.failures = 0
        self.last_fail = 0.0
        self.state = "closed"

    async def call(self, func, *args, **kwargs):
        if self.state == "open":
            if time.monotonic() - self.last_fail > self.recovery:
                self.state = "half-open"  # let one probe request through
            else:
                raise Exception("Circuit breaker open: request rejected")
        try:
            result = await func(*args, **kwargs)
        except Exception:
            self.failures += 1
            self.last_fail = time.monotonic()
            if self.failures >= self.threshold:
                self.state = "open"
            raise
        # Any success closes the circuit and resets the failure count
        self.state = "closed"
        self.failures = 0
        return result


# Retry with exponential backoff and jitter
async def call_with_retry(func, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            return await func()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            wait = (2 ** attempt) + random.uniform(0, 1)  # 1s, 2s, 4s, ... plus jitter
            print(f"Attempt {attempt + 1} failed, retrying in {wait:.1f}s...")
            await asyncio.sleep(wait)


async def unreliable_service():
    raise Exception("Downstream timeout")


async def main():
    cb = CircuitBreaker(threshold=3, recovery=10)
    # The circuit opens after three failures, so calls 4 to 6 fail fast
    for i in range(6):
        try:
            await cb.call(unreliable_service)
        except Exception as e:
            print(f"Attempt {i + 1}: {e}")


asyncio.run(main())

Expected output:
Attempt 1: Downstream timeout
Attempt 2: Downstream timeout
Attempt 3: Downstream timeout
Attempt 4: Circuit breaker open: request rejected
Attempt 5: Circuit breaker open: request rejected
Attempt 6: Circuit breaker open: request rejected

Service Mesh (Istio, Linkerd)
A service mesh offloads communication concerns (retries, circuit breaking, mTLS, observability) to a sidecar proxy. The application code only contains business logic.
Istio uses Envoy proxies injected alongside each service pod. It provides:
- mTLS — automatic encryption between all services
- Traffic splitting — canary deployments, A/B testing
- Observability — metrics, traces, logs per service call
- Circuit breaking — configurable via CRDs
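Conceptually, the traffic splitting listed above is a weighted choice made per request. The sketch below is not how Istio or Envoy implement it; `pick_subset` is a hypothetical helper and the 90/10 weights are an example. It only shows that weights translate into an approximate share of requests.

```python
import random
from collections import Counter


def pick_subset(weights: dict[str, int], rng: random.Random) -> str:
    """Choose a destination subset with probability proportional to its weight."""
    subsets = list(weights)
    return rng.choices(subsets, weights=[weights[s] for s in subsets], k=1)[0]


rng = random.Random(42)            # fixed seed so repeated runs behave the same
weights = {"v1": 90, "v2": 10}     # example canary split
counts = Counter(pick_subset(weights, rng) for _ in range(10_000))

print(f"v1 share: {counts['v1'] / 10_000:.1%}, v2 share: {counts['v2'] / 10_000:.1%}")
```

Because the choice is random per request, the observed split only approaches the configured weights over many requests; a handful of calls can look quite different.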
# Istio VirtualService for traffic splitting
# The v1 and v2 subsets must be defined in a DestinationRule for order-service
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10

Common Errors
Synchronous call chains: Service A calls B calls C calls D. A failure anywhere in the chain cascades. If D is slow, all upstream services hold connections. Use async communication for long chains.
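To make the cascade concrete, here is a back-of-the-envelope sketch. It assumes the services fail independently and that each hop has 99.9% availability and 50 ms latency; both figures are hypothetical.

```python
from math import prod


def chain_availability(availabilities: list[float]) -> float:
    """Every hop must succeed, so independent availabilities multiply."""
    return prod(availabilities)


def chain_latency_ms(latencies_ms: list[float]) -> float:
    """Sequential calls add their latencies."""
    return sum(latencies_ms)


hops = 4  # A -> B -> C -> D
availability = chain_availability([0.999] * hops)
latency_ms = chain_latency_ms([50.0] * hops)

print(f"Chain availability: {availability:.4%}")
print(f"Chain latency: {latency_ms:.0f} ms")
```

Under these assumptions the four-hop chain is available about 99.6% of the time, which is roughly four times the downtime of any single service.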
No timeout on HTTP calls: Many HTTP clients default to long timeouts, and some to none at all. A slow downstream service can then tie up every connection in the pool. Always set explicit, short timeouts (e.g., 5 seconds) sized to the latency the caller can tolerate.
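The effect of a deadline can be sketched with the standard library alone. In this minimal example `slow_downstream` simulates a hung service and `call_with_deadline` is a hypothetical wrapper; the 50 ms deadline is chosen only to keep the demo fast.

```python
import asyncio


async def slow_downstream() -> str:
    await asyncio.sleep(2.0)  # simulates a service that takes far too long
    return "late response"


async def call_with_deadline(coro_func, timeout_s: float):
    """Await coro_func(), but give up and return None once the deadline passes."""
    try:
        return await asyncio.wait_for(coro_func(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # the pending call is cancelled, freeing its resources


result = asyncio.run(call_with_deadline(slow_downstream, timeout_s=0.05))
print(f"Result: {result}")
```

With an HTTP client such as httpx, the same idea is usually expressed by passing a `timeout` value when creating the client or making the request.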
Shared database across services: Multiple services reading/writing the same database creates tight coupling. Each service should own its data and expose an API.
Ignoring idempotency: Retries (from retry policies or client timeouts) can send duplicate requests. Every mutating endpoint should be idempotent or accept an idempotency key.
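One way to handle duplicates is to have the client send a unique key with each logical operation and have the server remember the result for that key. The sketch below is illustrative: `PaymentService` is a hypothetical in-memory handler, and a real service would store keys durably and expire them.

```python
import uuid


class PaymentService:
    """Toy in-memory payment handler (hypothetical; not durable)."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, order_id: str, amount: float) -> dict:
        # A repeated key means a retry: return the stored result, do not charge again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"payment_id": f"PAY-{self.charges_made}", "order_id": order_id, "amount": amount}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated once by the client, reused on every retry

first = service.charge(key, "ORD-123", 99.99)
retry = service.charge(key, "ORD-123", 99.99)  # e.g. after a client-side timeout

print(first == retry, service.charges_made)
```

The retry returns the original payment and the customer is charged once, even though the request arrived twice.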
No bulkhead pattern: A failure in one service shouldn’t exhaust resources needed by others. Use separate connection pools or thread pools per downstream service.
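A simple bulkhead can be sketched as a concurrency limit per downstream service. In this example `Bulkhead` is a hypothetical helper built on `asyncio.Semaphore`, and excess calls wait for a free slot; rejecting them immediately is an equally valid design.

```python
import asyncio


class Bulkhead:
    """Caps concurrent calls to one downstream service (hypothetical helper)."""

    def __init__(self, max_concurrent: int) -> None:
        self._sem = asyncio.Semaphore(max_concurrent)
        self.active = 0
        self.peak = 0  # highest concurrency observed, kept for the demo

    async def run(self, coro_func):
        async with self._sem:  # waits here when all slots are taken
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await coro_func()
            finally:
                self.active -= 1


async def slow_call() -> str:
    await asyncio.sleep(0.05)  # simulates a slow downstream request
    return "ok"


async def main() -> tuple[int, list[str]]:
    bulkhead = Bulkhead(max_concurrent=2)  # create one instance per downstream
    results = await asyncio.gather(*(bulkhead.run(slow_call) for _ in range(6)))
    return bulkhead.peak, list(results)


peak, results = asyncio.run(main())
print(f"Peak concurrency: {peak}, completed: {len(results)}")
```

Giving each downstream service its own limit means a slow dependency can only saturate its own slots, not the capacity reserved for the others.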
Over-reliance on service mesh: A service mesh adds latency (sidecar overhead) and operational complexity. Not every service needs it. Use it selectively for high-value resilience patterns.
No fallback in API gateway: If a downstream service fails, the gateway should return a degraded response (e.g., cached data, partial results) instead of failing completely.
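A degraded response can be sketched with `asyncio.gather(..., return_exceptions=True)`. The fetch functions here are hypothetical stand-ins for HTTP calls, and treating recommendations as optional while user data is required is an assumption made for the example.

```python
import asyncio


async def fetch_user(user_id: str) -> dict:
    return {"id": user_id, "name": "Example User"}  # stand-in for a user-service call


async def fetch_recommendations(user_id: str) -> list:
    raise ConnectionError("recommendation-service unavailable")  # simulated outage


async def get_dashboard(user_id: str) -> dict:
    # return_exceptions=True hands failures back as values instead of raising
    user, recs = await asyncio.gather(
        fetch_user(user_id),
        fetch_recommendations(user_id),
        return_exceptions=True,
    )
    if isinstance(user, Exception):
        raise user  # core data is missing, so the request fails
    degraded = isinstance(recs, Exception)
    return {
        "user": user,
        "recommendations": [] if degraded else recs,  # empty list as the fallback
        "degraded": degraded,
    }


dashboard = asyncio.run(get_dashboard("USR-42"))
print(dashboard)
```

The `degraded` flag lets clients tell a real empty result from a fallback, so they can hide the section or retry later.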
Mini Project
Build a resilient API gateway with circuit breaker and retry:
import asyncio
import random
import time


class RetryCircuitBreaker:
    def __init__(self, retries=3, fail_threshold=3, recovery=10):
        self.retries = retries                # attempts per call
        self.fail_threshold = fail_threshold  # exhausted calls before the circuit opens
        self.recovery = recovery              # cooldown in seconds
        self.failures = 0
        self.last_fail = 0.0
        self.state = "closed"

    async def call(self, func):
        if self.state == "open":
            if time.monotonic() - self.last_fail > self.recovery:
                self.state = "half-open"
            else:
                raise Exception("Service unavailable (circuit open)")
        for attempt in range(self.retries):
            try:
                result = await func()
            except Exception:
                if attempt < self.retries - 1:
                    # Back off exponentially with jitter before the next attempt
                    await asyncio.sleep(2 ** attempt + random.uniform(0, 0.5))
                    continue
                # Retries exhausted: count one failure against the circuit
                self.failures += 1
                self.last_fail = time.monotonic()
                if self.failures >= self.fail_threshold:
                    self.state = "open"
                raise
            # Success closes the circuit and resets the failure count
            self.state = "closed"
            self.failures = 0
            return result


async def backend():
    # Stand-in for a downstream call; replace with a real request
    await asyncio.sleep(0.1)
    return "Success"


gateway = RetryCircuitBreaker()


async def test():
    for i in range(8):
        try:
            result = await gateway.call(backend)
            print(f"Call {i + 1}: {result}")
        except Exception as e:
            print(f"Call {i + 1}: {e}")


asyncio.run(test())