Observability — Explained with Examples
Observability is the ability to understand a system’s internal state by examining its external outputs. In practice, observability is built on three pillars: metrics (quantitative measurements over time), logs (timestamped records of discrete events), and traces (end-to-end request flows across distributed services).
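As a minimal sketch of how the three signals relate, the Python below handles one hypothetical request and records a metric sample, a structured log line, and a trace span that all share a `trace_id`. Every name here (`Telemetry`, `handle_request`, the field names) is an illustrative assumption and not any particular library's API. A real system would typically send these signals through a client library and a backend instead of in-memory lists.

```python
import json
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class Telemetry:
    """In-memory stand-in for a telemetry backend (hypothetical, for illustration)."""
    metrics: list = field(default_factory=list)  # (name, value, labels) tuples
    logs: list = field(default_factory=list)     # JSON-encoded log lines
    spans: list = field(default_factory=list)    # dicts describing timed operations


def handle_request(telemetry: Telemetry, route: str) -> str:
    """Pretend to serve a request and record one signal of each kind."""
    trace_id = uuid.uuid4().hex  # shared ID that lets the three signals be correlated
    start = time.perf_counter()

    status = 200  # placeholder for the real request handling

    duration = time.perf_counter() - start

    # Metric: a numeric measurement that a backend would aggregate over time.
    telemetry.metrics.append(
        ("http_request_duration_seconds", duration, {"route": route, "status": str(status)})
    )
    # Log: a timestamped record of one discrete event.
    telemetry.logs.append(json.dumps({
        "ts": time.time(), "level": "info", "msg": "request served",
        "route": route, "status": status, "trace_id": trace_id,
    }))
    # Trace span: one timed step in the request's end-to-end path.
    telemetry.spans.append(
        {"trace_id": trace_id, "name": f"GET {route}", "duration_s": duration}
    )
    return trace_id
```

Calling `handle_request(Telemetry(), "/checkout")` returns the trace ID. Because the log line and the span carry that same ID, a slow request seen in the metric can be followed into its log events and its trace.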
A system is observable when you can answer novel questions about its behavior without shipping new code. This goes beyond traditional monitoring, which typically alerts on known failure modes. Observability lets you explore unknowns — “why is latency spiking for Android users in Europe?” — by drilling into correlated metrics, logs, and traces.
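The kind of exploratory question above can be sketched as a group-by over request records that carry dimensions such as platform and region. The sample records, field names, and the `p99_by_segment` helper are hypothetical. A real system would run a comparable query against its metrics or tracing backend, not over an in-memory list.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """Nearest-rank percentile: the smallest value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def p99_by_segment(requests, keys=("platform", "region")):
    """Group latencies by the given dimensions and return the p99 of each group."""
    groups = defaultdict(list)
    for req in requests:
        groups[tuple(req[k] for k in keys)].append(req["latency_s"])
    return {segment: percentile(latencies, 0.99) for segment, latencies in groups.items()}


# Hypothetical sample data: Android traffic in Europe is slow, everything else is fast.
requests = (
    [{"platform": "android", "region": "eu", "latency_s": 2.5}] * 50
    + [{"platform": "android", "region": "us", "latency_s": 0.2}] * 50
    + [{"platform": "ios", "region": "eu", "latency_s": 0.2}] * 50
)

# Find the segment with the worst tail latency.
slowest = max(p99_by_segment(requests).items(), key=lambda item: item[1])
print(slowest)  # → (('android', 'eu'), 2.5)
```

The point of the sketch is that nothing about "Android" or "Europe" was decided in advance. Any dimension recorded with the data can become a grouping key later, which is what makes the question answerable without shipping new code.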
Real-world analogy. Observability is like the cockpit of a modern aircraft. You don’t just have a “check engine” light (monitoring): you have real-time altitude, fuel flow, engine temperature, GPS position, and a black box recording every switch flip. When something feels off, you have the data to diagnose the root cause.
Example query (PromQL):
# 99th percentile request latency over the last 5 minutes
histogram_quantile(0.99,
  rate(http_request_duration_seconds_bucket[5m])
)
Related terms: Prometheus, SLA, SLO, SLI, Chaos Engineering, Microservices, Orchestration
Related tutorial: Prometheus & Grafana Setup