Logs, Metrics, Traces: What Each One Is Actually For
May 9, 2026
Observability gets sold as three pillars because the three answer fundamentally different questions, and using the wrong one for a question is either useless or ruinously expensive.
Logs are the record of what happened. Each line is an event, usually with a timestamp, a severity, and a payload. The strength of logs is detail. When you need to know what specifically went wrong on this one request, on this one machine, at this one moment, logs are the only artifact rich enough to answer. A stack trace lives in logs. A SQL query that exploded lives in logs. The bytes of a malformed message live in logs. Logs are the place you go after you already know roughly where to look.
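To make "an event with a timestamp, a severity, and a payload" concrete, here is a minimal sketch using Python's standard logging module. The service name, the `payload` attribute, and the field names `request_id` and `host` are hypothetical, chosen only for illustration.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object: timestamp, severity, payload."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "severity": record.levelname,
            "message": record.getMessage(),
            # Per-request identifiers are fine here: logs are paid for per event.
            **getattr(record, "payload", {}),
        }
        if record.exc_info:
            # The stack trace travels with the event that produced it.
            event["stack"] = self.formatException(record.exc_info)
        return json.dumps(event)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def emit_example() -> dict:
    """Log one failure and return the parsed event that was written."""
    buf = io.StringIO()
    log = build_logger(buf)
    try:
        int("not-a-number")  # stand-in for a malformed message
    except ValueError:
        log.error(
            "failed to parse quantity",
            exc_info=True,
            extra={"payload": {"request_id": "req-123", "host": "web-7"}},
        )
    return json.loads(buf.getvalue())
```

Calling `emit_example()` returns a single event carrying the severity, the request and host identifiers, and the full stack trace, which is the level of detail only logs can hold.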
Metrics are aggregated numbers over time. Request rate. Error rate. p99 latency. CPU. Queue depth. A metric is cheap because it is pre-aggregated: you do not store every event, you store counters and histograms sampled at fixed intervals. That is why metrics power dashboards and alerts. They are how you notice that something changed. The headline use is "is anything wrong right now," and they are very good at that.
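Pre-aggregation can be sketched in a few lines. This is an illustration, not how any particular metrics library is implemented: the class name, the bucket bounds, and the `record` helper are all assumptions. Note that storage stays constant no matter how many requests are observed.

```python
from bisect import bisect_left
from collections import Counter


class LatencyHistogram:
    """Fixed-bucket histogram: one count per bucket, never the raw events."""

    def __init__(self, upper_bounds_ms):
        self.bounds = sorted(upper_bounds_ms)
        # One slot per bound, plus a final overflow slot for larger values.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0

    def observe(self, latency_ms: float) -> None:
        # bisect_left finds the first bucket whose upper bound is >= the value.
        self.counts[bisect_left(self.bounds, latency_ms)] += 1
        self.total += 1

    def quantile_upper_bound(self, q: float) -> float:
        """Upper bound of the bucket containing the q-quantile (an estimate)."""
        if self.total == 0:
            raise ValueError("no observations recorded")
        target = q * self.total
        seen = 0
        for i, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return self.bounds[i] if i < len(self.bounds) else float("inf")
        return float("inf")


def record(requests: Counter, hist: LatencyHistogram,
           route: str, status: int, latency_ms: float) -> None:
    """Fold one request into the aggregates; the event itself is discarded."""
    requests[(route, status)] += 1  # low-cardinality labels only
    hist.observe(latency_ms)
```

After a million calls to `record`, the histogram still holds one integer per bucket. The trade-off is that the p99 it reports is only as precise as the bucket boundaries.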
Traces are the end-to-end view of one request as it crosses services. A trace stitches together the spans that each service produced while handling the request, with parent and child relationships preserved. The unique thing traces give you is the time breakdown across the call graph. Metrics say "the API got slow." Traces say "the API got slow because the recommendation service spent 800 ms waiting on its Redis lookup, and Redis was queued behind a slow command from the search indexer."
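The time breakdown falls out of the parent and child relationships. Below is a minimal sketch with a hypothetical `Span` type and made-up timings loosely modeled on the example above. It assumes each span's children run sequentially and do not overlap, which real traces do not guarantee.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_time_by_service(spans: List[Span]) -> Dict[str, int]:
    """Time each service spent in its own work, excluding time inside child spans.

    Assumes the children of a span do not overlap each other.
    """
    child_total: Dict[str, int] = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0) + s.duration_ms
    result: Dict[str, int] = {}
    for s in spans:
        own = s.duration_ms - child_total.get(s.span_id, 0)
        result[s.service] = result.get(s.service, 0) + own
    return result


# Illustrative trace: API -> recommendations -> Redis lookup.
trace = [
    Span("a", None, "api", 0, 900),
    Span("b", "a", "recommendations", 20, 880),
    Span("c", "b", "redis", 50, 850),
]
breakdown = self_time_by_service(trace)
slowest = max(breakdown, key=breakdown.get)
```

Here the API span is 900 ms long, but almost all of it is time spent waiting on children. The breakdown attributes 800 ms to the Redis span, which is the answer a latency metric alone cannot give.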
The investigation flow that emerges is the same in every incident worth its name. A metric fires the alert. A trace narrows the search to the slow or failing service. Logs from that service explain why.
The trap underneath all of this is cardinality. Metrics stay cheap only while the set of label values stays small. The moment you put user_id or request_id or any unbounded identifier on a metric, every unique value spawns a new time series, multiplied by every combination of the other labels. A few hundred extra series is fine. A few million will melt your monitoring backend and your storage bill. The same data belongs in logs or in traces, where high-cardinality fields are normal and you pay per event rather than per series.
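The arithmetic is worth seeing once. As a rough sketch, the worst-case number of series for one metric is the product of the distinct values of each label. The label names and counts below are invented for illustration.

```python
from math import prod
from typing import Dict


def worst_case_series(distinct_values_per_label: Dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of label cardinalities."""
    return prod(distinct_values_per_label.values())


# Hypothetical label set with bounded values.
bounded = {"route": 20, "status": 5, "region": 4}

# The same metric after someone adds an unbounded identifier.
with_user_id = {**bounded, "user_id": 100_000}

safe = worst_case_series(bounded)           # 20 * 5 * 4
exploded = worst_case_series(with_user_id)  # 20 * 5 * 4 * 100_000
```

With these made-up numbers, one extra label takes the metric from 400 series to 40,000,000. Real systems rarely hit the full product because not every combination occurs, but the growth is multiplicative either way.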
Logs are expensive at high volume. Metrics are cheap until cardinality explodes. Traces sit in the middle and almost always require sampling, since storing every span from every request is rarely worth the price. Pick the pillar that matches the question and keep the cardinality discipline. That is what separates a useful observability stack from an unaffordable one.
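One common way to sample traces is to decide once, from the trace ID, whether to keep the whole trace. The sketch below is an assumption about how that could look, not a description of any specific tracing system. The function name and the choice of SHA-256 are illustrative.

```python
import hashlib


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministic sampling: the same trace_id always gets the same decision.

    Every service can evaluate this independently and still agree, so a kept
    trace is kept in full rather than as a scattering of orphaned spans.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate
```

A fixed rate like this is the simplest option. It keeps costs predictable, at the price of dropping most of the rare slow or failing requests along with the boring ones.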
Metrics tell you something is wrong, traces tell you where, logs tell you why. The expensive mistake is treating any of them like the others, especially shoving high-cardinality data into metrics.
Originally posted on LinkedIn.