kube-prometheus-stack issue scraping metrics

Introduction

When kube-prometheus-stack is not scraping metrics, the problem is usually not Prometheus “in general.” It is almost always a mismatch between service discovery, target labels, endpoint configuration, or network access. The fastest way to debug it is to follow the path Prometheus uses to discover a target and verify each link in that chain.

Start with the Discovery Objects

In kube-prometheus-stack, Prometheus usually discovers targets through ServiceMonitor, PodMonitor, or Probe resources created by the Prometheus Operator. If the monitor resource does not match the intended service or pod, Prometheus never even gets a target to scrape.

A common ServiceMonitor looks like this:

yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
spec:
  # Labels that the target Service must carry
  selector:
    matchLabels:
      app: my-app
  # Namespaces in which to look for matching Services
  namespaceSelector:
    matchNames:
      - app
  endpoints:
    # Name of the Service port, not its number
    - port: metrics
      path: /metrics
      interval: 30s

This only works if the target Service in namespace app has label app: my-app and exposes a port actually named metrics.

Verify the Service and Port Names

The target service must match what the monitor resource expects:

yaml

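apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: app
  labels:
    # Matched by the ServiceMonitor's spec.selector
    app: my-app
spec:
  # Selects the pods that back this Service
  selector:
    app: my-app
  ports:
    # The ServiceMonitor refers to this port by name
    - name: metrics
      port: 8080
      targetPort: 8080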

One of the most common failures is using port: metrics in the ServiceMonitor when the service port has no name or uses a different name. In that case the selector matches the Service, but no endpoint port matches the requested name, so the target is dropped and never shows up as a scrape target.
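
The Service is only the middle of the chain: Prometheus ends up scraping the pods behind it, so the Service's selector has to match running pods that listen on the target port. A minimal sketch of the pod side, assuming a Deployment named my-app and a placeholder image (both are illustrative assumptions, not part of the setup above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app                          # must match the Service's spec.selector
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:latest   # placeholder image (assumption)
          ports:
            - name: metrics
              containerPort: 8080            # must match the Service's targetPort
```

If the Service has no ready endpoints because its selector matches no pods, there is nothing for Prometheus to scrape even when the ServiceMonitor itself is correct.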

Check Prometheus Targets First

Before changing manifests blindly, inspect Prometheus itself. The Targets page tells you whether:

the target was discovered
the target is down
the scrape path or scheme is wrong
the response failed because of TLS, auth, or timeout

If the target is missing from the Targets page entirely, focus on labels, namespaces, and monitor selection. If the target is present but down, focus on the actual endpoint.

Namespace and Selector Rules Matter

Prometheus in kube-prometheus-stack does not automatically scrape every ServiceMonitor in every namespace. The Prometheus resource can restrict which monitor objects it watches through namespace selectors and label selectors.

That means you may have a correct ServiceMonitor, but Prometheus still ignores it because the monitor object lives in the wrong namespace or lacks the expected release label.

This is a frequent Helm-related issue. By default, the kube-prometheus-stack chart configures Prometheus to select only monitor objects that carry a release label matching the Helm release name. If the monitor was created manually without that label, Prometheus never selects it, and no error is reported anywhere.
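
As an illustration, the selection rules live on the Prometheus custom resource, and a monitor has to carry whatever labels they ask for. The sketch below assumes a Helm release named kube-prometheus-stack and a Prometheus object named after it. Both names are assumptions, so compare them with what your own release actually created.

```yaml
# Excerpt of the Prometheus resource (names are assumptions)
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: kube-prometheus-stack-prometheus
  namespace: monitoring
spec:
  # Only ServiceMonitors with this label are picked up
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack
  # Empty selector: look for ServiceMonitors in all namespaces
  serviceMonitorNamespaceSelector: {}
---
# ServiceMonitor labeled so that the selector above matches it
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
  labels:
    release: kube-prometheus-stack
spec:
  selector:
    matchLabels:
      app: my-app
  namespaceSelector:
    matchNames:
      - app
  endpoints:
    - port: metrics
```

An alternative, if you would rather have Prometheus select every ServiceMonitor regardless of labels, is to set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues to false in the chart values. Whether that is appropriate depends on how many Prometheus instances share the cluster.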

Verify Reachability from the Prometheus Pod

If the target is discovered but down, test the endpoint from inside the cluster:

bash

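# The operator runs Prometheus as a StatefulSet; adjust the name to match your release
kubectl -n monitoring exec statefulset/prometheus-kube-prometheus-prometheus -c prometheus -- \
  wget -qO- http://my-app.app.svc:8080/metrics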

If that command fails, Prometheus is not the root issue. The service, endpoint, network policy, or application metrics endpoint itself is broken or unreachable.

This check is especially useful when:

a network policy blocks Prometheus
the metrics endpoint is on a different port than expected
the app serves metrics under a custom path
TLS or auth settings are incomplete
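
For the network policy case, one possible fix is an ingress rule that admits traffic from the monitoring namespace to the application's metrics port. This is a sketch under assumptions: the policy name is hypothetical, the application pods are labeled app: my-app, and the cluster sets the standard kubernetes.io/metadata.name label on namespaces.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape   # hypothetical name
  namespace: app
spec:
  # Applies to the application pods
  podSelector:
    matchLabels:
      app: my-app
  policyTypes:
    - Ingress
  ingress:
    # Allow any pod in the monitoring namespace to reach the metrics port
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8080
```

Network policies are additive, so this rule sits alongside any existing allow rules. Be aware that if no policy selected these pods before, applying this one restricts their ingress to the traffic it allows.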

Match the Endpoint Details Exactly

Even when discovery works, scrape configuration still needs to match reality. Pay attention to:

path
scheme
port
TLS settings
bearer token or basic auth configuration

An app exposing metrics at /actuator/prometheus will never answer a default /metrics scrape correctly. A service using HTTPS will fail if the monitor says HTTP. Prometheus errors usually become obvious once you compare the endpoint config with the application’s real metrics URL.
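
To make this concrete, here is a sketch of a ServiceMonitor endpoint for an application that serves metrics over HTTPS at /actuator/prometheus behind basic auth. The Secret name my-app-metrics-auth and its keys are hypothetical, and insecureSkipVerify is shown only as a debugging shortcut, not as a recommended setting.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: my-app
  namespaceSelector:
    matchNames:
      - app
  endpoints:
    - port: metrics
      path: /actuator/prometheus      # the app's real metrics path
      scheme: https                   # must match what the app serves
      interval: 30s
      tlsConfig:
        insecureSkipVerify: true      # debugging only; configure a CA for real use
      basicAuth:
        username:
          name: my-app-metrics-auth   # hypothetical Secret
          key: username
        password:
          name: my-app-metrics-auth
          key: password
```

The Secret referenced under basicAuth is expected to live in the same namespace as the ServiceMonitor, which here is monitoring rather than app.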

Common Pitfalls

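The most common mistake is thinking a ServiceMonitor selects pods directly. It selects Services, and Prometheus then scrapes the pod endpoints behind them, so service labels and port names matter just as much as pod labels. Selecting pods directly is the job of a PodMonitor.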
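
When there is no Service in front of the pods, a PodMonitor is the resource that selects them directly. A minimal sketch, assuming the pods in namespace app are labeled app: my-app and declare a container port named metrics:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: my-app
  namespace: monitoring
spec:
  # Labels on the pods themselves, not on a Service
  selector:
    matchLabels:
      app: my-app
  namespaceSelector:
    matchNames:
      - app
  podMetricsEndpoints:
    - port: metrics     # name of the container port
      path: /metrics
      interval: 30s
```

PodMonitors are filtered by the Prometheus resource in the same way as ServiceMonitors, through its podMonitorSelector and podMonitorNamespaceSelector fields, so the same label and namespace checks apply.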

Another pitfall is forgetting namespace and label restrictions on the Prometheus resource itself. A perfectly valid monitor object can still be ignored if Prometheus is not configured to watch it.

It is also easy to debug only YAML and never test the metrics endpoint from inside the cluster. If the endpoint is not reachable from the Prometheus pod, no manifest tweak will fix scraping.

Finally, do not assume the standard path is always /metrics. Many applications expose metrics at custom paths, especially Spring Boot or other framework-specific setups.

Summary

Start by checking whether the target is discovered at all in the Prometheus Targets page.
Ensure ServiceMonitor labels, namespaces, and port names match the actual service.
Remember that Prometheus may filter monitor resources by namespace or release label.
Test the endpoint from inside the Prometheus pod to separate discovery problems from reachability problems.
Match the real metrics path, scheme, and auth settings exactly instead of relying on defaults.