Introduction
Prometheus scrapes metrics from HTTP endpoints exposed by your application pods. To monitor custom metrics, your application must expose a /metrics endpoint in the Prometheus text format, and Prometheus must be configured to discover and scrape that endpoint. The typical setup has three parts: add a metrics client library to your application, expose the metrics port in the pod spec, and configure discovery with either pod annotations plus a matching scrape config or a ServiceMonitor.
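To make "Prometheus text format" concrete, the following stdlib-only sketch renders one counter the way it would appear in a /metrics response body. The helper names render_counter and escape_label_value are hypothetical and exist only for illustration; a real application should rely on a client library, as in Step 1.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter family (hypothetical helper, not a library API).

    `samples` maps a tuple of (label_name, label_value) pairs to a number.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


body = render_counter(
    "app_requests_total",
    "Total requests",
    {(("method", "POST"), ("endpoint", "/api/orders"), ("status", "201")): 3},
)
print(body)
```

Each sample is one line: the metric name, its labels in braces, and the current value. The HELP and TYPE comment lines describe the metric family.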
Step 1: Instrument Your Application
Add a Prometheus client library to expose custom metrics over HTTP.
Python (Flask)
from flask import Flask
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST

app = Flask(__name__)

# Define custom metrics
REQUEST_COUNT = Counter(
    'app_requests_total',
    'Total requests',
    ['method', 'endpoint', 'status']
)

REQUEST_LATENCY = Histogram(
    'app_request_duration_seconds',
    'Request latency in seconds',
    ['endpoint']
)

@app.route('/api/orders', methods=['POST'])
def create_order():
    # Time the handler body and record the duration in the histogram
    with REQUEST_LATENCY.labels(endpoint='/api/orders').time():
        # ... process order ...
        # Label the counter with the status code the handler returns
        REQUEST_COUNT.labels(method='POST', endpoint='/api/orders', status='201').inc()
        return '{"status": "created"}', 201

@app.route('/metrics')
def metrics():
    # Serve all registered metrics in the Prometheus text format
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}
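A histogram is exported as cumulative bucket counters (the _bucket series with an le label) plus a sum and a count. The sketch below illustrates that bookkeeping with the standard library only. SimpleHistogram and its bucket bounds are hypothetical and are not how prometheus_client is implemented internally.

```python
import math


class SimpleHistogram:
    """Illustrative histogram (hypothetical class, not the prometheus_client one)."""

    def __init__(self, bounds):
        # Upper bounds of the buckets; +Inf always catches everything else.
        self.bounds = sorted(bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)  # per-bucket, not yet cumulative
        self.total = 0.0  # exported as <name>_sum
        self.n = 0        # exported as <name>_count

    def observe(self, value):
        # Count the observation in the first bucket whose bound is >= value.
        for i, upper in enumerate(self.bounds):
            if value <= upper:
                self.counts[i] += 1
                break
        self.total += value
        self.n += 1

    def cumulative(self):
        # Exported buckets are cumulative: each includes all smaller buckets.
        out, running = [], 0
        for upper, count in zip(self.bounds, self.counts):
            running += count
            out.append((upper, running))
        return out


h = SimpleHistogram([0.1, 0.5, 1.0])
for latency in (0.05, 0.3, 0.3, 2.0):
    h.observe(latency)
print(h.cumulative())
```

Because the buckets are cumulative, the +Inf bucket always equals the total observation count, which is what lets PromQL estimate quantiles from them later.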
Go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var requestCount = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "app_requests_total",
		Help: "Total requests",
	},
	[]string{"method", "endpoint", "status"},
)

func init() {
	// Register the counter with the default registry served by promhttp.Handler()
	prometheus.MustRegister(requestCount)
}

func main() {
	http.Handle("/metrics", promhttp.Handler())
	http.HandleFunc("/api/orders", func(w http.ResponseWriter, r *http.Request) {
		// Label the counter with the status code the handler returns
		requestCount.WithLabelValues("POST", "/api/orders", "201").Inc()
		w.WriteHeader(http.StatusCreated)
	})
	// ListenAndServe only returns on failure, so surface the error
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Step 2: Expose the Metrics Port in Kubernetes
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: order-service
          image: order-service:latest
          ports:
            # Metrics are served by the same HTTP server, so one port entry is enough.
            # Add a second named port only if metrics listen on a different port.
            - name: http
              containerPort: 8080
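The prometheus.io/* annotations are a convention, not a built-in Prometheus feature. They take effect only when the Prometheus scrape configuration contains relabel rules that read them (Option A). The Prometheus Operator ignores them and relies on ServiceMonitor or PodMonitor resources instead (Option B).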
Step 3: Configure Prometheus to Scrape the Pods
Option A: Annotations-Based Discovery (prometheus.yml)
# prometheus.yml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only scrape pods with prometheus.io/scrape=true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use custom port from annotation: keep the pod IP from __address__
      # and swap in the annotated port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      # Use custom path from annotation
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
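The port rule has to combine the pod IP from __address__ with the annotated port. A commonly used form takes both labels as sources, with regex ([^:]+)(?::\d+)?;(\d+) and replacement $1:$2. The sketch below mimics how a replace action behaves: source label values are joined with ";", the regex must match the whole joined string, and the replacement is written to the target label. The function relabel_replace and the sample pod labels are hypothetical and only approximate Prometheus behaviour.

```python
import re

PORT_ANNOTATION = "__meta_kubernetes_pod_annotation_prometheus_io_port"


def relabel_replace(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Approximate a Prometheus `replace` relabel action (hypothetical helper)."""
    joined = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, joined)  # the regex is anchored at both ends
    if match is None:
        return dict(labels)  # no match: labels are left unchanged
    # Translate $1 / ${1} references into Python's \g<1> form.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# Hypothetical discovered pod: IP with a default port, plus the port annotation.
pod = {"__address__": "10.0.0.5:9090", PORT_ANNOTATION: "8080"}

fixed = relabel_replace(
    pod, ["__address__", PORT_ANNOTATION],
    r"([^:]+)(?::\d+)?;(\d+)", "$1:$2", "__address__",
)
print(fixed["__address__"])

# Using only the port annotation as the source drops the pod IP.
bare = relabel_replace(pod, [PORT_ANNOTATION], r"(.+)", "${1}", "__address__")
print(bare["__address__"])
```

The first call yields a scrapeable host:port target. The second shows why the pod address must be among the source labels: with only the annotation as input, the target becomes the bare port.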
Option B: ServiceMonitor (Prometheus Operator)
If you use the Prometheus Operator (kube-prometheus-stack), create a Service that exposes the metrics port and a ServiceMonitor that selects that Service:
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  selector:
    app: order-service
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
---
# servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: order-service
  labels:
    release: prometheus  # Must match Prometheus operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: order-service
  endpoints:
    - port: metrics
      path: /metrics
      interval: 15s
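Two label matches have to succeed for this setup to work: the ServiceMonitor's selector must match the Service's labels, and the Prometheus resource's serviceMonitorSelector must match the ServiceMonitor's own labels. The sketch below illustrates matchLabels semantics with plain dictionaries. The function selector_matches is hypothetical, and the release: prometheus selector value is an assumption about how the operator was installed.

```python
def selector_matches(match_labels: dict, labels: dict) -> bool:
    """True when every key/value pair in matchLabels is present in `labels`.

    Hypothetical helper; an empty selector matches everything.
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


service_labels = {"app": "order-service"}
servicemonitor_selector = {"app": "order-service"}

servicemonitor_labels = {"release": "prometheus"}
# Assumed value of the Prometheus resource's serviceMonitorSelector.matchLabels.
prometheus_selector = {"release": "prometheus"}

# ServiceMonitor -> Service
print(selector_matches(servicemonitor_selector, service_labels))
# Prometheus -> ServiceMonitor
print(selector_matches(prometheus_selector, servicemonitor_labels))
# A ServiceMonitor labelled for a different release is ignored.
print(selector_matches(prometheus_selector, {"release": "monitoring"}))
```

If either match fails, no scrape target is generated and no error is reported, which is why a label mismatch is easy to overlook.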
Step 4: Install Prometheus with Helm
# Add the Prometheus community Helm chart
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install kube-prometheus-stack (includes Prometheus, Grafana, Alertmanager)
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace

# Verify
kubectl get pods -n monitoring
Step 5: Query Custom Metrics
Access the Prometheus UI and query your custom metrics:
# Current counter value for each label combination
app_requests_total

# Per-second request rate, averaged over the last 5 minutes
rate(app_requests_total[5m])

# 95th percentile latency per endpoint
histogram_quantile(0.95, sum by (le, endpoint) (rate(app_request_duration_seconds_bucket[5m])))

# Error rate (fraction of requests with a 5xx status)
sum(rate(app_requests_total{status=~"5.."}[5m]))
/
sum(rate(app_requests_total[5m]))
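histogram_quantile estimates a quantile from cumulative bucket values by finding the bucket that contains the target rank and interpolating linearly inside it. The sketch below is a simplified illustration of that idea. The bucket values are made up, and the function skips edge cases that Prometheus handles, so treat it as an approximation rather than the actual implementation.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q (0 < q <= 1) from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) sorted by bound
    and ending with the +Inf bucket. Simplified, hypothetical sketch.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Made-up cumulative bucket values for illustration.
buckets = [(0.1, 10), (0.5, 60), (1.0, 95), (math.inf, 100)]
print(histogram_quantile(0.5, buckets))
print(histogram_quantile(0.95, buckets))
print(histogram_quantile(0.99, buckets))
```

The result can never be more precise than the bucket layout: any quantile that lands in the +Inf bucket is reported as the highest finite bound, so bucket boundaries should bracket the latencies you care about.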
Step 6: Create Grafana Dashboards and Alerts
# alerting rule
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: order-service-alerts
  labels:
    release: prometheus
spec:
  groups:
    - name: order-service
      rules:
        - alert: HighErrorRate
          expr: |
            sum(rate(app_requests_total{status=~"5.."}[5m]))
            /
            sum(rate(app_requests_total[5m]))
            > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High error rate on order-service"
            description: "Error rate is above 5% for 5 minutes"
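The for: 5m clause means the alert does not fire the first time the expression is true. It stays pending until the condition has held at every evaluation for five minutes, and it resets as soon as one evaluation is false. The sketch below simulates that state machine. The function alert_states, the 60-second evaluation interval, and the sample error ratios are all assumptions made for illustration.

```python
def alert_states(samples, threshold, for_seconds):
    """Return the alert state at each rule evaluation (hypothetical helper).

    `samples` is a list of (timestamp_seconds, value) pairs, one per evaluation.
    """
    states, active_since = [], None
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts  # condition just became true
            held = ts - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
    return states


# Assumed error ratios, evaluated every 60 seconds.
ratios = [0.02, 0.08, 0.09, 0.07, 0.06, 0.10, 0.11, 0.01]
samples = [(i * 60, ratio) for i, ratio in enumerate(ratios)]
states = alert_states(samples, threshold=0.05, for_seconds=300)
print(states)
```

A longer for duration trades slower notification for fewer alerts on short spikes.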
Common Pitfalls
Missing prometheus.io/scrape: "true" annotation: With annotation-based discovery, Prometheus skips any pod that lacks this annotation. This is a common reason custom metrics do not appear in Prometheus.
ServiceMonitor label mismatch: The ServiceMonitor's labels must match the serviceMonitorSelector of the Prometheus custom resource managed by the operator. If that selector is release: prometheus, your ServiceMonitor must carry that label. Check with kubectl get prometheus -n monitoring -o yaml.
Metrics endpoint returning wrong format: Prometheus expects the OpenMetrics/Prometheus text format. Returning JSON or other formats causes scrape failures. Use the official client libraries, which handle formatting automatically.
High cardinality labels: Labels with many unique values (user IDs, request IDs, timestamps) can create millions of time series and exhaust Prometheus memory. Keep label cardinality low: use histogram buckets for distributions and aggregate at query time.
Scrape interval too aggressive: Scraping every 1-2 seconds generates large amounts of data. An interval of 15-30 seconds is appropriate for most applications (Prometheus itself defaults to 1 minute when no interval is configured). Only decrease it for truly real-time requirements.
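To see why high-cardinality labels are dangerous, note that the worst-case number of time series for one metric is the product of the distinct values of each label. The sketch below does that arithmetic. The function series_upper_bound and the cardinality figures are hypothetical.

```python
from math import prod


def series_upper_bound(label_cardinalities: dict) -> int:
    """Worst-case series count for one metric: product of distinct label values."""
    return prod(label_cardinalities.values())


# Assumed cardinalities for illustration.
bounded = {"method": 5, "endpoint": 20, "status": 10}
print(series_upper_bound(bounded))

# Adding a per-user label multiplies the total by the number of users.
with_user_id = {**bounded, "user_id": 100_000}
print(series_upper_bound(with_user_id))
```

With these assumed numbers, a single unbounded label turns a thousand potential series into a hundred million, and a histogram multiplies that again by its number of buckets.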
Summary
Instrument your application with a Prometheus client library to expose a /metrics endpoint
Add prometheus.io/scrape, prometheus.io/port, and prometheus.io/path annotations to pod templates
Use ServiceMonitor with the Prometheus Operator for declarative scrape configuration
Install kube-prometheus-stack via Helm for a complete monitoring setup (Prometheus + Grafana + Alertmanager)
Query custom metrics with PromQL: rate(), histogram_quantile(), and aggregation functions
Keep label cardinality low to avoid Prometheus performance issues