Selecting a Node Size for a GKE Kubernetes Cluster

Introduction

Choosing a node size for GKE is really a scheduling and cost problem, not just a VM-picking problem. You want nodes that fit your pods efficiently, leave room for system overhead, and fail gracefully when a node disappears. A good node size comes from workload data, not intuition.

Start with Pod Requests

The most useful inputs are:

  • CPU request per pod
  • memory request per pod
  • ephemeral storage request per pod
  • expected replica count
  • per-node overhead from DaemonSets and system components

If these numbers are wrong, the node sizing built on them will be wrong too. In both GKE Standard and Autopilot, the scheduler places pods by their resource requests, not by their observed usage.

Here is a simple deployment example:

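```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 6
  # apps/v1 requires a selector that matches the template labels.
  # The "app: api" label is an illustrative choice.
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: us-docker.pkg.dev/example/api:latest
          resources:
            requests:   # used by the scheduler for placement
              cpu: "500m"
              memory: "1Gi"
            limits:     # runtime ceiling, not used for placement
              cpu: "1"
              memory: "2Gi"
```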

That tells you much more about the right node size than a generic "small app" label ever will.
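
To turn those requests into a number you can size against, multiply them by the replica count. The sketch below is a minimal example that only handles the quantity forms used above (millicores and binary memory suffixes); the helper names are made up for illustration and are not part of any Kubernetes client library.

```python
# Binary suffixes used by Kubernetes memory quantities (subset only).
_MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Return CPU cores from a quantity such as "500m" or "1"."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_gib(quantity: str) -> float:
    """Return GiB from a quantity such as "1Gi" or "512Mi"."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor / 2**30
    return int(quantity) / 2**30  # no suffix: plain bytes


def total_requests(cpu: str, memory: str, replicas: int) -> tuple[float, float]:
    """Aggregate (cores, GiB) requested by all replicas of one workload."""
    return parse_cpu(cpu) * replicas, parse_memory_gib(memory) * replicas


# Values taken from the Deployment above: 6 replicas of 500m / 1Gi.
cpu_total, mem_total = total_requests("500m", "1Gi", replicas=6)
print(f"{cpu_total} cores, {mem_total} GiB requested in total")
```

For the Deployment above this comes to 3 cores and 6 GiB of requests, before any per-node overhead is added.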

Match the Machine Family to the Workload

The exact machine families available depend on region and product support, so the stable way to reason about node size is by workload shape:

| Workload Shape | What to Favor | Why |
| --- | --- | --- |
| General web and API services | General-purpose nodes | Balanced CPU and memory |
| CPU-bound workers | Compute-optimized nodes | Better CPU density |
| In-memory services | Memory-optimized nodes | More RAM per vCPU |
| GPU or accelerator workloads | Accelerator-backed nodes | Specialized hardware support |

Choose the category first, then choose the size within that category.
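
One rough way to read "workload shape" off your requests is the ratio of memory to CPU. The sketch below is only a heuristic: the function name and the thresholds (below 2 GiB per vCPU, above 6 GiB per vCPU) are assumptions chosen for illustration, not GKE definitions, and a ratio alone cannot tell you whether a workload is actually CPU-bound.

```python
def suggest_node_category(
    cpu_cores: float,
    memory_gib: float,
    needs_accelerator: bool = False,
) -> str:
    """Map a pod's requests to one of the categories in the table above.

    The thresholds are illustrative assumptions, not official cutoffs.
    """
    if needs_accelerator:
        return "accelerator-backed"
    gib_per_vcpu = memory_gib / cpu_cores
    if gib_per_vcpu < 2:
        return "compute-optimized"
    if gib_per_vcpu > 6:
        return "memory-optimized"
    return "general-purpose"


# The example API pod: 0.5 vCPU and 1 GiB, i.e. 2 GiB per vCPU.
print(suggest_node_category(cpu_cores=0.5, memory_gib=1.0))
```

Treat the result as a starting category to validate against real utilization, not as a final answer.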

Few Large Nodes Versus Many Small Nodes

The trade-off cuts both ways:

  • Larger nodes reduce per-node overhead.
  • Smaller nodes reduce blast radius during failures or upgrades.
  • Tiny nodes can waste resources on logging, monitoring, and CNI overhead.
  • Very large nodes can be harder to fill efficiently and can cause bigger disruption when one node is drained.

In practice, moderate-size nodes are often the safest starting point unless you have a clear reason to bias in one direction.
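
The two opposing forces can be put side by side with a little arithmetic. This is a sketch under simplifying assumptions: it looks at CPU only, treats the full vCPU count as schedulable capacity, and uses made-up inputs (24 vCPU of total requests, 0.5 vCPU of per-node overhead). The function and field names are hypothetical.

```python
import math


def compare_node_sizes(
    total_cpu_request: float,
    per_node_overhead_cpu: float,
    node_sizes: list[int],
) -> list[dict]:
    """For each candidate node size, estimate overhead share and blast radius."""
    rows = []
    for vcpus in node_sizes:
        usable = vcpus - per_node_overhead_cpu
        if usable <= 0:
            continue  # node too small to run anything besides overhead
        nodes = math.ceil(total_cpu_request / usable)
        rows.append(
            {
                "vcpus": vcpus,
                "nodes": nodes,
                # fraction of every node spent on DaemonSets and system pods
                "overhead_share": per_node_overhead_cpu / vcpus,
                # fraction of the fleet that disappears with one node
                "share_lost_per_node": 1 / nodes,
            }
        )
    return rows


for row in compare_node_sizes(24, 0.5, [2, 8, 32]):
    print(row)
```

With these inputs, 2-vCPU nodes spend a quarter of each node on overhead, while a single 32-vCPU node holds the entire workload, so losing it loses everything. The 8-vCPU option sits between the two extremes.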

A Practical Sizing Workflow

Suppose each pod requests:

  • 500m CPU
  • 1Gi memory

and each node also has:

  • DaemonSet overhead
  • kube-system overhead
  • some reserved headroom for rolling updates and bursts

The workflow is:

  1. Estimate how many pods you want per node.
  2. Add per-node overhead.
  3. Avoid planning to 100 percent utilization.
  4. Check whether losing one node would evict too much workload at once.

If one node failure would knock out a large share of your service, the node size is probably too large for that workload.
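
The four steps above can be sketched as a small calculation. Everything here except the pod requests (500m CPU, 1Gi memory) and the replica count of 6 is an assumption for illustration: the node's allocatable resources, the overhead figures, and the 80 percent utilization target are hypothetical, and the function names are not from any library. Integer millicores and MiB are used to avoid floating-point rounding.

```python
import math


def pods_per_node(
    alloc_cpu_m: int,
    alloc_mem_mib: int,
    overhead_cpu_m: int,
    overhead_mem_mib: int,
    pod_cpu_m: int,
    pod_mem_mib: int,
    target_pct: int = 80,
) -> int:
    """Pods that fit after subtracting overhead and keeping headroom."""
    cpu_budget = (alloc_cpu_m - overhead_cpu_m) * target_pct // 100
    mem_budget = (alloc_mem_mib - overhead_mem_mib) * target_pct // 100
    return max(0, min(cpu_budget // pod_cpu_m, mem_budget // pod_mem_mib))


def worst_case_share_lost(replicas: int, per_node: int) -> float:
    """Share of replicas lost if the fullest possible node fails."""
    return min(per_node, replicas) / replicas


# Hypothetical node: 3900m CPU and 12 GiB allocatable,
# with 400m CPU and 1 GiB consumed by DaemonSets and system pods.
fit = pods_per_node(3900, 12288, 400, 1024, pod_cpu_m=500, pod_mem_mib=1024)
nodes = math.ceil(6 / fit)
print(fit, nodes, round(worst_case_share_lost(6, fit), 2))
```

In this made-up case the node fits 5 pods, so 6 replicas need 2 nodes, and in the worst case one node carries 5 of the 6 replicas. That is the signal from step 4 that the node is too large for this workload, or that the pods need to be spread more evenly.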

Node Pools Usually Beat One Global Node Size

Many clusters should not have one "best" node size. If you run mixed workloads, separate node pools are cleaner.

Examples:

  • API services on general-purpose nodes
  • memory-heavy workers on memory-optimized nodes
  • batch jobs on a cheaper autoscaled pool

This lets you tune autoscaling, labels, taints, and cost strategy independently.
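
One common way to steer a workload onto a specific pool is a nodeSelector on the `cloud.google.com/gke-nodepool` node label, optionally paired with a toleration for a taint placed on that pool. The sketch below builds that pod spec fragment as a plain dictionary; the pool names and the `workload` taint key are hypothetical, and the helper is not part of any client library.

```python
from typing import Optional


def pool_scheduling(
    pool_name: str,
    taint_key: Optional[str] = None,
    taint_value: Optional[str] = None,
) -> dict:
    """Return pod spec fields that pin a workload to one GKE node pool."""
    spec: dict = {"nodeSelector": {"cloud.google.com/gke-nodepool": pool_name}}
    if taint_key is not None:
        # Tolerate the pool's taint so these pods are allowed onto it.
        spec["tolerations"] = [
            {
                "key": taint_key,
                "operator": "Equal",
                "value": taint_value,
                "effect": "NoSchedule",
            }
        ]
    return spec


# Hypothetical pools matching the three examples above.
api = pool_scheduling("api-pool")
workers = pool_scheduling("highmem-pool", taint_key="workload", taint_value="memory")
batch = pool_scheduling("batch-pool", taint_key="workload", taint_value="batch")
print(batch)
```

The selector keeps a workload on its pool, and the taint keeps other workloads off it; you generally want both when pools are sized for different shapes.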

Other Constraints People Forget

  • The per-node pod limit can become the binding constraint before CPU or memory does.
  • Local storage and disk throughput can become bottlenecks.
  • DaemonSets consume resources on every node, so very small nodes can be inefficient.
  • Cluster autoscaler helps, but it does not fix bad requests or bad workload separation.
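
To see which limit bites first, compare how many pods each resource allows. The sketch below uses made-up numbers: a large node with 15000m CPU and 56000 MiB allocatable, very small pods (50m, 64 MiB), a max-pods setting of 110, and 10 slots taken by system pods. Treat all of these as assumptions and check your own cluster's values; the function name is hypothetical.

```python
def binding_constraint(
    alloc_cpu_m: int,
    alloc_mem_mib: int,
    max_pods: int,
    pod_cpu_m: int,
    pod_mem_mib: int,
    system_pods: int = 0,
) -> tuple[str, int]:
    """Return the tightest constraint and the pod count it allows."""
    fits = {
        "cpu": alloc_cpu_m // pod_cpu_m,
        "memory": alloc_mem_mib // pod_mem_mib,
        # pod slots left after DaemonSet and system pods
        "pod_limit": max_pods - system_pods,
    }
    name = min(fits, key=fits.get)
    return name, fits[name]


name, count = binding_constraint(
    alloc_cpu_m=15000,
    alloc_mem_mib=56000,
    max_pods=110,
    pod_cpu_m=50,
    pod_mem_mib=64,
    system_pods=10,
)
print(name, count)
```

Here CPU would allow 300 pods and memory 875, but only 100 pod slots remain, so most of the node's capacity would sit idle. Many tiny pods on large nodes is the usual pattern where this shows up.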

Common Pitfalls

  • Choosing based on VM specs without first measuring pod requests.
  • Packing nodes too tightly and leaving no room for upgrades or transient spikes.
  • Using one node pool for very different workloads.
  • Optimizing only for hourly price instead of total cluster efficiency.

Summary

  • Start from pod requests, not machine labels.
  • Match the machine family to the workload shape.
  • Balance fewer large nodes against more small nodes.
  • Use node pools when workloads differ materially.
  • Leave headroom for system overhead, scaling, and disruption.
