Why
Every serious "Kubernetes on a laptop" article I have read either (a) assumes the reader has a workstation-class machine with 128 GB of RAM, or (b) recommends kind because the author thinks anything bigger is impractical. Both are wrong. The truth is that a real Kubernetes cluster fits in 16 GB of RAM, a realistic multi-node one in 32 GB, and a highly available one in 48 GB. A 64 GB workstation runs the HA topology with headroom for the IDE, the browser, Slack, and a database client. A 128 GB workstation runs two HA clusters in parallel — one per client — with the same headroom.
The thesis of this part is: the math is simple, the bottlenecks are predictable, and the 64 GB sweet spot is real. We will measure each component, build the budget table, identify what scales sub-linearly (the things you can shrink) and what scales linearly (the things you cannot), and prove the topologies fit.
The components and their RAM cost
A real Kubernetes node — control plane or worker — runs a fixed set of system pods plus a variable set of workload pods. Here is what each one consumes on a freshly bootstrapped cluster, measured in steady state with `kubectl top pod`:
| Component | Where | RAM (steady state) | RAM (peak burst) |
|---|---|---|---|
| kube-apiserver | control plane | 350 MB | 800 MB |
| etcd | control plane | 200 MB | 500 MB |
| kube-controller-manager | control plane | 100 MB | 200 MB |
| kube-scheduler | control plane | 50 MB | 120 MB |
| kubelet | every node | 80 MB | 200 MB |
| kube-proxy | every node | 25 MB | 60 MB |
| CNI agent (Cilium) | every node | 250 MB | 500 MB |
| CNI agent (Flannel) | every node | 30 MB | 60 MB |
| CoreDNS | cluster (2 replicas) | 30 MB each | 80 MB each |
| metrics-server | cluster | 25 MB | 60 MB |
| Longhorn manager | every node | 150 MB | 300 MB |
| Longhorn instance manager | every node | 100 MB | 250 MB |
| node-exporter | every node | 30 MB | 60 MB |
| Operating system (Alpine + containerd) | every node | 200 MB | 400 MB |
Sum for a control plane node (Cilium + Longhorn): roughly 1.5 GB steady, 3.4 GB peak. Sum for a worker node (Cilium + Longhorn): roughly 850 MB steady, 1.8 GB peak.
These numbers are conservative and include the storage layer overhead. If you skip Longhorn and use local-path provisioner for dev, knock 250 MB off every node. If you use Flannel instead of Cilium, knock 220 MB off every node. The smallest realistic node — Alpine + containerd + kubelet + kube-proxy + Flannel + local-path + node-exporter — runs in about 400 MB.
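As a sanity check, the per-node figures can be reproduced by summing the steady-state column of the table above; a quick sketch in shell arithmetic (all values in MB, taken directly from the table):

```shell
# Steady-state RAM per node type, summed from the component table (MB).
# Control plane (Cilium + Longhorn): apiserver, etcd, controller-manager,
# scheduler, kubelet, kube-proxy, Cilium, Longhorn mgr + instance mgr,
# node-exporter, OS.
cp=$((350 + 200 + 100 + 50 + 80 + 25 + 250 + 150 + 100 + 30 + 200))
# Worker (Cilium + Longhorn): kubelet, kube-proxy, Cilium, Longhorn mgr
# + instance mgr, node-exporter, OS.
w=$((80 + 25 + 250 + 150 + 100 + 30 + 200))
# Smallest realistic node: OS, kubelet, kube-proxy, Flannel, local-path,
# node-exporter.
min=$((200 + 80 + 25 + 30 + 50 + 30))
echo "control-plane=${cp} worker=${w} minimal=${min}"
# prints: control-plane=1535 worker=835 minimal=415
```

The minimal-node sum lands at ~415 MB, which is where the "about 400 MB" claim comes from.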
The rest of the node's RAM is for workloads. This is where the sizing decisions actually matter, because the system overhead is essentially fixed and the workload footprint is what you control.
Topology budgets
We can now size each topology against a realistic workload (Acme's .NET API + Postgres + MinIO + observability):
k8s-single — 1 VM, ~16 GB
| Item | RAM | Notes |
|---|---|---|
| OS + containerd + kubelet + kube-proxy | 400 MB | Alpine, kubeadm node |
| etcd + apiserver + controller-manager + scheduler | 700 MB | Single-node control plane (taints removed so workloads schedule here too) |
| CNI (Flannel) | 60 MB | Default for single-node — cheaper than Cilium |
| CSI (local-path) | 50 MB | One-node storage; no replication, no Longhorn |
| CoreDNS x 2 | 60 MB | |
| metrics-server | 25 MB | |
| node-exporter | 30 MB | |
| System overhead total | ~1.3 GB | |
| Workloads | ~14 GB | The rest |
| VM size | 16 GB | Comfortable for solo-dev workloads |
The single-VM topology is the daily driver for solo iteration. It boots in about 2 minutes, uses 16 GB of RAM, and runs everything Acme's developers need to test their code locally. It does not run a real HA Postgres or a multi-node Longhorn, which is why it is not the topology for "I want to validate my production manifests" — that is what k8s-multi is for.
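The overhead line in the k8s-single table checks out the same way; a quick sketch assuming the full 16 GB VM (all values in MB):

```shell
# k8s-single system overhead (MB): OS stack, single-node control plane,
# Flannel, local-path, CoreDNS x 2, metrics-server, node-exporter.
overhead=$((400 + 700 + 60 + 50 + 60 + 25 + 30))
# Everything else in the 16 GB VM is workload capacity.
workloads=$((16 * 1024 - overhead))
echo "overhead=${overhead}MB workloads=${workloads}MB"
# prints: overhead=1325MB workloads=15059MB
```

~1.3 GB of overhead leaves ~14.7 GB for workloads, matching the "~14 GB" line in the table.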
k8s-multi — 4 VMs, ~32 GB
| Node | Role | vCPUs | RAM | What runs here |
|---|---|---|---|---|
| cp-1 | control plane | 2 | 4 GB | etcd, apiserver, controller-manager, scheduler |
| w-1 | worker | 2 | 8 GB | Workloads, Longhorn, ingress, GitLab parts |
| w-2 | worker | 2 | 8 GB | Workloads, Longhorn, Postgres primary |
| w-3 | worker | 2 | 8 GB | Workloads, Longhorn, Postgres replica + MinIO |
| Total | | 8 | 28 GB | |
Plus host OS overhead (~3-4 GB), the total is about 32 GB. This is the realistic topology — three workers means real rolling deploys, real pod anti-affinity, real Longhorn replication (which needs at least three nodes for the default replica count of three), real failure scenarios.
The four VMs run kubeadm (or k3s), with the control plane tainted so workloads only schedule on workers. This matches production semantics.
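The budget arithmetic for this topology, assuming the upper end (4 GB) of the host-overhead range:

```shell
# k8s-multi: one 4 GB control plane VM plus three 8 GB workers (GB).
vms=$((4 + 8 + 8 + 8))
host=4   # host OS overhead, upper end of the ~3-4 GB range
echo "cluster=${vms}GB total=$((vms + host))GB"
# prints: cluster=28GB total=32GB
```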
k8s-ha — 6+ VMs, ~48 GB
| Node | Role | vCPUs | RAM | What runs here |
|---|---|---|---|---|
| cp-1 | control plane | 2 | 4 GB | etcd, apiserver, controller-manager, scheduler |
| cp-2 | control plane | 2 | 4 GB | etcd, apiserver, controller-manager, scheduler |
| cp-3 | control plane | 2 | 4 GB | etcd, apiserver, controller-manager, scheduler |
| lb | load balancer | 1 | 1 GB | kube-vip or HAProxy fronting the API VIP |
| w-1 | worker | 2 | 8 GB | Workloads |
| w-2 | worker | 2 | 8 GB | Workloads |
| w-3 | worker | 2 | 8 GB | Workloads |
| Total | | 13 | 37 GB | |
Plus host OS (~3-4 GB), the total is about 41 GB. Round up for safety to 48 GB. This is the HA Reference Architecture: three control plane nodes form an etcd quorum, the API server VIP is load-balanced across them by kube-vip (in-cluster) or HAProxy (out-of-cluster), and three workers carry the workloads.
A 64 GB workstation runs k8s-ha with 16 GB to spare for the host operating system, the IDE, the browser, the database client, Slack, and so on. This is the comfortable ceiling. Trying to fit k8s-ha into a 32 GB machine is impossible; trying to fit it into 48 GB is tight; 64 GB is the sweet spot.
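The same back-of-the-envelope check for k8s-ha, again assuming 4 GB of host overhead (all values in GB):

```shell
# k8s-ha: three 4 GB control planes, one 1 GB LB, three 8 GB workers.
cps=$((3 * 4)); lb=1; workers=$((3 * 8))
cluster=$((cps + lb + workers))   # VM total
total=$((cluster + 4))            # plus host OS overhead
headroom=$((64 - 48))             # on a 64 GB box, after rounding up to 48
echo "cluster=${cluster}GB total=${total}GB headroom=${headroom}GB"
# prints: cluster=37GB total=41GB headroom=16GB
```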
Multi-client math
Two clients on a 128 GB workstation:
| Client | Topology | RAM |
|---|---|---|
| Acme | k8s-multi | 32 GB |
| Globex | k8s-multi | 32 GB |
| Personal | k8s-single | 16 GB |
| Host (OS + apps) | — | 16 GB |
| Total | | 96 GB |
Or one client on HA, one on multi:
| Client | Topology | RAM |
|---|---|---|
| Acme | k8s-ha | 48 GB |
| Globex | k8s-multi | 32 GB |
| Host | — | 16 GB |
| Total | | 96 GB |
Either way, a 128 GB machine has 32 GB of headroom for additional clients, ephemeral test instances, or burst capacity. The freelancer's nightmare scenario — three big clients all wanting attention on the same Tuesday — fits in 128 GB if at least one of them is willing to live on k8s-multi instead of k8s-ha.
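Both scenarios land on the same total; the arithmetic:

```shell
# Scenario 1: two k8s-multi clusters plus a personal k8s-single plus host.
s1=$((32 + 32 + 16 + 16))
# Scenario 2: one k8s-ha plus one k8s-multi plus host.
s2=$((48 + 32 + 16))
echo "scenario1=${s1}GB scenario2=${s2}GB headroom=$((128 - s1))GB"
# prints: scenario1=96GB scenario2=96GB headroom=32GB
```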
The 128 GB workstation is achievable: any modern desktop motherboard supports 128 GB of DDR5, and 128 GB ECC kits are about €500 at the time of writing. For a freelancer billing four-figure rates, this is one day of work. The investment pays back the first time it lets you bill two clients in parallel without their work interfering.
What is not in the budget
Three things that consume resources but are not counted in the budgets above (the first two cost RAM, the third costs disk):
- The build cache. When you `dotnet build` or `mvn package` on the host, the IDE's incremental compilation cache plus the test runner can use 4-8 GB of RAM. This is host-side, not cluster-side, and it cuts into the headroom on a 64 GB machine. Plan for it.
- Browser tabs. Modern browsers happily use 8-16 GB of RAM for a typical developer's tab set. Same caveat: host-side, headroom-sensitive.
- The container images themselves. Pulling the GitLab Helm chart's 12 images, the kube-prometheus-stack's 8 images, ArgoCD's 4 images, Velero's 2 images, etc., uses disk (not RAM). Budget at least 30-50 GB of disk per cluster for the image cache. 100 GB of free disk per HA cluster is comfortable.
The disk numbers do multiply with concurrent clusters, because the images are pulled into each VM independently — there is no host-level dedup. Two clusters means two image caches.
Bottlenecks: what scales how
Three numbers scale linearly, each with a different dimension of the cluster:
- etcd memory scales with cluster object count, not node count. A small dev cluster with ~200 objects uses ~200 MB of etcd RAM. A medium cluster with ~5000 objects uses ~600 MB. The 200 MB / 500 MB peak figures above are for a small cluster.
- Longhorn instance manager scales with the number of volumes, ~50 MB per active volume.
- Prometheus scales with metric cardinality, ~1 GB per million active series. A single dev cluster typically has ~50k series and uses ~150 MB.
Two numbers scale sub-linearly:
- kube-apiserver flattens out around 800 MB once the watch fan-out is established; adding more nodes does not cost much.
- CNI agents are per-node but each agent's RAM is fixed regardless of cluster size.
For our purposes (real dev k8s, not production), the linear scalers do not bite hard because the object count is bounded by the small workload set. The total stays predictable.
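The linear scalers can be estimated from the rates above. A sketch for a typical dev cluster — the ~100 MB base footprints here are assumptions layered on top of the per-unit rates quoted in the list (the per-unit rates themselves are from the text):

```shell
# Prometheus: assumed ~100 MB base + ~1 GB per 1M active series
# (1 GB / 1M series == 1 MB per 1k series).
series=50000
prom_mb=$((100 + series / 1000))
# Longhorn instance manager: assumed ~100 MB base + ~50 MB per active volume.
volumes=4
longhorn_mb=$((100 + volumes * 50))
echo "prometheus=${prom_mb}MB longhorn=${longhorn_mb}MB"
# prints: prometheus=150MB longhorn=300MB
```

The 150 MB Prometheus estimate for ~50k series matches the figure quoted in the list.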
What this gives you that toy k8s doesn't
A toy cluster fits in 4 GB and is, by definition, useless for the things in Part 01. A real cluster fits in 16-48 GB depending on topology and is useful for everything in this series. The 64 GB workstation is the inflection point at which real becomes practical for a single developer; the 128 GB workstation is the inflection point at which real becomes practical for a freelancer with multiple clients.
The hardware budget is not a constraint to be worked around. It is a design parameter that the topology resolver consumes. A user who edits one config field — topology: k8s-single | k8s-multi | k8s-ha — picks the budget. HomeLab's IK8sTopologyResolver (we will see it in Part 08) does the math, allocates the VMs, and reports the expected RAM usage before anything boots. The user knows what they are spending.
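The one-field decision might look like this in the cluster config — only the `topology` key and its three values come from the text above; the surrounding keys are illustrative, not HomeLab's actual schema:

```yaml
# Hypothetical HomeLab cluster config. Field names other than
# `topology` are illustrative placeholders.
cluster:
  name: acme
  topology: k8s-multi   # k8s-single | k8s-multi | k8s-ha
```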