Part 38: Cross-Client Networking — Or the Lack of It
"Acme cannot reach Globex. Not because we asked it not to, but because the architecture refuses to give it the addresses."
Why
Multi-client isolation works on five layers (Part 03). The most important — and the easiest to get wrong — is layer 1: the network. If Acme's pods can ping Globex's pods, the rest of the isolation story collapses, because a curious or malicious workload in Acme can reach into Globex.
The thesis: HomeLab K8s achieves cross-client network isolation structurally: each cluster's VMs sit on a separate Vagrant private network with a different /24 subnet, the host machine can reach each network only because it has an interface on both (it never forwards between them), and an architecture test asserts that no cluster's pod CIDR or service CIDR overlaps another cluster's host network. There is no firewall rule. The isolation is the non-existence of routes.
How it works
Each HomeLab instance has a subnet field in ~/.homelab/instances.json. The IInstanceRegistry allocates non-overlapping /24s. The Vagrant Vagrantfile uses that subnet for the private_network option, which creates a host-only adapter on the hypervisor:
```ruby
# Vagrantfile (generated for instance 'acme')
config.vm.define "acme-cp-1" do |vm|
  vm.vm.network "private_network", ip: "192.168.60.10"
end
```

VirtualBox creates a host-only network vboxnet60 on the host machine, with the host at 192.168.60.1 and the VMs at 192.168.60.10..23. The host can reach the VMs via this interface; the VMs cannot reach VMs on a different host-only network because the host does not enable IP forwarding between them.
For Globex on subnet 192.168.61.0/24, VirtualBox creates a separate vboxnet61. Acme's VMs at 192.168.60.x and Globex's VMs at 192.168.61.x are in separate L2 broadcast domains. They cannot ARP each other. They cannot ping each other. They cannot route to each other unless the host explicitly enables forwarding (which it does not).
This is the simplest possible isolation primitive: the clusters have no path to each other. Firewall rules can be misconfigured. Routes can be added by accident. The absence of a route fails closed: traffic goes nowhere until someone explicitly adds one.
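The subnet allocation itself is a small responsibility of IInstanceRegistry. A minimal in-memory sketch, assuming /24s are handed out by walking the third octet (the range 60..99 and the class name are illustrative choices, not the real HomeLab pool or API):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: pick the next free host-only /24 in the
// 192.168.60.0/24 .. 192.168.99.0/24 range. The real registry persists
// its state in ~/.homelab/instances.json; this keeps it in memory only.
public static class SubnetAllocator
{
    public static string NextFreeSubnet(IEnumerable<string> taken)
    {
        // "192.168.60.0/24" -> 60
        var usedOctets = taken
            .Select(s => int.Parse(s.Split('.')[2]))
            .ToHashSet();

        for (var octet = 60; octet <= 99; octet++)
        {
            if (!usedOctets.Contains(octet))
                return $"192.168.{octet}.0/24";
        }

        throw new InvalidOperationException("host-only subnet pool exhausted");
    }
}
```

With acme already holding 192.168.60.0/24, the next instance gets 192.168.61.0/24, which is exactly the Globex subnet used below.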
What about the cluster pod CIDRs
Each k8s cluster has its own pod CIDR (e.g. 10.244.0.0/16 for Acme, 10.245.0.0/16 for Globex). These CIDRs are internal to the cluster. Pods in Acme can reach 10.244.x.y (other Acme pods); they cannot reach 10.245.x.y (Globex pods) because:
- Acme's CNI (Cilium) does not have routes to 10.245.0.0/16
- Even if it did, Globex's CNI does not advertise its pods on the host network
- Even if both did, the host machine's routing table does not forward between vboxnet60 and vboxnet61
Three layers of "no path". The architecture test asserts the second one (no CIDR overlap):
```csharp
[Fact]
public void no_two_instances_have_overlapping_pod_cidrs()
{
    var registry = new InMemoryInstanceRegistry();
    foreach (var inst in TestData.MultipleK8sInstances())
        registry.Add(inst);

    var allCidrs = registry.GetAll().Select(i => (i.Name, Cidr: i.K8s!.PodCidr)).ToList();
    var pairs = from a in allCidrs
                from b in allCidrs
                where a.Name != b.Name
                select (a, b);

    foreach (var (a, b) in pairs)
    {
        CidrUtils.Overlap(a.Cidr, b.Cidr).Should().BeFalse(
            $"instance {a.Name} ({a.Cidr}) overlaps {b.Name} ({b.Cidr})");
    }
}
```

The instance registry allocates pod CIDRs from the 10.240.0.0/12 reserved space, one /16 per instance. With four bits of subnet index between the /12 and each /16, the pool supports 16 instances before exhaustion.
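CidrUtils.Overlap is referenced by the test but never shown. A minimal IPv4-only sketch (the name matches the test; the implementation is an assumption): two CIDRs overlap exactly when their base addresses agree under the shorter of the two prefixes.

```csharp
using System;
using System.Linq;

// Hypothetical sketch of CidrUtils.Overlap for IPv4 dotted-quad/len
// notation. Two CIDRs overlap iff one network contains the other,
// i.e. their bases match under the less specific prefix.
public static class CidrUtils
{
    public static bool Overlap(string a, string b)
    {
        var (ipA, lenA) = Parse(a);
        var (ipB, lenB) = Parse(b);

        var len = Math.Min(lenA, lenB);                 // compare at the coarser prefix
        var mask = len == 0 ? 0u : uint.MaxValue << (32 - len);
        return (ipA & mask) == (ipB & mask);
    }

    private static (uint Ip, int Len) Parse(string cidr)
    {
        var parts = cidr.Split('/');
        var o = parts[0].Split('.').Select(uint.Parse).ToArray();
        var ip = (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3];
        return (ip, int.Parse(parts[1]));
    }
}
```

Under this definition 10.244.0.0/16 and 10.245.0.0/16 do not overlap, while 10.244.0.0/16 and 10.240.0.0/12 do, which is what the architecture test needs.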
What about the service CIDRs
Same story. Each cluster gets a service CIDR from 10.96.0.0/12. The registry allocates one /16 per instance: Acme gets 10.96.0.0/16, Globex gets 10.97.0.0/16. The service CIDRs are also internal to the cluster and unreachable from outside via routing.
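Deriving the nth instance's /16 from the 10.96.0.0/12 pool is plain arithmetic. A hypothetical helper (ServiceCidr.ForIndex is illustrative, not a real HomeLab API):

```csharp
using System;

// Hypothetical sketch: the nth /16 inside 10.96.0.0/12.
// Index 0 -> 10.96.0.0/16 (Acme), index 1 -> 10.97.0.0/16 (Globex), ...
public static class ServiceCidr
{
    public static string ForIndex(int index)
    {
        if (index < 0 || index > 15)
            throw new ArgumentOutOfRangeException(nameof(index),
                "a /12 holds only 16 /16 subnets");
        return $"10.{96 + index}.0.0/16";
    }
}
```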
What if the user actually wants two clients to talk
They do not. The whole point of multi-client isolation is that they should never talk. If Acme needs to call a Globex API, it does so via the public ingress over https://api.globex.lab from outside the cluster, just like a real production caller would.
But: for the rare case where the user wants to test cross-cluster federation (an extremely advanced scenario), HomeLab K8s ships an opt-in homelab k8s federate <a> <b> verb that:
- Adds a route on the host between the two /24 subnets
- Adds a NetworkPolicy in each cluster allowing the other's pod CIDR
- Tags the federation in the instance registry with federated: [acme, globex]
This is off by default. The user has to explicitly opt in. And the federation is logged loudly in the event bus so the audit trail captures who federated what when.
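The NetworkPolicy half of the federate verb might look like this on the Acme side (a hypothetical sketch; the policy name, namespace, and selector are illustrative, and the allow rule only has teeth in a cluster that already enforces a default-deny policy):

```yaml
# Hypothetical sketch: allow traffic to and from Globex's pod CIDR for
# all pods in the namespace. Applied (and removed) by the federate verb.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: federate-allow-globex
  namespace: default
spec:
  podSelector: {}            # all pods in the namespace
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - ipBlock:
            cidr: 10.245.0.0/16   # Globex's pod CIDR
  egress:
    - to:
        - ipBlock:
            cidr: 10.245.0.0/16
```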
The audit test
```csharp
[Fact]
[Trait("category", "e2e")]
[Trait("category", "slow")]
public async Task acme_cannot_reach_globex_from_inside_a_pod()
{
    using var fixture = await MultiClusterFixture.NewAsync(
        clusters: new[] { ("acme", "k8s-multi"), ("globex", "k8s-multi") });
    await fixture.UpAllAsync();

    // Run a pod in acme that tries to reach a known Globex IP
    var globexCpIp = fixture.Get("globex").VmIp("globex-cp-1");
    var pingResult = await fixture.Get("acme").RunPod(
        image: "alpine:3.21",
        command: new[] { "sh", "-c", $"ping -c 3 -W 2 {globexCpIp} || echo UNREACHABLE" });
    pingResult.Should().Contain("UNREACHABLE");
}

[Fact]
public void instance_registry_refuses_overlapping_subnets()
{
    var registry = new FileBasedInstanceRegistry(TempPath());
    registry.AcquireAsync("a", default).GetAwaiter().GetResult();
    registry.AcquireAsync("b", default).GetAwaiter().GetResult();

    var aSubnet = registry.GetAsync("a", default).GetAwaiter().GetResult().Value.Subnet;
    var bSubnet = registry.GetAsync("b", default).GetAwaiter().GetResult().Value.Subnet;
    aSubnet.Should().NotBe(bSubnet);
}
```

Two tests. The first is the real assertion that pings cannot cross. It requires both clusters to be running, so it is slow and runs nightly. The second is the unit-level invariant on the registry; it runs in milliseconds and exercises the allocation path.
What this gives you that "trust the namespace boundary" doesn't
Trusting the namespace boundary means assuming that a misconfigured pod in acme-dev cannot reach a service in globex-prod. The assumption fails the moment something is misconfigured. The boundary is enforced by the CNI's NetworkPolicy implementation, which can have bugs, can be turned off, can be wrong.
Cross-client isolation by non-overlapping subnets and non-existent routes gives you, for the same surface area:
- Structural isolation that does not depend on policy enforcement
- The audit test that proves the isolation
- An opt-in federation verb for the rare case where you actually want to bridge two clusters
- A registry that refuses overlapping subnets so the architecture cannot drift into ambiguity
The bargain pays back the first time a curious developer in Acme tries to scan their network and finds nothing.