
Part 14: kubeadm init — Bootstrapping the Control Plane

"kubeadm init is one command. Getting it right is twenty configuration choices. We make those typed."


Why

kubeadm init is the one command that brings a Kubernetes control plane into existence. It is also the command with the most consequential configuration: the API server's advertise address, the pod CIDR, the service CIDR, the cluster name, the certificate parameters, the etcd location, the feature gates, the audit policy. Every choice locks in a property of the cluster that is expensive to change later.

The right way to call kubeadm init is via a typed ClusterConfiguration object passed as --config. The wrong way is a long command line with flags interpolated into a shell script. K8s.Dsl uses the typed config approach, generates it from the user's HomeLabConfig, and runs kubeadm init via a [BinaryWrapper] so the call itself is also typed.

The thesis: KubeadmClient is a [BinaryWrapper("kubeadm")] partial class. KubeadmConfigGenerator produces the YAML config from the typed HomeLabConfig. The whole bootstrap is wrapped in a KubeadmInitSaga from the Saga toolbelt library so partial failures (e.g. the API server starts but cert generation fails) roll back cleanly.


The wrapper

[BinaryWrapper("kubeadm", HelpCommand = "--help", VersionCommand = "version -o json")]
public partial class KubeadmClient : IKubeadmClient
{
    [Command("init")]
    public partial Task<Result<KubeadmInitOutput>> InitAsync(
        [Flag("--config")] string configPath,
        [Flag("--upload-certs", IsBoolean = true)] bool uploadCerts = true,
        [Flag("--ignore-preflight-errors")] string? ignorePreflight = null,
        CancellationToken ct = default);

    [Command("join")]
    public partial Task<Result<KubeadmJoinOutput>> JoinAsync(
        [PositionalArgument] string controlPlaneEndpoint,
        [Flag("--token")] string token,
        [Flag("--discovery-token-ca-cert-hash")] string caCertHash,
        [Flag("--control-plane", IsBoolean = true)] bool controlPlane = false,
        [Flag("--certificate-key")] string? certificateKey = null,
        CancellationToken ct = default);

    [Command("upgrade", SubCommand = "plan")]
    public partial Task<Result<KubeadmUpgradePlanOutput>> UpgradePlanAsync(CancellationToken ct = default);

    [Command("upgrade", SubCommand = "apply")]
    public partial Task<Result<KubeadmUpgradeApplyOutput>> UpgradeApplyAsync(
        [PositionalArgument] string version,
        [Flag("--yes", IsBoolean = true)] bool yes = true,
        CancellationToken ct = default);

    [Command("token", SubCommand = "create")]
    public partial Task<Result<KubeadmTokenCreateOutput>> TokenCreateAsync(
        [Flag("--ttl")] string? ttl = "24h",
        [Flag("--print-join-command", IsBoolean = true)] bool printJoinCommand = true,
        CancellationToken ct = default);

    [Command("reset")]
    public partial Task<Result<KubeadmResetOutput>> ResetAsync(
        [Flag("--force", IsBoolean = true)] bool force = true,
        CancellationToken ct = default);
}

The source generator emits the implementation: each method becomes a Process.Start call against kubeadm with the right argument list, the exit code mapped into a Result<T>, and stdout parsed into the typed output record. The pattern is identical to the wrappers from homelab-docker Part 15.
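To make that concrete, here is a sketch of the kind of body the generator might emit for InitAsync. The RunProcessAsync helper is a stand-in for the real generated plumbing; the names and shapes are illustrative, not the actual generator output.

```csharp
// Illustrative only: the real body is emitted by the [BinaryWrapper] source generator.
public partial class KubeadmClient
{
    public partial async Task<Result<KubeadmInitOutput>> InitAsync(
        string configPath, bool uploadCerts, string? ignorePreflight, CancellationToken ct)
    {
        var args = new List<string> { "init", "--config", configPath };
        if (uploadCerts)
            args.Add("--upload-certs");
        if (ignorePreflight is not null)
        {
            args.Add("--ignore-preflight-errors");
            args.Add(ignorePreflight);
        }

        // Hypothetical shared helper: starts the process, captures stdout/stderr,
        // maps a non-zero exit code to a failed Result, parses stdout otherwise.
        return await RunProcessAsync<KubeadmInitOutput>("kubeadm", args, ct);
    }
}
```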

The config generator

[Injectable(ServiceLifetime.Singleton)]
public sealed class KubeadmConfigGenerator
{
    public string GenerateInitConfig(ControlPlaneSpec spec)
    {
        var initConfig = new KubeadmInitConfiguration
        {
            ApiVersion = "kubeadm.k8s.io/v1beta4",
            Kind = "InitConfiguration",
            BootstrapTokens = new[]
            {
                new BootstrapToken
                {
                    Groups = new[] { "system:bootstrappers:kubeadm:default-node-token" },
                    // Placeholder token; omitting it entirely lets kubeadm generate a random one per init
                    Token = "abcdef.0123456789abcdef",
                    Ttl = "24h",
                    Usages = new[] { "signing", "authentication" }
                }
            },
            LocalApiEndpoint = new LocalApiEndpoint
            {
                AdvertiseAddress = spec.ApiAdvertiseAddress,
                BindPort = spec.ApiPort
            },
            NodeRegistration = new NodeRegistration
            {
                CriSocket = "unix:///run/containerd/containerd.sock",
                KubeletExtraArgs = new Dictionary<string, string>
                {
                    ["node-ip"] = spec.ApiAdvertiseAddress
                }
            }
        };

        var clusterConfig = new ClusterConfiguration
        {
            ApiVersion = "kubeadm.k8s.io/v1beta4",
            Kind = "ClusterConfiguration",
            KubernetesVersion = spec.Version,
            ClusterName = spec.ClusterName,
            Networking = new ClusterNetworking
            {
                PodSubnet = spec.PodSubnet,
                ServiceSubnet = spec.ServiceSubnet,
                DnsDomain = $"{spec.ClusterName}.local"
            },
            ControlPlaneEndpoint = spec.IsHa ? $"{spec.ApiVip}:{spec.ApiPort}" : $"{spec.ApiAdvertiseAddress}:{spec.ApiPort}",
            ApiServer = new ApiServerConfig
            {
                ExtraArgs = new Dictionary<string, string>
                {
                    ["authorization-mode"] = "Node,RBAC",
                    ["audit-log-path"] = "/var/log/kubernetes/audit.log",
                    ["audit-log-maxsize"] = "100",
                    ["audit-log-maxbackup"] = "5"
                },
                CertSans = spec.IsHa ? new[] { spec.ApiVip!, spec.ApiAdvertiseAddress } : new[] { spec.ApiAdvertiseAddress }
            },
            ControllerManager = new ControllerManagerConfig
            {
                ExtraArgs = new Dictionary<string, string>
                {
                    ["bind-address"] = "0.0.0.0"
                }
            },
            Scheduler = new SchedulerConfig
            {
                ExtraArgs = new Dictionary<string, string>
                {
                    ["bind-address"] = "0.0.0.0"
                }
            },
            Etcd = new EtcdConfig
            {
                Local = new LocalEtcd
                {
                    DataDir = "/var/lib/etcd"
                }
            }
        };

        var kubeletConfig = new KubeletConfiguration
        {
            ApiVersion = "kubelet.config.k8s.io/v1beta1",
            Kind = "KubeletConfiguration",
            CgroupDriver = "systemd",
            // Tenth address of the default 10.96.0.0/12 service subnet; a fuller
            // implementation would derive this from spec.ServiceSubnet.
            ClusterDns = new[] { "10.96.0.10" },
            ClusterDomain = $"{spec.ClusterName}.local"
        };

        // kubeadm config files are multi-document YAML separated by ---
        var serializer = new KubernetesYamlSerializer();
        return serializer.SerializeMultiDocument(initConfig, clusterConfig, kubeletConfig);
    }
}

The generator produces a multi-document YAML that kubeadm init --config consumes. Every field is typed; the YAML is the output of typed C# objects, never the input.
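For orientation, the serialized output looks roughly like the following abridged fragment. Field names follow kubeadm's v1beta4 schema; the concrete values shown here are just the sample spec from the tests below.

```yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.60.10
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.31.4
clusterName: acme
controlPlaneEndpoint: 192.168.60.10:6443
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
  dnsDomain: acme.local
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```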


The Saga

kubeadm init has multiple side effects: writing certificates to /etc/kubernetes/pki/, starting etcd, starting the API server, generating an admin kubeconfig, uploading certs to a bootstrap secret. If any of them fails partway, the resulting state is corrupt and cannot be retried without first running kubeadm reset to clean up. The user-facing CLI verb wraps the whole flow in a Saga so partial failures roll back automatically:

[Saga]
public sealed class KubeadmInitSaga
{
    [SagaStep(Order = 1, Compensation = nameof(NothingToCompensate))]
    public async Task<Result> WriteConfigFile(KubeadmInitContext ctx, CancellationToken ct)
    {
        var config = _generator.GenerateInitConfig(ctx.Spec);
        return await _vagrant.SshFileWriteAsync(ctx.NodeName, ctx.ConfigPath, config, ct);
    }

    [SagaStep(Order = 2, Compensation = nameof(KubeadmReset))]
    public async Task<Result> RunKubeadmInit(KubeadmInitContext ctx, CancellationToken ct)
    {
        var result = await _vagrant.SshCommandAsync(
            ctx.NodeName,
            $"sudo kubeadm init --config {ctx.ConfigPath} --upload-certs",
            ct);
        if (result.IsSuccess)
        {
            var (token, hash) = KubeadmOutputParser.ParseJoinDetails(result.Value.StdOut);
            ctx.JoinToken = token;
            ctx.CaCertHash = hash;
        }
        return result.Map();
    }

    [SagaStep(Order = 3, Compensation = nameof(NothingToCompensate))]
    public async Task<Result> WaitForApiServerHealthy(KubeadmInitContext ctx, CancellationToken ct)
    {
        for (var i = 0; i < 60; i++)
        {
            var result = await _vagrant.SshCommandAsync(
                ctx.NodeName,
                "sudo kubectl --kubeconfig=/etc/kubernetes/admin.conf get --raw /readyz",
                ct);
            if (result.IsSuccess && result.Value.StdOut.Contains("ok"))
                return Result.Success();
            await Task.Delay(TimeSpan.FromSeconds(2), ct);
        }
        return Result.Failure("API server did not become healthy within 120 seconds");
    }

    [SagaStep(Order = 4, Compensation = nameof(NothingToCompensate))]
    public async Task<Result> ExportKubeconfig(KubeadmInitContext ctx, CancellationToken ct)
    {
        var result = await _vagrant.SshCommandAsync(
            ctx.NodeName,
            "sudo cat /etc/kubernetes/admin.conf",
            ct);
        if (result.IsFailure) return result.Map();

        ctx.Kubeconfig = KubeconfigParser.ParseAdminConfig(result.Value.StdOut, ctx.Spec.ClusterName);
        return await _kubeconfigStore.WriteAsync(ctx.Kubeconfig, ct);
    }

    public async Task<Result> KubeadmReset(KubeadmInitContext ctx, CancellationToken ct)
    {
        // The compensation: leave the node in a clean state so a retry can succeed
        return await _vagrant.SshCommandAsync(
            ctx.NodeName,
            "sudo kubeadm reset --force && sudo rm -rf /etc/cni/net.d /var/lib/etcd",
            ct).Map();
    }

    public Task<Result> NothingToCompensate(KubeadmInitContext ctx, CancellationToken ct)
        => Task.FromResult(Result.Success());
}

Four steps. If step 3 (the readiness wait) fails after step 2 (the actual kubeadm init) has succeeded, the saga calls KubeadmReset to wipe the half-baked state. The user is left with a clean node on which homelab k8s create can be run again.
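The KubeadmOutputParser.ParseJoinDetails call in step 2 has to pull the token and CA cert hash out of kubeadm init's human-readable output. A minimal sketch, keyed off the join command kubeadm prints at the end of a successful init (the regexes are mine, not the project's):

```csharp
using System.Text.RegularExpressions;

public static class KubeadmOutputParser
{
    // kubeadm init ends its output with a ready-to-paste join command, e.g.:
    //   kubeadm join 192.168.60.10:6443 --token abcdef.0123456789abcdef \
    //       --discovery-token-ca-cert-hash sha256:0123...
    public static (string Token, string CaCertHash) ParseJoinDetails(string stdout)
    {
        var token = Regex.Match(stdout, @"--token\s+(\S+)").Groups[1].Value;
        var hash = Regex.Match(stdout, @"--discovery-token-ca-cert-hash\s+(\S+)").Groups[1].Value;
        return (token, hash);
    }
}
```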

The Saga source generator from the toolbelt emits the orchestration code. The author writes the steps and the compensations; the framework handles the order, the cancellation, the event publication.
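The emitted orchestration boils down to a loop: run the steps in Order, remember each completed step's compensation, and on the first failure unwind the stack. A sketch of that shape (the real generated code also publishes saga events; the types here are assumptions):

```csharp
// Illustrative shape of the generated RunAsync, not the toolbelt's actual output.
public async Task<Result> RunAsync(KubeadmInitContext ctx, CancellationToken ct)
{
    var compensations = new Stack<Func<KubeadmInitContext, CancellationToken, Task<Result>>>();

    foreach (var step in _steps) // (Execute, Compensate) pairs sorted by [SagaStep(Order = ...)]
    {
        var result = await step.Execute(ctx, ct);
        if (result.IsFailure)
        {
            // Roll back completed steps in reverse order, best-effort.
            while (compensations.Count > 0)
                await compensations.Pop()(ctx, ct);
            return result;
        }
        compensations.Push(step.Compensate);
    }

    return Result.Success();
}
```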


The handler

[Injectable(ServiceLifetime.Singleton)]
public sealed class K8sCreateRequestHandler : IRequestHandler<K8sCreateRequest, Result<K8sCreateResponse>>
{
    private readonly IClusterDistributionResolver _distributions;
    private readonly KubeadmInitSaga _initSaga;
    private readonly IK8sTopologyResolver _topology;
    private readonly IHomeLabEventBus _events;
    private readonly IClock _clock;

    public async Task<Result<K8sCreateResponse>> HandleAsync(K8sCreateRequest req, CancellationToken ct)
    {
        var distribution = _distributions.Resolve();
        if (distribution.Name != "kubeadm")
            return Result.Failure<K8sCreateResponse>($"K8sCreateRequestHandler only handles kubeadm; got {distribution.Name}");

        var spec = req.ToControlPlaneSpec();
        var sagaCtx = new KubeadmInitContext
        {
            NodeName = $"{spec.ClusterName}-cp-1",
            ConfigPath = $"/tmp/kubeadm-init-{spec.ClusterName}.yaml",
            Spec = spec
        };

        var sagaResult = await _initSaga.RunAsync(sagaCtx, ct);
        if (sagaResult.IsFailure) return sagaResult.Map<K8sCreateResponse>();

        await _events.PublishAsync(new ClusterCreated(spec.ClusterName, "kubeadm", _clock.UtcNow), ct);
        return Result.Success(new K8sCreateResponse(
            ClusterName: spec.ClusterName,
            JoinToken: sagaCtx.JoinToken!,
            ApiServerUrl: $"https://{spec.ApiAdvertiseAddress}:{spec.ApiPort}"));
    }
}

The handler is the standard pattern: resolve the distribution, build the spec, run the saga, publish the event, return the response.

For the k3s distribution there is a parallel K3sInstallSaga that runs the curl | sh flow with its own four steps and compensations. The handler picks the right saga based on the active distribution.
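One way that selection could be wired is a switch over the resolved distribution; this is a hypothetical sketch (kubeadmCtx and k3sCtx are assumed per-saga contexts, and K3sInstallSaga is only described, not shown, in this series):

```csharp
// Hypothetical dispatch; in the real code each distribution has its own handler.
var result = distribution.Name switch
{
    "kubeadm" => await _kubeadmInitSaga.RunAsync(kubeadmCtx, ct),
    "k3s"     => await _k3sInstallSaga.RunAsync(k3sCtx, ct),
    _         => Result.Failure($"Unsupported distribution: {distribution.Name}")
};
```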


The test

[Fact]
public async Task kubeadm_init_saga_compensates_when_init_step_fails()
{
    var vagrant = new ScriptedVosBackend();
    vagrant.OnSshFileWrite(_ => true, _ => Result.Success());
    vagrant.OnSshCommand("acme-cp-1",
        cmd => cmd.Contains("kubeadm init"),
        exitCode: 1,
        stderr: "preflight check failed: port 10259 is in use");
    vagrant.OnSshCommand("acme-cp-1",
        cmd => cmd.Contains("kubeadm reset"),
        exitCode: 0);

    var saga = new KubeadmInitSaga(/* generator, vagrant, kubeconfigStore, events */);
    var ctx = new KubeadmInitContext
    {
        NodeName = "acme-cp-1",
        ConfigPath = "/tmp/init.yaml",
        Spec = StandardControlPlaneSpec("acme")
    };

    var result = await saga.RunAsync(ctx, default);

    result.IsFailure.Should().BeTrue();
    vagrant.Calls.Should().Contain(c => c.Command.Contains("kubeadm reset"));
}

[Fact]
public void config_generator_emits_typed_cluster_configuration_with_correct_pod_subnet()
{
    var generator = new KubeadmConfigGenerator();
    var spec = new ControlPlaneSpec(
        ClusterName: "acme",
        ApiAdvertiseAddress: "192.168.60.10",
        ApiPort: 6443,
        PodSubnet: "10.244.0.0/16",
        ServiceSubnet: "10.96.0.0/12",
        Version: "v1.31.4",
        IsHa: false,
        ApiVip: null);

    var yaml = generator.GenerateInitConfig(spec);

    yaml.Should().Contain("kind: ClusterConfiguration");
    yaml.Should().Contain("kubernetesVersion: v1.31.4");
    yaml.Should().Contain("podSubnet: 10.244.0.0/16");
    yaml.Should().Contain("serviceSubnet: 10.96.0.0/12");
    yaml.Should().Contain("clusterName: acme");
    yaml.Should().Contain("audit-log-path");
}

[Fact]
public void ha_cluster_uses_vip_in_control_plane_endpoint()
{
    var generator = new KubeadmConfigGenerator();
    var spec = new ControlPlaneSpec(
        ClusterName: "acme",
        ApiAdvertiseAddress: "192.168.60.11",
        ApiPort: 6443,
        PodSubnet: "10.244.0.0/16",
        ServiceSubnet: "10.96.0.0/12",
        Version: "v1.31.4",
        IsHa: true,
        ApiVip: "192.168.60.10");

    var yaml = generator.GenerateInitConfig(spec);

    yaml.Should().Contain("controlPlaneEndpoint: 192.168.60.10:6443");
    yaml.Should().Contain("certSANs:");
    yaml.Should().Contain("- 192.168.60.10");
    yaml.Should().Contain("- 192.168.60.11");
}

What this gives you that a hand-rolled kubeadm init script doesn't

A hand-rolled script is sudo kubeadm init --apiserver-advertise-address=$IP --pod-network-cidr=$POD_CIDR ... followed by if [ $? -ne 0 ]; then .... Every flag is a string. Every retry forgets to clean up. Every failure leaves the node in a half-bootstrapped state that the next attempt has to detect and kubeadm reset itself.

A typed KubeadmClient + KubeadmConfigGenerator + KubeadmInitSaga gives you, for the same surface area:

  • Typed config generated from the typed HomeLabConfig
  • kubeadm flags as method parameters instead of shell strings
  • Result<T> from every step with clear error messages
  • Saga compensation that wipes half-baked state on failure
  • Tests that exercise the saga without spawning real VMs

The bargain pays back the first time kubeadm init fails on a preflight check and the saga rolls back cleanly, leaving you with a clean node that the next homelab k8s create can succeed on without manual intervention.

