
Part VI: Build Time -- The Source Generator for CLI Commands

One Roslyn incremental generator, 40+ JSON files in, 172 .g.cs files out -- in under 2 seconds.


From JSON to Typed C#

The design-time pipeline from Part III produced 40+ JSON files -- one per Docker version, each containing the full command tree. Now the Roslyn incremental source generator turns that data into typed C#. This is where the magic happens.

I have written about the BinaryWrapper generator architecture in general terms already -- see BinaryWrapper for the full pattern: how the attribute triggers generation, how AdditionalFiles feed data in, how the incremental pipeline avoids redundant work. This post is the concrete walk-through for Docker CLI. The specific generator, the specific merge algorithm, the specific emitters, the specific output. Not the abstract pattern -- the real code that turns 40 versioned command trees into a typed API.

Here is the full pipeline at a glance:

Diagram
The end-to-end generator pipeline — 40 JSON files funnelled through VersionDiffer into one unified tree, then split across three emitters that together produce the full typed Docker API in a single RegisterSourceOutput callback.

Forty JSON files enter the left side. One hundred seventy-two generated C# files exit the right side. The generator does all of it in a single RegisterSourceOutput callback, and the incremental pipeline ensures it only runs when something actually changes.

Let me walk through every stage.


The Entry Point: BinaryWrapperGenerator

The generator is a standard Roslyn incremental source generator -- a single class implementing IIncrementalGenerator, decorated with [Generator]. The Initialize method sets up three data pipelines and combines them:

[Generator]
public sealed class BinaryWrapperGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // 1. Find [BinaryWrapper] descriptor classes
        var descriptors = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "FrenchExDev.Net.BinaryWrapper.Attributes.BinaryWrapperAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => ExtractDescriptor(ctx))
            .Where(static d => d is not null);

        // 2. Collect AdditionalFiles (JSON help files)
        var jsonFiles = context.AdditionalTextsProvider
            .Where(static f => f.Path.EndsWith(".json", StringComparison.OrdinalIgnoreCase));

        // 3. Combine and generate
        var combined = descriptors.Combine(jsonFiles.Collect());
        context.RegisterSourceOutput(combined, static (spc, source) =>
        {
            var (descriptor, files) = source;
            if (descriptor is null) return;
            Generate(spc, descriptor, files);
        });
    }
}

Three things happen here, and the order matters.

Step 1: Discover Descriptors

ForAttributeWithMetadataName is the Roslyn API that says: "find me every class in the compilation that has this specific attribute applied." The fully qualified name FrenchExDev.Net.BinaryWrapper.Attributes.BinaryWrapperAttribute is the trigger. When the compiler encounters a class like this:

[BinaryWrapper("docker")]
public partial class DockerDescriptor : IBinaryDescriptor
{
    public string BinaryName => "docker";
    public string DisplayName => "Docker CLI";
}

...the ForAttributeWithMetadataName pipeline fires. The predicate confirms the node is a ClassDeclarationSyntax (not a struct, not an interface). The transform calls ExtractDescriptor, which reads the attribute's constructor argument ("docker") and the class's namespace, accessibility, and any configuration properties.

The key detail: ForAttributeWithMetadataName is Roslyn's most targeted discovery API. It does not scan the entire syntax tree. It leverages the compiler's internal attribute index, which means it is essentially free -- even in a solution with thousands of files. The generator only gets invoked when a class with this exact attribute exists.

Step 2: Collect JSON Files

AdditionalTextsProvider gives access to every file listed as <AdditionalFiles> in the .csproj. The Where clause filters to .json files only. These are the scraped command trees -- docker-18.09.9.json, docker-20.10.27.json, docker-27.1.0.json, and so on.

The .csproj includes them with a glob:

<ItemGroup>
    <AdditionalFiles Include="scrape/docker-*.json" />
</ItemGroup>

That one line is what connects the design-time output to the build-time generator. No custom MSBuild targets, no intermediate transforms. The files are there, the generator reads them.

Step 3: Combine and Generate

The Combine + Collect pattern is the standard way to join a single-valued pipeline (one descriptor) with a multi-valued pipeline (many JSON files). jsonFiles.Collect() gathers all matching JSON files into an ImmutableArray<AdditionalText>. descriptors.Combine(...) pairs each descriptor with the full array of JSON files.

The result: for each [BinaryWrapper] descriptor in the compilation, the generator receives the descriptor plus every JSON file. If there are two descriptors -- say, [BinaryWrapper("docker")] and [BinaryWrapper("docker-compose")] -- the callback fires twice, once for each, with the same JSON file array. Each invocation filters the JSON files to its own binary name.

private static void Generate(
    SourceProductionContext spc,
    DescriptorInfo descriptor,
    ImmutableArray<AdditionalText> allJsonFiles)
{
    // Filter JSON files matching this descriptor's binary name
    var matchingFiles = allJsonFiles
        .Where(f => MatchesBinary(f.Path, descriptor.BinaryName))
        .ToList();

    if (matchingFiles.Count == 0) return;

    // Read and deserialize all matching JSON files
    var versionedTrees = matchingFiles
        .Select(f => CommandTreeReader.Read(f))
        .Where(t => t is not null)
        .Select(t => t.Value)
        .ToList();

    // Merge all versioned trees into one unified tree
    var unified = VersionDiffer.Merge(versionedTrees);

    // Emit generated source files
    CommandClassEmitter.Emit(spc, descriptor, unified);
    BuilderClassEmitter.Emit(spc, descriptor, unified);
    ClientClassEmitter.Emit(spc, descriptor, unified);
}

That is the entire generation entry point. Filter, read, merge, emit. Everything downstream is a function of the unified command tree.


JSON Discovery and Matching

The naming convention is the contract between design time and build time. The scraper produces files named {binaryName}-{version}.json. The generator matches them by extracting the binary name prefix:

private static bool MatchesBinary(string filePath, string binaryName)
{
    var fileName = Path.GetFileNameWithoutExtension(filePath);
    // docker-24.0.9 -> prefix "docker"
    // docker-compose-2.30.0 -> prefix "docker-compose"
    var lastDash = fileName.LastIndexOf('-');
    if (lastDash < 0) return false;

    var prefix = fileName[..lastDash];
    return string.Equals(prefix, binaryName, StringComparison.OrdinalIgnoreCase);
}

docker-24.0.9.json matches [BinaryWrapper("docker")]. docker-compose-2.30.0.json matches [BinaryWrapper("docker-compose")]. The LastIndexOf('-') handles the hyphenated binary name correctly -- it splits on the last dash, which separates the binary name from the version.

What happens when no JSON files match? Nothing. No generation, no error, no warning. The project compiles normally, it just has no generated types. This is intentional -- the generator is designed to be inert until data is available. A developer can add [BinaryWrapper("docker")] to a project, forget to include the JSON files, and still compile. The missing types surface as normal compilation errors downstream ("type DockerClient not found"), which point clearly at the missing data.


CommandTreeReader

Each JSON file is a complete command tree for one Docker version. The CommandTreeReader deserializes it into the in-memory model:

public static class CommandTreeReader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true,
        Converters = { new OptionValueKindConverter() }
    };

    public static (SemanticVersion Version, CommandTree Tree)? Read(
        AdditionalText file)
    {
        var text = file.GetText()?.ToString();
        if (string.IsNullOrEmpty(text)) return null;

        var tree = JsonSerializer.Deserialize<CommandTree>(text, Options);
        if (tree is null) return null;

        var fileName = Path.GetFileNameWithoutExtension(file.Path);
        var versionStr = fileName[(fileName.LastIndexOf('-') + 1)..];
        if (!SemanticVersion.TryParse(versionStr, out var version))
            return null;

        return (version, tree);
    }
}

The model is straightforward:

public record CommandTree(
    string BinaryName,
    string Version,
    CommandNode Root);

public record CommandNode(
    string Name,
    string? Description,
    IReadOnlyList<CommandNode> SubCommands,
    IReadOnlyList<CommandOption> Options);

public record CommandOption(
    string LongName,
    string? ShortName,
    string? Description,
    string? DefaultValue,
    OptionValueKind ValueKind);

public enum OptionValueKind
{
    Flag,       // --detach (no value)
    String,     // --name <value>
    Int,        // --memory <bytes>
    StringList, // --env <key=value> (repeatable)
    MapList     // --label <key=value> (repeatable key-value)
}

Each JSON file deserializes into one CommandTree with one CommandNode root and a recursive tree of subcommands and options. The OptionValueKind enum drives the CLR type mapping later -- Flag becomes bool?, String becomes string?, Int becomes int?, StringList becomes IReadOnlyList<string>?, and MapList becomes IReadOnlyDictionary<string, string>?.
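For orientation, here is a hypothetical fragment of what one scraped file could look like. The property names are my assumption from the record definitions above (PropertyNameCaseInsensitive makes camelCase work), and the string "valueKind" values assume that is what OptionValueKindConverter parses -- the post does not show the actual JSON:

```json
{
  "binaryName": "docker",
  "version": "25.0.0",
  "root": {
    "name": "docker",
    "description": null,
    "subCommands": [
      {
        "name": "container",
        "description": "Manage containers",
        "subCommands": [
          {
            "name": "run",
            "description": "Create and run a new container",
            "subCommands": [],
            "options": [
              { "longName": "detach", "shortName": "d",
                "description": "Run container in background and print container ID",
                "defaultValue": null, "valueKind": "Flag" },
              { "longName": "env", "shortName": "e",
                "description": "Set environment variables",
                "defaultValue": null, "valueKind": "StringList" }
            ]
          }
        ],
        "options": []
      }
    ],
    "options": []
  }
}
```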

The reader is deliberately lenient. If a JSON file is malformed, it returns null and the generator skips it. No diagnostics, no build failures. The reasoning: a corrupt JSON file in the scrape/ directory is not a compilation error -- it is a data issue that should be fixed by re-running the scraper, not by failing the build. The generator treats its input as "best available data" and generates from whatever it can parse.


VersionDiffer.Merge() -- The Core Algorithm

This is the heart of the generator. Forty versioned command trees go in. One unified command tree comes out. Every node in the unified tree knows when it appeared and when it disappeared.

The Problem

Each JSON file contains the complete command tree for one Docker version. The trees are structurally similar -- Docker does not reinvent its CLI every release -- but they are not identical. Across 40 versions spanning seven years:

  • Commands are added. docker buildx appeared in Docker 19.03. docker scout appeared in 24.0.
  • Commands are removed (rare but it happens). docker swarm commands were deprecated and some removed.
  • Options are added to existing commands. docker container run --annotation appeared in 25.0.
  • Options are removed from existing commands. docker container run --link was deprecated.
  • Option types change. Some flags that accepted a string started accepting a list.
  • Default values change. Some options changed their default between versions.

The generator needs to produce a single set of C# types that covers all of this. A DockerContainerRunCommand class that has properties for every flag that ever existed, with metadata attributes indicating the version range for each one.

The Algorithm

The merge algorithm is conceptually simple: diff each version against the running unified tree, annotating additions and removals with version numbers.

public static class VersionDiffer
{
    public static UnifiedCommandTree Merge(
        IReadOnlyList<(SemanticVersion Version, CommandTree Tree)> versioned)
    {
        if (versioned.Count == 0)
            throw new ArgumentException("At least one version required");

        var sorted = versioned
            .OrderBy(v => v.Version)
            .ToList();

        // Start with the oldest version as the baseline
        var unified = UnifiedCommandTree.FromSingle(
            sorted[0].Version, sorted[0].Tree);

        // Merge each subsequent version into the unified tree
        for (int i = 1; i < sorted.Count; i++)
        {
            var (version, tree) = sorted[i];
            var previousVersion = sorted[i - 1].Version;
            MergeNode(unified.Root, tree.Root, version, previousVersion);
        }

        return unified;
    }
}

The oldest version is the baseline. Every command and option in Docker 18.09 gets SinceVersion = null -- null means "present since the earliest tracked version." Then for each subsequent version, in order, the algorithm walks the two trees in parallel:

private static void MergeNode(
    UnifiedCommandNode unified,
    CommandNode current,
    SemanticVersion version,
    SemanticVersion previousVersion)
{
    // Index the current version's subcommands by name
    var currentSubCommands = current.SubCommands
        .ToDictionary(c => c.Name, StringComparer.OrdinalIgnoreCase);

    // Walk existing unified subcommands
    foreach (var existing in unified.SubCommands.ToList())
    {
        if (currentSubCommands.TryGetValue(existing.Name, out var match))
        {
            // Command still exists -- recurse
            MergeNode(existing, match, version, previousVersion);
            currentSubCommands.Remove(existing.Name);
        }
        else if (existing.UntilVersion is null)
        {
            // Command missing in this version -- mark as removed
            existing.UntilVersion = previousVersion;
        }
    }

    // Add commands that are new in this version
    foreach (var (name, newCmd) in currentSubCommands)
    {
        unified.SubCommands.Add(
            UnifiedCommandNode.FromNew(newCmd, sinceVersion: version));
    }

    // Same logic for options
    MergeOptions(unified.Options, current.Options, version, previousVersion);
}

The option merge follows the same pattern:

private static void MergeOptions(
    List<UnifiedOption> unifiedOptions,
    IReadOnlyList<CommandOption> currentOptions,
    SemanticVersion version,
    SemanticVersion previousVersion)
{
    var currentByName = currentOptions
        .ToDictionary(o => o.LongName, StringComparer.OrdinalIgnoreCase);

    foreach (var existing in unifiedOptions)
    {
        if (currentByName.TryGetValue(existing.LongName, out var match))
        {
            // Option still exists -- check for type changes
            if (existing.ClrType != MapClrType(match.ValueKind))
            {
                existing.TypeChangedInVersion = version;
                existing.ClrType = MapClrType(match.ValueKind);
            }
            currentByName.Remove(existing.LongName);
        }
        else if (existing.UntilVersion is null)
        {
            existing.UntilVersion = previousVersion;
        }
    }

    foreach (var (name, newOpt) in currentByName)
    {
        unifiedOptions.Add(UnifiedOption.FromNew(newOpt, sinceVersion: version));
    }
}

Four scenarios for each node:

  1. Present in both trees -- recurse into subcommands, check options for changes. No version annotation needed.
  2. Present in unified, missing in current -- the command or option was removed. Set UntilVersion to the previous version (the last version where it existed).
  3. Missing in unified, present in current -- the command or option is new. Add it with SinceVersion set to the current version.
  4. Type changed -- the option exists in both but its value kind changed. Record the version where the type changed (this is rare but needs handling).

A Concrete Example

Here is what the merge looks like for docker container run across three versions:

v20.10.0:  docker container run  -- 48 flags
v23.0.0:   docker container run  -- 50 flags (+--platform, +--pull)
v25.0.0:   docker container run  -- 50 flags (+--annotation, --link removed)

After merging:

Unified:   docker container run  -- 51 flags total
  --detach          SinceVersion: null       UntilVersion: null
  --name            SinceVersion: null       UntilVersion: null
  --platform        SinceVersion: 23.0.0     UntilVersion: null
  --pull            SinceVersion: 23.0.0     UntilVersion: null
  --annotation      SinceVersion: 25.0.0     UntilVersion: null
  --link            SinceVersion: null       UntilVersion: 23.0.0
  ... 45 more flags with SinceVersion: null, UntilVersion: null

The --link flag has UntilVersion: 23.0.0 -- it was still present in the v23.0 tree and absent in the v25.0 tree, so the last tracked version where it is known to have existed is v23.0 (the previous version at the point the removal was detected). The --platform and --pull flags have SinceVersion: 23.0.0 because they first appear in that version's tree.

Diagram
VersionDiffer.Merge at work on docker container run across three versions — each added flag gets a SinceVersion, each removed flag an UntilVersion, producing one unified node that drives every downstream emitter.

Edge Cases

The merge is not always clean. Here are the cases that required real engineering:

Re-added commands. Docker removed some experimental commands and then re-added them in later versions. The algorithm handles this by checking if a node with UntilVersion set reappears -- if so, it clears UntilVersion and adds a ReintroducedInVersion annotation. This is rare (it happened exactly twice across 40 Docker versions) but must be handled correctly or the generated code would have stale deprecation warnings.

Renamed options. Docker occasionally renames a flag. --memory-swappiness never actually changed, but some less common flags received spelling corrections. The algorithm treats a rename as a removal plus an addition -- two separate options, each with its own version range. The generated code will have both properties, which is correct: old code targeting old Docker versions uses the old property name, new code uses the new one.

Inconsistent descriptions. The help text for the same flag sometimes changes between versions. The merge takes the description from the latest version that has the option. This is a pragmatic choice -- the latest description is usually the most accurate.
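The post does not show the code for these edge cases, but a sketch of how the reintroduction check and the description refresh could slot into the matched branch of MergeNode might look like this (hypothetical placement; the names follow the classes shown above):

```csharp
if (currentSubCommands.TryGetValue(existing.Name, out var match))
{
    // Re-added command: a node previously marked removed reappears,
    // so clear the removal marker and record where it came back.
    if (existing.UntilVersion is not null)
    {
        existing.UntilVersion = null;
        existing.ReintroducedInVersion = version;
    }

    // Inconsistent descriptions: always prefer the latest version's help text.
    existing.Description = match.Description ?? existing.Description;

    MergeNode(existing, match, version, previousVersion);
    currentSubCommands.Remove(existing.Name);
}
```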


The Unified Command Tree Model

The output of VersionDiffer.Merge() is an UnifiedCommandTree -- a tree of UnifiedCommandNode objects, each with optional version annotations:

public class UnifiedCommandTree
{
    public string BinaryName { get; set; }
    public SemanticVersion OldestVersion { get; set; }
    public SemanticVersion NewestVersion { get; set; }
    public int VersionCount { get; set; }
    public UnifiedCommandNode Root { get; set; }

    public static UnifiedCommandTree FromSingle(
        SemanticVersion version, CommandTree tree)
    {
        return new UnifiedCommandTree
        {
            BinaryName = tree.BinaryName,
            OldestVersion = version,
            NewestVersion = version,
            VersionCount = 1,
            Root = UnifiedCommandNode.FromBaseline(tree.Root)
        };
    }
}

public class UnifiedCommandNode
{
    public string Name { get; set; }
    public string? Description { get; set; }
    public SemanticVersion? SinceVersion { get; set; }
    public SemanticVersion? UntilVersion { get; set; }
    public SemanticVersion? ReintroducedInVersion { get; set; }
    public List<UnifiedCommandNode> SubCommands { get; set; } = new();
    public List<UnifiedOption> Options { get; set; } = new();

    public bool IsLeafCommand => SubCommands.Count == 0;
    public bool IsDeprecated => UntilVersion is not null;

    public static UnifiedCommandNode FromNew(
        CommandNode node, SemanticVersion sinceVersion)
    {
        var unified = new UnifiedCommandNode
        {
            Name = node.Name,
            Description = node.Description,
            SinceVersion = sinceVersion
        };

        foreach (var sub in node.SubCommands)
            unified.SubCommands.Add(FromNew(sub, sinceVersion));

        foreach (var opt in node.Options)
            unified.Options.Add(UnifiedOption.FromNew(opt, sinceVersion));

        return unified;
    }

    public static UnifiedCommandNode FromBaseline(CommandNode node)
    {
        var unified = new UnifiedCommandNode
        {
            Name = node.Name,
            Description = node.Description,
            SinceVersion = null // baseline: no "since" annotation
        };

        foreach (var sub in node.SubCommands)
            unified.SubCommands.Add(FromBaseline(sub));

        foreach (var opt in node.Options)
            unified.Options.Add(UnifiedOption.FromBaseline(opt));

        return unified;
    }
}

public class UnifiedOption
{
    public string LongName { get; set; }
    public string? ShortName { get; set; }
    public string? Description { get; set; }
    public string ClrType { get; set; }
    public OptionValueKind ValueKind { get; set; }
    public string? DefaultValue { get; set; }
    public SemanticVersion? SinceVersion { get; set; }
    public SemanticVersion? UntilVersion { get; set; }
    public SemanticVersion? TypeChangedInVersion { get; set; }

    public bool IsDeprecated => UntilVersion is not null;

    public string PropertyName => NamingConventions.ToPascalCase(LongName);

    public static UnifiedOption FromNew(
        CommandOption opt, SemanticVersion sinceVersion) => new()
    {
        LongName = opt.LongName,
        ShortName = opt.ShortName,
        Description = opt.Description,
        ClrType = MapClrType(opt.ValueKind),
        ValueKind = opt.ValueKind,
        DefaultValue = opt.DefaultValue,
        SinceVersion = sinceVersion
    };

    public static UnifiedOption FromBaseline(CommandOption opt) => new()
    {
        LongName = opt.LongName,
        ShortName = opt.ShortName,
        Description = opt.Description,
        ClrType = MapClrType(opt.ValueKind),
        ValueKind = opt.ValueKind,
        DefaultValue = opt.DefaultValue,
        SinceVersion = null
    };

    private static string MapClrType(OptionValueKind kind) => kind switch
    {
        OptionValueKind.Flag => "bool?",
        OptionValueKind.String => "string?",
        OptionValueKind.Int => "int?",
        OptionValueKind.StringList => "IReadOnlyList<string>?",
        OptionValueKind.MapList => "IReadOnlyDictionary<string, string>?",
        _ => "string?"
    };
}

The model is mutable by design. The merge algorithm needs to update UntilVersion on existing nodes as it walks through versions. Immutability would require rebuilding the entire tree for every version diff, which is wasteful. The tree is built once, mutated during the merge, and then consumed read-only by the emitters. Nobody outside VersionDiffer ever modifies it.


Three Emitters

The unified tree is the input to three emitters. Each emitter walks the tree and produces generated source files via SourceProductionContext.AddSource(). They share no state, so they could run in parallel -- but they run sequentially, because parallelism would complicate error handling without meaningful performance gain: the bottleneck is Roslyn's own source-output tracking, not the emitters' string building.

Diagram
The class shapes the three emitters produce — commands implement ICliCommand, builders derive from AbstractBuilder, and the DockerClient exposes nested groups that hand out typed command builders.

Emitter 1: CommandClassEmitter

The CommandClassEmitter generates one sealed class per leaf command. A leaf command is a command node with no subcommands -- docker container run is a leaf, docker container is not (it has subcommands like run, ls, stop, etc.).

The emitter walks the unified tree, collects all leaf nodes, and for each one produces a .g.cs file containing a sealed class that implements ICliCommand. Every option on the command becomes a property. Every property has an XML doc comment. Options with SinceVersion get a [SinceVersion] attribute. Options with UntilVersion get an [Obsolete] attribute with a message explaining when the option was removed.
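GetLeafCommands() itself is not shown in the post. A plausible sketch is a recursive traversal that yields every node with no subcommands; how each leaf's CommandPath is tracked (stored on the node or threaded through the walk) is not shown either, so this version just yields the nodes:

```csharp
// Hypothetical sketch of the leaf-collection helper used by the emitters.
// Recursively yields every descendant with no subcommands; the root node
// itself (e.g. "docker") is never a leaf command.
public static IEnumerable<UnifiedCommandNode> GetLeafCommands(
    this UnifiedCommandNode root)
{
    foreach (var sub in root.SubCommands)
    {
        if (sub.IsLeafCommand)
            yield return sub;
        else
            foreach (var leaf in sub.GetLeafCommands())
                yield return leaf;
    }
}
```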

Here is a real excerpt from the generated DockerContainerRunCommand.g.cs:

// <auto-generated/>
#nullable enable

namespace FrenchExDev.Net.Docker.Commands;

/// <summary>
/// Create and run a new container.
/// </summary>
/// <remarks>
/// CLI equivalent: <c>docker container run [OPTIONS] IMAGE [COMMAND] [ARG...]</c>
/// </remarks>
public sealed partial class DockerContainerRunCommand : ICliCommand
{
    /// <summary>Run container in background and print container ID.</summary>
    public bool? Detach { get; init; }

    /// <summary>Assign a name to the container.</summary>
    public string? Name { get; init; }

    /// <summary>Publish a container's port(s) to the host.</summary>
    public IReadOnlyList<string>? Publish { get; init; }

    /// <summary>Set environment variables.</summary>
    public IReadOnlyList<string>? Env { get; init; }

    /// <summary>Bind mount a volume.</summary>
    public IReadOnlyList<string>? Volume { get; init; }

    /// <summary>Set platform if server is multi-platform capable.</summary>
    [SinceVersion("19.03.0")]
    public string? Platform { get; init; }

    /// <summary>Pull image before running ("always", "missing", "never").</summary>
    [SinceVersion("23.0.0")]
    public string? Pull { get; init; }

    /// <summary>Add an annotation to the container (passed through to the OCI runtime).</summary>
    [SinceVersion("25.0.0")]
    public IReadOnlyList<string>? Annotation { get; init; }

    /// <summary>Add link to another container.</summary>
    [Obsolete("Removed after Docker 23.0.0. Use user-defined networks instead.")]
    [UntilVersion("23.0.0")]
    public IReadOnlyList<string>? Link { get; init; }

    /// <summary>Restart policy to apply when a container exits.</summary>
    public string? Restart { get; init; }

    /// <summary>Memory limit.</summary>
    public string? Memory { get; init; }

    /// <summary>Number of CPUs.</summary>
    public string? Cpus { get; init; }

    /// <summary>Working directory inside the container.</summary>
    public string? Workdir { get; init; }

    /// <summary>Override the key sequence for detaching a container.</summary>
    public string? DetachKeys { get; init; }

    /// <summary>Add a custom host-to-IP mapping (host:ip).</summary>
    public IReadOnlyList<string>? AddHost { get; init; }

    // ... 38 more properties omitted for brevity

    /// <summary>The command path segments for this command.</summary>
    public IReadOnlyList<string> CommandPath => ["container", "run"];

    /// <summary>
    /// Serializes all set properties into CLI argument strings.
    /// </summary>
    public IReadOnlyList<string> ToArguments()
    {
        var args = new List<string>();

        // Flags (bool properties)
        if (Detach == true) args.Add("--detach");

        // String properties
        if (Name is not null) { args.Add("--name"); args.Add(Name); }
        if (Platform is not null) { args.Add("--platform"); args.Add(Platform); }
        if (Pull is not null) { args.Add("--pull"); args.Add(Pull); }
        if (Restart is not null) { args.Add("--restart"); args.Add(Restart); }
        if (Memory is not null) { args.Add("--memory"); args.Add(Memory); }
        if (Cpus is not null) { args.Add("--cpus"); args.Add(Cpus); }
        if (Workdir is not null) { args.Add("--workdir"); args.Add(Workdir); }
        if (DetachKeys is not null) { args.Add("--detach-keys"); args.Add(DetachKeys); }

        // List properties (each value emits the flag again)
        if (Publish is not null)
        {
            foreach (var item in Publish)
            {
                args.Add("--publish");
                args.Add(item);
            }
        }
        if (Env is not null)
        {
            foreach (var item in Env)
            {
                args.Add("--env");
                args.Add(item);
            }
        }
        if (Volume is not null)
        {
            foreach (var item in Volume)
            {
                args.Add("--volume");
                args.Add(item);
            }
        }
        if (Annotation is not null)
        {
            foreach (var item in Annotation)
            {
                args.Add("--annotation");
                args.Add(item);
            }
        }
        if (Link is not null)
        {
            foreach (var item in Link)
            {
                args.Add("--link");
                args.Add(item);
            }
        }
        if (AddHost is not null)
        {
            foreach (var item in AddHost)
            {
                args.Add("--add-host");
                args.Add(item);
            }
        }

        // ... remaining properties follow the same pattern

        return args;
    }
}

A few things to notice:

The class is sealed partial. Sealed because there is no reason to inherit from a generated command class -- the properties are the contract, and adding more properties via inheritance would break ToArguments(). Partial because the developer can add their own methods, computed properties, or convenience overloads in a hand-written partial file.

Every property is nullable and uses init setters. The command object is a value bag -- you set the properties you want, the rest stay null. The ToArguments() method only serializes non-null properties. This models the CLI correctly: every flag is optional, and omitting a flag means "use the default."

The ToArguments() method handles four patterns. Flags emit just the flag name. String/int properties emit the flag name and the value as separate arguments. List properties emit the flag name once per item. Map properties emit the flag name once per key-value pair. The generated code is verbose but correct -- no runtime reflection, no attribute scanning, no string interpolation.

Deprecated options get [Obsolete] with a message. The message includes the version where the option was removed and, where possible, the recommended replacement. This is a real compiler warning -- code that uses --link will get a squiggly line in the IDE.
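The excerpt above happens to contain no MapList property. Under the stated CLR mapping, the emitted pattern for a map option would plausibly mirror the list case with the key-value pair flattened -- a sketch under that assumption, using --label as a hypothetical example:

```csharp
// Hypothetical MapList emission for a property like
// public IReadOnlyDictionary<string, string>? Label { get; init; }
if (Label is not null)
{
    foreach (var (key, value) in Label)
    {
        args.Add("--label");
        args.Add($"{key}={value}");
    }
}
```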

The emitter code itself is a StringBuilder loop. No template engine, no T4, no Scriban. Just string appending in a well-organized method:

public static class CommandClassEmitter
{
    public static void Emit(
        SourceProductionContext spc,
        DescriptorInfo descriptor,
        UnifiedCommandTree tree)
    {
        foreach (var leaf in tree.Root.GetLeafCommands())
        {
            var className = NamingConventions.ToClassName(
                descriptor.DisplayName, leaf.CommandPath);
            // e.g., "DockerContainerRunCommand"

            var sb = new StringBuilder();
            sb.AppendLine("// <auto-generated/>");
            sb.AppendLine("#nullable enable");
            sb.AppendLine();
            sb.AppendLine($"namespace {descriptor.Namespace}.Commands;");
            sb.AppendLine();

            EmitXmlDoc(sb, leaf);
            sb.AppendLine($"public sealed partial class {className} : ICliCommand");
            sb.AppendLine("{");

            foreach (var option in leaf.Options)
            {
                EmitPropertyXmlDoc(sb, option);
                EmitVersionAttributes(sb, option);
                sb.AppendLine(
                    $"    public {option.ClrType} {option.PropertyName} " +
                    $"{{ get; init; }}");
                sb.AppendLine();
            }

            EmitCommandPath(sb, leaf);
            EmitToArguments(sb, leaf);

            sb.AppendLine("}");

            spc.AddSource(
                $"{className}.g.cs",
                SourceText.From(sb.ToString(), Encoding.UTF8));
        }
    }
}

Eighty-five leaf commands across Docker CLI produce eighty-five generated command classes. docker container run, docker container ls, docker container stop, docker image build, docker image pull, docker network create -- every leaf in the tree.

Emitter 2: BuilderClassEmitter

For every command class, the BuilderClassEmitter produces a matching builder class. The builder provides a fluent API with With*() methods and integrates the runtime version guard for version-sensitive options.

For the full AbstractBuilder<T> pattern and how Build() returns a Result<T>, see Builder Pattern. Here I will focus on what the generator produces.

Generated excerpt from DockerContainerRunCommandBuilder.g.cs:

// <auto-generated/>
#nullable enable

namespace FrenchExDev.Net.Docker.Builders;

/// <summary>
/// Fluent builder for <see cref="DockerContainerRunCommand"/>.
/// </summary>
public sealed partial class DockerContainerRunCommandBuilder
    : AbstractBuilder<DockerContainerRunCommand>
{
    private readonly SemanticVersion? _detectedVersion;

    public DockerContainerRunCommandBuilder(
        SemanticVersion? detectedVersion = null)
    {
        _detectedVersion = detectedVersion;
    }

    // --- Backing fields (one per option) ---
    public bool? Detach { get; set; }
    public string? Name { get; set; }
    public IReadOnlyList<string>? Publish { get; set; }
    public IReadOnlyList<string>? Env { get; set; }
    public IReadOnlyList<string>? Volume { get; set; }
    public string? Platform { get; set; }
    public string? Pull { get; set; }
    public IReadOnlyList<string>? Annotation { get; set; }
    public IReadOnlyList<string>? Link { get; set; }
    public string? Restart { get; set; }
    public string? Memory { get; set; }

    // ... remaining properties

    // --- Fluent setters ---

    /// <summary>Run container in background and print container ID.</summary>
    public DockerContainerRunCommandBuilder WithDetach(bool value = true)
    {
        Detach = value;
        return this;
    }

    /// <summary>Assign a name to the container.</summary>
    public DockerContainerRunCommandBuilder WithName(string value)
    {
        Name = value;
        return this;
    }

    /// <summary>Publish a container's port(s) to the host.</summary>
    public DockerContainerRunCommandBuilder WithPublish(
        params string[] values)
    {
        Publish = values.ToList();
        return this;
    }

    /// <summary>Set environment variables.</summary>
    public DockerContainerRunCommandBuilder WithEnv(
        params string[] values)
    {
        Env = values.ToList();
        return this;
    }

    /// <summary>Bind mount a volume.</summary>
    public DockerContainerRunCommandBuilder WithVolume(
        params string[] values)
    {
        Volume = values.ToList();
        return this;
    }

    /// <summary>Set platform if server is multi-platform capable.</summary>
    [SinceVersion("19.03.0")]
    public DockerContainerRunCommandBuilder WithPlatform(string value)
    {
        VersionGuard.EnsureOptionSupported(
            _detectedVersion,
            commandPath: "container.run",
            optionName: "platform",
            sinceVersion: new SemanticVersion(19, 3, 0),
            untilVersion: null);
        Platform = value;
        return this;
    }

    /// <summary>Pull image before running.</summary>
    [SinceVersion("23.0.0")]
    public DockerContainerRunCommandBuilder WithPull(string value)
    {
        VersionGuard.EnsureOptionSupported(
            _detectedVersion,
            commandPath: "container.run",
            optionName: "pull",
            sinceVersion: new SemanticVersion(23, 0, 0),
            untilVersion: null);
        Pull = value;
        return this;
    }

    /// <summary>Add an annotation to the container.</summary>
    [SinceVersion("25.0.0")]
    public DockerContainerRunCommandBuilder WithAnnotation(
        params string[] values)
    {
        VersionGuard.EnsureOptionSupported(
            _detectedVersion,
            commandPath: "container.run",
            optionName: "annotation",
            sinceVersion: new SemanticVersion(25, 0, 0),
            untilVersion: null);
        Annotation = values.ToList();
        return this;
    }

    /// <summary>Add link to another container.</summary>
    [Obsolete("Removed in Docker 23.0.0. Use user-defined networks.")]
    [UntilVersion("23.0.0")]
    public DockerContainerRunCommandBuilder WithLink(
        params string[] values)
    {
        VersionGuard.EnsureOptionSupported(
            _detectedVersion,
            commandPath: "container.run",
            optionName: "link",
            sinceVersion: null,
            untilVersion: new SemanticVersion(23, 0, 0));
        Link = values.ToList();
        return this;
    }

    // ... remaining With methods

    // --- Build ---

    protected override DockerContainerRunCommand CreateInstance() => new()
    {
        Detach = Detach,
        Name = Name,
        Publish = Publish,
        Env = Env,
        Volume = Volume,
        Platform = Platform,
        Pull = Pull,
        Annotation = Annotation,
        Link = Link,
        Restart = Restart,
        Memory = Memory,
        // ... remaining properties
    };
}

The critical detail is VersionGuard.EnsureOptionSupported(). This is the runtime version check. When the client detects the installed Docker version (via docker version --format '{{.Client.Version}}'), it passes that version to the builder constructor. Every With*() method on a version-sensitive option calls the guard. If you are running Docker 20.10 and you call WithPlatform(...), the guard sees that --platform requires 19.03.0 and lets it through. If you call WithAnnotation(...), the guard sees that --annotation requires 25.0.0 and throws an UnsupportedOptionException with a clear message: "Option 'annotation' on 'container.run' requires Docker 25.0.0 or later, but detected version is 20.10.27."

This is the version safety that Part I was missing. The guard is generated -- the version numbers are baked into the generated code from the unified tree. No JSON parsing at runtime, no configuration files, no reflection. The version check is a direct comparison of two SemanticVersion structs.

Options without SinceVersion or UntilVersion -- the ones that have been present in every tracked version -- do not get a guard call. No runtime cost for stable options.
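A minimal sketch of what such a guard might look like, under assumptions: the real VersionGuard lives in the runtime layer and its exact signature is not shown in this post, System.Version stands in for the library's SemanticVersion, and InvalidOperationException stands in for the real UnsupportedOptionException:

```csharp
using System;

// Minimal sketch of the runtime guard; names, exception type, and
// message format are assumptions, not the library's actual code.
public static class VersionGuardSketch
{
    public static void EnsureOptionSupported(
        Version? detectedVersion,
        string commandPath,
        string optionName,
        Version? sinceVersion,
        Version? untilVersion)
    {
        // No detected version: nothing to compare against, let it through.
        if (detectedVersion is null) return;

        if (sinceVersion is not null && detectedVersion < sinceVersion)
            throw new InvalidOperationException(
                $"Option '{optionName}' on '{commandPath}' requires " +
                $"{sinceVersion} or later, but detected version is {detectedVersion}.");

        if (untilVersion is not null && detectedVersion >= untilVersion)
            throw new InvalidOperationException(
                $"Option '{optionName}' on '{commandPath}' was removed in " +
                $"{untilVersion}, but detected version is {detectedVersion}.");
    }
}
```

The whole check is two nullable comparisons, which is why stable options can skip it entirely at zero cost.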

Emitter 3: ClientClassEmitter

The ClientClassEmitter produces the top-level entry point: the typed client that developers actually use. It generates a static factory class, the client class itself, and nested group classes that mirror the Docker command hierarchy.

Generated excerpt from DockerClient.g.cs:

// <auto-generated/>
#nullable enable

namespace FrenchExDev.Net.Docker;

/// <summary>
/// Creates a new <see cref="DockerClient"/> bound to a Docker binary.
/// </summary>
public static class Docker
{
    /// <summary>
    /// Creates a new Docker client using the specified binary binding.
    /// </summary>
    public static DockerClient Create(BinaryBinding binding) => new(binding);

    /// <summary>
    /// Creates a new Docker client that discovers the Docker binary on PATH.
    /// </summary>
    public static DockerClient Create() => new(BinaryBinding.Discover("docker"));
}

/// <summary>
/// Typed client for the Docker CLI.
/// Generated from 40 versions (18.09.9 to 27.1.0).
/// </summary>
public partial class DockerClient : IDisposable
{
    private readonly BinaryBinding _binding;

    internal DockerClient(BinaryBinding binding) => _binding = binding;

    /// <summary>Manage containers.</summary>
    public DockerContainerGroup Container => new(_binding);

    /// <summary>Manage images.</summary>
    public DockerImageGroup Image => new(_binding);

    /// <summary>Manage networks.</summary>
    public DockerNetworkGroup Network => new(_binding);

    /// <summary>Manage volumes.</summary>
    public DockerVolumeGroup Volume => new(_binding);

    /// <summary>Manage Docker contexts.</summary>
    public DockerContextGroup Context => new(_binding);

    /// <summary>Manage plugins.</summary>
    public DockerPluginGroup Plugin => new(_binding);

    /// <summary>Manage Swarm.</summary>
    [UntilVersion("25.0.0")]
    public DockerSwarmGroup Swarm => new(_binding);

    /// <summary>Docker Scout (image analysis).</summary>
    [SinceVersion("24.0.0")]
    public DockerScoutGroup Scout => new(_binding);

    /// <summary>Manage builds (BuildKit).</summary>
    [SinceVersion("19.03.0")]
    public DockerBuildxGroup Buildx => new(_binding);

    // ... more groups

    public void Dispose() => _binding.Dispose();
}

Each group class contains methods for every leaf command in that group:

/// <summary>
/// Container management commands.
/// </summary>
public partial class DockerContainerGroup
{
    private readonly BinaryBinding _binding;

    internal DockerContainerGroup(BinaryBinding binding) => _binding = binding;

    /// <summary>Create and run a new container.</summary>
    public DockerContainerRunCommand Run(
        Action<DockerContainerRunCommandBuilder> configure)
    {
        var builder = new DockerContainerRunCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>List containers.</summary>
    public DockerContainerLsCommand Ls(
        Action<DockerContainerLsCommandBuilder> configure)
    {
        var builder = new DockerContainerLsCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Stop one or more running containers.</summary>
    public DockerContainerStopCommand Stop(
        Action<DockerContainerStopCommandBuilder> configure)
    {
        var builder = new DockerContainerStopCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Remove one or more containers.</summary>
    public DockerContainerRmCommand Rm(
        Action<DockerContainerRmCommandBuilder> configure)
    {
        var builder = new DockerContainerRmCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Execute a command in a running container.</summary>
    public DockerContainerExecCommand Exec(
        Action<DockerContainerExecCommandBuilder> configure)
    {
        var builder = new DockerContainerExecCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Fetch the logs of a container.</summary>
    public DockerContainerLogsCommand Logs(
        Action<DockerContainerLogsCommandBuilder> configure)
    {
        var builder = new DockerContainerLogsCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Display a live stream of container(s) resource usage.</summary>
    public DockerContainerStatsCommand Stats(
        Action<DockerContainerStatsCommandBuilder> configure)
    {
        var builder = new DockerContainerStatsCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    /// <summary>Return low-level information on Docker objects.</summary>
    public DockerContainerInspectCommand Inspect(
        Action<DockerContainerInspectCommandBuilder> configure)
    {
        var builder = new DockerContainerInspectCommandBuilder(
            _binding.DetectedVersion);
        configure(builder);
        return builder.Build().ValueOrThrow();
    }

    // ... remaining container commands (cp, diff, export, kill,
    //     pause, port, rename, restart, top, unpause, update, wait)
}

The pattern is the same for every method: create a builder (passing the detected version), let the caller configure it, build the command, return it. The ValueOrThrow() call unwraps the Result<T> -- if validation failed (e.g., version guard rejection), it throws with a descriptive message. There is also a Build() variant that returns the Result<T> for callers who prefer to handle errors without exceptions.
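To make the two call styles concrete, here is a hypothetical, minimal Result<T> shape. The real Result<T> API is described in the Builder Pattern post and may well differ; this sketch only illustrates the throwing-unwrap versus explicit-handling split:

```csharp
using System;

// Hypothetical, minimal Result<T>; the library's real type may differ.
public readonly struct Result<T>
{
    public bool IsSuccess { get; }
    public T? Value { get; }
    public string? Error { get; }

    private Result(bool ok, T? value, string? error)
    {
        IsSuccess = ok;
        Value = value;
        Error = error;
    }

    public static Result<T> Ok(T value) => new(true, value, null);
    public static Result<T> Fail(string error) => new(false, default, error);

    // Throwing unwrap, as used inside the generated group methods.
    public T ValueOrThrow()
        => IsSuccess ? Value! : throw new InvalidOperationException(Error);
}
```

With a shape like this, exception-averse callers branch on IsSuccess and read Error, while the generated group methods simply call ValueOrThrow().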

Notice that the detected version flows through the entire stack. BinaryBinding.Discover("docker") runs docker version, parses the output, and stores the SemanticVersion. That version is passed to the group, which passes it to the builder, which passes it to every VersionGuard.EnsureOptionSupported() call. One version detection, propagated everywhere. No global state, no ambient context, no service locator.

The client emitter also handles nested groups for deeply nested command paths. docker buildx imagetools create generates a DockerBuildxGroup with a DockerBuildxImagetoolsGroup inside it, which has a Create() method. The nesting mirrors the CLI hierarchy exactly.


Naming Conventions

The generator translates CLI names to C# identifiers using a deterministic naming convention. This is not cosmetic -- it is load-bearing, because the names must be predictable so that developers can guess them without consulting documentation.

public static class NamingConventions
{
    /// <summary>
    /// Converts a CLI option name to a PascalCase property name.
    /// --memory-swap -> MemorySwap
    /// --detach-keys -> DetachKeys
    /// --add-host    -> AddHost
    /// </summary>
    public static string ToPascalCase(string cliName)
    {
        var clean = cliName.TrimStart('-');
        var parts = clean.Split('-', '_')
            .Where(p => p.Length > 0); // guard against consecutive separators
        return string.Concat(parts.Select(p =>
            char.ToUpperInvariant(p[0]) + p[1..]));
    }

    /// <summary>
    /// Converts a command path to a class name.
    /// ["container", "run"] + "Docker" -> DockerContainerRunCommand
    /// </summary>
    public static string ToClassName(
        string prefix, IReadOnlyList<string> path)
    {
        var segments = path.Select(p => ToPascalCase(p));
        return $"{prefix}{string.Concat(segments)}Command";
    }

    /// <summary>
    /// Converts a command group path to a group class name.
    /// ["container"] + "Docker" -> DockerContainerGroup
    /// </summary>
    public static string ToGroupName(
        string prefix, IReadOnlyList<string> path)
    {
        var segments = path.Select(p => ToPascalCase(p));
        return $"{prefix}{string.Concat(segments)}Group";
    }
}

The rules are simple: strip leading dashes, split on dashes and underscores, capitalize each segment, concatenate. --memory-swap becomes MemorySwap. --detach-keys becomes DetachKeys. docker container run becomes DockerContainerRunCommand. Predictable, reversible, and collision-free for the Docker CLI's naming patterns.

There is one edge case worth mentioning: Docker has a few options with digits in the name (--ip6tables, for example). The generator handles these correctly because ToPascalCase preserves digits -- ip6tables becomes Ip6tables, which is slightly ugly but unambiguous. I considered special-casing these but decided against it: consistency beats aesthetics when the goal is predictability.


Incremental Generator Caching

I keep saying "incremental" -- let me explain why that matters and how it works.

A Roslyn source generator has two options: implement ISourceGenerator (the original API from .NET 5) or implement IIncrementalGenerator (the newer API from .NET 6). The original API re-runs the entire generator on every compilation -- every keystroke in the IDE, every file save, every build triggers full regeneration. For a generator that produces 172 files from 40 JSON inputs, that would be catastrophic. Every character typed in any .cs file would trigger a 1.8-second regeneration. The IDE would be unusable.

The incremental API changes this. It models the generator as a pipeline of transformations, and Roslyn tracks the inputs to each stage. If the inputs have not changed, the outputs are reused from cache. The critical question is: what counts as "changed"?

Diagram
The incremental cache invalidation contract — only descriptor edits or JSON input changes trigger regeneration, which is why editing any other .cs file doesn't hit the 1.8-second cold path and the IDE stays responsive.

For this generator, the inputs are:

  1. The [BinaryWrapper] attribute on the descriptor class. If someone changes the binary name, the namespace, or any configuration property, the generator needs to re-run.
  2. The AdditionalFiles JSON files. If a JSON file is added, removed, or modified, the generator needs to re-run.

That is it. Editing any other .cs file in the project -- adding a method, fixing a bug, writing a test -- does not trigger regeneration. The generated files are cached and reused. This is not a minor optimization -- it is the difference between "IDE feels normal" and "IDE lags on every keystroke."

How the caching works internally

The ForAttributeWithMetadataName call returns an IncrementalValuesProvider<DescriptorInfo>. Roslyn compares the previous DescriptorInfo with the current one using Equals(). If they match, the downstream pipeline is not triggered.

The AdditionalTextsProvider returns an IncrementalValuesProvider<AdditionalText>. Roslyn compares the file paths and content hashes. If no JSON files changed, the Collect() call returns the same ImmutableArray<AdditionalText> and the downstream pipeline is not triggered.

The Combine call merges these two signals. Both must be unchanged for the cache to hold. If either changes, the full generation pipeline runs.
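Put together, the wiring in Initialize looks roughly like this. It is a simplified sketch of the two inputs discussed here: the attribute's metadata name and the ExtractDescriptor/Generate helpers are assumptions, and it requires the Microsoft.CodeAnalysis package to compile:

```csharp
// Simplified sketch of the pipeline wiring; the attribute metadata name
// and the ExtractDescriptor/Generate helpers are assumptions.
public void Initialize(IncrementalGeneratorInitializationContext context)
{
    // Input 1: the descriptor class carrying the [BinaryWrapper] attribute.
    var descriptors = context.SyntaxProvider.ForAttributeWithMetadataName(
        "BinaryWrapperAttribute",                        // assumed metadata name
        predicate: static (node, _) => true,
        transform: static (ctx, _) => ExtractDescriptor(ctx));

    // Input 2: the versioned JSON files, collected into one array.
    var jsonFiles = context.AdditionalTextsProvider
        .Where(static f => f.Path.EndsWith(".json"))
        .Collect();

    // Combine the two signals: the callback below re-runs only when
    // either the descriptor or a JSON file actually changes.
    context.RegisterSourceOutput(
        descriptors.Combine(jsonFiles),
        static (spc, pair) => Generate(spc, pair.Left, pair.Right));
}
```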

I measured this on a real project. The numbers:

First build (cold):
  Generator execution: 1,847 ms
  Files emitted: 172
  Total generated lines: ~28,000

Second build (no changes):
  Generator execution: 0 ms (cached)
  Files emitted: 0 (reused from cache)

Build after editing Program.cs:
  Generator execution: 0 ms (cached)
  Files emitted: 0 (reused from cache)

Build after adding docker-28.0.0.json:
  Generator execution: 1,923 ms
  Files emitted: 172 (full regeneration)

The 0 ms on cached builds is not a rounding error -- Roslyn genuinely skips the generator entirely when inputs are unchanged. The incremental API does not call RegisterSourceOutput's callback at all. It just keeps the previous outputs in its cache and includes them in the compilation.

Making DescriptorInfo cacheable

For the caching to work, the DescriptorInfo struct must implement IEquatable<DescriptorInfo> with value semantics. This is easy to get wrong -- if you use reference equality (the default for classes), the cache never hits because Roslyn creates new instances on each compilation pass.

public readonly record struct DescriptorInfo(
    string BinaryName,
    string DisplayName,
    string Namespace,
    string ClassName,
    Accessibility Accessibility) : IEquatable<DescriptorInfo>;

Using a record struct gives us structural equality for free. Two DescriptorInfo values are equal if and only if all their fields are equal. This is exactly what Roslyn needs to determine whether the generator should re-run.

I learned this the hard way. The first version of the generator used a regular class for descriptor info. The cache never hit. Every keystroke triggered full regeneration. The IDE was glacial. Switching to a record struct fixed it instantly. This is the single most important thing to get right in an incremental generator -- if your pipeline input types do not have correct value equality, you do not have an incremental generator. You have a slow generator with extra steps.
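The difference is easy to demonstrate in isolation. DescriptorInfoClass below is a hypothetical stand-in for that first, broken version, not the project's actual type:

```csharp
// Demo: reference equality (class) vs value equality (record struct).
var a = new DescriptorInfoClass("docker");
var b = new DescriptorInfoClass("docker");
System.Console.WriteLine(a.Equals(b));   // False -- new instance each pass, cache never hits

var x = new DescriptorInfoStruct("docker");
var y = new DescriptorInfoStruct("docker");
System.Console.WriteLine(x.Equals(y));   // True -- field-by-field equality, cache holds

// Hypothetical stand-in for the broken first version: a plain class
// gets reference equality by default.
class DescriptorInfoClass
{
    public string BinaryName { get; }
    public DescriptorInfoClass(string binaryName) => BinaryName = binaryName;
}

// The fix: the compiler synthesizes structural equality for record structs.
readonly record struct DescriptorInfoStruct(string BinaryName);
```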


The Full Generate Method

Now that we have covered the pieces, here is the complete Generate method that ties everything together:

private static void Generate(
    SourceProductionContext spc,
    DescriptorInfo descriptor,
    ImmutableArray<AdditionalText> allJsonFiles)
{
    // 1. Filter to matching JSON files
    var matchingFiles = allJsonFiles
        .Where(f => MatchesBinary(f.Path, descriptor.BinaryName))
        .ToList();

    if (matchingFiles.Count == 0) return;

    // 2. Read and deserialize
    var versionedTrees = new List<(SemanticVersion Version, CommandTree Tree)>();
    foreach (var file in matchingFiles)
    {
        var result = CommandTreeReader.Read(file);
        if (result is not null)
            versionedTrees.Add(result.Value);
    }

    if (versionedTrees.Count == 0) return;

    // 3. Merge all versions into unified tree
    var unified = VersionDiffer.Merge(versionedTrees);

    // 4. Emit version metadata
    EmitVersionMetadata(spc, descriptor, unified);

    // 5. Emit command classes (one per leaf command)
    CommandClassEmitter.Emit(spc, descriptor, unified);

    // 6. Emit builder classes (one per command class)
    BuilderClassEmitter.Emit(spc, descriptor, unified);

    // 7. Emit client class (one per descriptor, with nested groups)
    ClientClassEmitter.Emit(spc, descriptor, unified);
}

private static void EmitVersionMetadata(
    SourceProductionContext spc,
    DescriptorInfo descriptor,
    UnifiedCommandTree unified)
{
    var sb = new StringBuilder();
    sb.AppendLine("// <auto-generated/>");
    sb.AppendLine("#nullable enable");
    sb.AppendLine();
    sb.AppendLine($"namespace {descriptor.Namespace};");
    sb.AppendLine();
    sb.AppendLine("/// <summary>");
    sb.AppendLine($"/// Version metadata for the {descriptor.DisplayName} CLI wrapper.");
    sb.AppendLine("/// </summary>");
    sb.AppendLine($"public static class {descriptor.ClassName}Versions");
    sb.AppendLine("{");
    sb.AppendLine($"    /// <summary>Oldest tracked version.</summary>");
    sb.AppendLine($"    public static SemanticVersion Oldest => " +
        $"new({unified.OldestVersion.Major}, " +
        $"{unified.OldestVersion.Minor}, " +
        $"{unified.OldestVersion.Patch});");
    sb.AppendLine();
    sb.AppendLine($"    /// <summary>Newest tracked version.</summary>");
    sb.AppendLine($"    public static SemanticVersion Newest => " +
        $"new({unified.NewestVersion.Major}, " +
        $"{unified.NewestVersion.Minor}, " +
        $"{unified.NewestVersion.Patch});");
    sb.AppendLine();
    sb.AppendLine($"    /// <summary>Number of versions merged.</summary>");
    sb.AppendLine($"    public const int VersionCount = {unified.VersionCount};");
    sb.AppendLine("}");

    spc.AddSource(
        $"{descriptor.ClassName}Versions.g.cs",
        SourceText.From(sb.ToString(), Encoding.UTF8));
}

Seven steps: filter, read, merge, emit metadata, emit commands, emit builders, emit client. The entire pipeline runs inside a single RegisterSourceOutput callback. No async, no file I/O (the JSON files are already in memory via AdditionalText.GetText()), no external dependencies. Pure CPU work -- deserialize, diff, emit strings.


Generated Output Statistics

Here is what the generator produces for Docker CLI from 40 versioned JSON files:

Category                            Files    Approx. lines
Command classes                        85          ~12,000
Builder classes                        85          ~15,000
Client class (with nested groups)       1             ~800
Version metadata                        1             ~200
Total                                 172          ~28,000

Twenty-eight thousand lines of generated C# that no one writes, no one maintains, and no one reviews. The source of truth is the 40 JSON files produced by the scraper. If Docker ships a new version with new commands, run the scraper to produce a 41st JSON file, drop it in the scrape/ directory, rebuild. The generator picks it up, merges it into the unified tree, and re-emits all 172 files with the new command and its [SinceVersion] attribute. Done.

The 85 command classes correspond to 85 leaf commands in Docker CLI. Some examples:

DockerContainerRunCommand         DockerContainerStopCommand
DockerContainerExecCommand        DockerContainerLsCommand
DockerContainerLogsCommand        DockerContainerInspectCommand
DockerImageBuildCommand           DockerImagePullCommand
DockerImagePushCommand            DockerImageLsCommand
DockerNetworkCreateCommand        DockerNetworkConnectCommand
DockerVolumeCreateCommand         DockerVolumeLsCommand
DockerBuildxBuildCommand          DockerBuildxCreateCommand
DockerScoutCvesCommand            DockerScoutQuickviewCommand
... (67 more)

Each command class has a matching builder. Each group of commands has a nested group class in the client. The entire Docker CLI surface -- every command, every flag, every version annotation -- is available through IntelliSense the moment the project compiles.


What the Generator Does Not Do

A few non-obvious decisions:

It does not validate the JSON files. If a JSON file has a malformed command tree, the reader returns null and the generator skips that version. The merge still produces a correct tree from the remaining versions. This is a robustness choice -- a single bad file should not block generation.

It does not emit runtime execution code. The generated classes are data classes -- they know how to serialize themselves to argument lists (ToArguments()), but they do not know how to spawn processes, capture output, or parse results. That is the job of the CommandExecutor in the runtime layer, which is hand-written and shared across all generated command types.

It does not generate tests. Tests for generated code are hand-written against the generated output. I considered generating test scaffolds, but the tests would be testing the generator, not the generated code, and they would change every time the JSON data changed. Instead, the test suite verifies the generator's logic (merge correctness, naming conventions, version guard placement) and spot-checks a few generated files for structural correctness.

It does not handle argument ordering. Docker does not care about flag order -- docker run --name foo --detach and docker run --detach --name foo are equivalent. The generated ToArguments() emits flags in declaration order (the order they appear in the JSON/unified tree), which is stable but arbitrary. If Docker ever introduces order-sensitive flags, this would need revisiting. So far, it has not.


Debugging the Generator

Source generators are notoriously hard to debug. They run inside the compiler, which runs inside the IDE, which means Console.WriteLine goes nowhere and breakpoints require attaching to the compiler process.

The approach I use:

1. Diagnostic reporting. The generator can emit Diagnostic objects via SourceProductionContext.ReportDiagnostic(). I have a verbose mode that reports every step:

#if GENERATOR_VERBOSE
spc.ReportDiagnostic(Diagnostic.Create(
    new DiagnosticDescriptor(
        "BW001", "BinaryWrapper",
        "Matched {0} JSON files for binary '{1}'",
        "BinaryWrapper", DiagnosticSeverity.Info, true),
    Location.None,
    matchingFiles.Count, descriptor.BinaryName));
#endif

2. Unit testing the pipeline. The VersionDiffer, CommandTreeReader, and all three emitters are testable in isolation. They take plain data in and return plain data out (or strings, in the case of emitters). No Roslyn dependency in the core logic.

3. Snapshot testing the output. For each emitter, I have a set of golden files -- expected output for a known set of input JSON files. The test deserializes the JSON, runs the merge, runs the emitter, and compares the output string against the golden file. If the output changes, the test fails and I can diff the expected vs actual to see exactly what changed. This catches regressions instantly.

[Fact]
public void CommandClassEmitter_DockerContainerRun_MatchesGolden()
{
    var tree = LoadTestTree("docker-test-fixture.json");
    var unified = VersionDiffer.Merge(new[] { tree });

    var output = new StringBuilder();
    CommandClassEmitter.EmitToString(output, TestDescriptor, unified,
        commandPath: new[] { "container", "run" });

    var expected = File.ReadAllText("golden/DockerContainerRunCommand.g.cs");
    Assert.Equal(expected, output.ToString());
}

Performance Characteristics

The generator is CPU-bound. No I/O (the JSON files are in-memory AdditionalText objects), no allocation-heavy patterns (StringBuilder reuse where possible), no LINQ in hot paths (the merge uses foreach and Dictionary lookups).

Breakdown of the 1.8-second generation time:

JSON deserialization (40 files):     ~120 ms  (  7%)
VersionDiffer.Merge (40 trees):      ~250 ms  ( 14%)
CommandClassEmitter (85 classes):    ~580 ms  ( 32%)
BuilderClassEmitter (85 builders):   ~650 ms  ( 36%)
ClientClassEmitter (1 client):       ~80 ms   (  4%)
Roslyn AddSource overhead:           ~120 ms  (  7%)
                                    ─────────
Total:                              ~1,800 ms (100%)

The emitters dominate because they are building strings. Each generated file is between 100 and 500 lines of C#, and there are 172 of them. The StringBuilder calls themselves are fast, but the sheer volume of string construction adds up.

Could this be faster? Probably. I could pool StringBuilder instances, pre-compute string templates, or emit SourceText directly from char[] spans instead of going through StringBuilder.ToString(). But 1.8 seconds on the first build and 0 seconds on every subsequent build is already fast enough. Optimizing the cold path when the hot path is free would be a waste of engineering time.


Closing

One generator. 40 JSON files. 172 generated C# files. Under 2 seconds. And the developer never sees any of it -- they just type Docker.Container.Run() and get IntelliSense.

The source generator is the translation layer between the raw data produced by design-time scraping and the typed API consumed by application code. It reads JSON, merges versions, and emits C#. It is deterministic -- the same inputs always produce the same outputs. It is incremental -- it only re-runs when the inputs actually change. And it is self-contained -- no external tools, no build scripts, no post-processing.

Next: Part VII gives a guided tour of what the developer actually sees -- the generated commands, builders, and client. Not the generator, not the JSON, not the merge algorithm. Just the experience of typing Docker.Container.Run(b => b.WithDetach().WithName("web").WithPublish("8080:80")) and knowing that the compiler has your back.
