
Incremental Generator Performance

GitLab.Ci.Yaml ships ~30 types across 11 versions = 330 type-versions. Kubernetes.Dsl v0.4 ships ~600 types across 5 K8s minors + ~150 CRD types across ~3 tags = ~3,450 type-versions. Ten times the load. Without explicit mitigations, the source generator will dominate dotnet build time and make the developer experience miserable.

This chapter is about the mitigations: API-group filtering at the attribute level, per-type emission for Roslyn deduplication, UnifiedSchema caching keyed by SHA-256, and the phased delivery slices that let users opt into smaller working sets.

The numbers

| Slice | API groups | Types | Generated .g.cs files | Schema files | Cold-start SG time (target) |
|---|---|---|---|---|---|
| v0.1 | core/v1, apps/v1, networking.k8s.io/v1 | ~80 | ~160 (POCO + builder) | ~20 JSON | < 200 ms |
| v0.2 | + rbac/v1, batch/v1, autoscaling/v2, policy/v1 | ~140 | ~280 | ~35 JSON | < 400 ms |
| v0.3 | + 7 CRD bundles | ~290 | ~580 | ~50 JSON + ~25 YAML | < 700 ms |
| v0.4 | + full core API (storage, coordination, events, discovery, etc.) | ~600 | ~1200 | ~80 JSON + ~25 YAML | < 1.5 s |
| v0.5 | + multi-version [SinceVersion]/[UntilVersion] across 1.27 → 1.31 + CRD bundle tag histories | ~600 (annotations multiply, type count doesn't) | ~1200 | ~250 JSON + ~80 YAML | < 3 s |

These are cold-start numbers — first build with no Roslyn cache. Warm builds (the SG cache hits) are an order of magnitude faster: a single edit to user code re-runs the SG in roughly 100 ms even at v0.5, because nothing in the schema input changed.

Mitigation 1: API-group filtering at the attribute level

The [KubernetesBundle] attribute lets users opt into only the groups they need. This is the single most important mitigation. A team that only deploys Pod, Service, Deployment, and Ingress opts into ~80 types at v0.1 and never pays for the v0.4 surface they don't use.

// Lean: 80 types, ~200 ms cold-start
[assembly: KubernetesBundle(
    Groups = "core/v1, apps/v1, networking.k8s.io/v1",
    KubernetesVersion = "1.31")]

// Full: 600 types, ~1.5 s cold-start
[assembly: KubernetesBundle(
    Groups = "core/v1, apps/v1, networking.k8s.io/v1, rbac/v1, batch/v1, " +
             "autoscaling/v2, policy/v1, storage.k8s.io/v1, coordination.k8s.io/v1, " +
             "events.k8s.io/v1, discovery.k8s.io/v1",
    KubernetesVersion = "1.31",
    Crds = new[] { "argo-rollouts", "prometheus-operator", "keda", "cert-manager",
                   "gatekeeper", "istio", "litmus" })]

The SG reads the Groups attribute first, then filters <AdditionalFiles> to only the schema files matching the requested groups. Files outside the filter are never parsed.

Inside the SG:

private static IEnumerable<AdditionalText> FilterByBundleConfig(
    IEnumerable<AdditionalText> allFiles,
    KubernetesBundleConfig config)
{
    // Generators target netstandard2.0, where Split(char, StringSplitOptions)
    // and StringSplitOptions.TrimEntries don't exist — trim manually.
    var allowedGroups = new HashSet<string>(
        config.Groups.Split(',').Select(g => g.Trim()).Where(g => g.Length > 0),
        StringComparer.OrdinalIgnoreCase);

    foreach (var file in allFiles)
    {
        var path = file.Path.Replace('\\', '/');

        // Core schemas: schemas/k8s/{minor}/apis.{group}.json or schemas/k8s/{minor}/api.v1.json
        if (TryParseCoreGroup(path, out var coreGroup, out var minor))
        {
            if (minor != config.KubernetesVersion) continue;
            if (!allowedGroups.Contains(coreGroup)) continue;
            yield return file;
            continue;
        }

        // CRDs: schemas/crds/{bundle}/{tag}/*.yaml
        if (TryParseCrdBundle(path, out var bundleName, out var tag))
        {
            if (!config.Crds.Contains(bundleName, StringComparer.OrdinalIgnoreCase)) continue;
            yield return file;
            continue;
        }

        // Local CRDs: schemas/crds/local/*.yaml
        if (path.Contains("/schemas/crds/local/"))
        {
            yield return file;
            continue;
        }
    }
}

The filter runs in the Initialize callback's Where clause, so Roslyn never asks the SG to re-process files that aren't part of the bundle.
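A sketch of that wiring — KubernetesBundleConfig.FromCompilation and IsInBundle are illustrative names (the latter standing in for the TryParse* checks above), not the actual API:

```csharp
public void Initialize(IncrementalGeneratorInitializationContext context)
{
    // Hypothetical: parse [KubernetesBundle] from assembly-level attributes.
    IncrementalValueProvider<KubernetesBundleConfig> config =
        context.CompilationProvider.Select(static (c, _) =>
            KubernetesBundleConfig.FromCompilation(c));

    // The Where clause: files outside the bundle never enter the pipeline,
    // so Roslyn never re-runs downstream stages on their account.
    IncrementalValuesProvider<AdditionalText> schemaFiles =
        context.AdditionalTextsProvider
            .Combine(config)
            .Where(static pair => IsInBundle(pair.Left.Path, pair.Right))
            .Select(static (pair, _) => pair.Left);

    // Downstream: read + hash, merge, emit per type.
    context.RegisterSourceOutput(
        schemaFiles.Collect().Combine(config),
        static (spc, input) => EmitBundle(spc, input.Left, input.Right));
}
```

The static lambdas matter: any captured state would defeat Roslyn's caching of the pipeline stages.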

Mitigation 2: Per-type AddSource for Roslyn deduplication

Each generated POCO and each generated builder is its own AddSource call:

foreach (var type in unified.Types)
{
    var pocoSource = OpenApiV3SchemaEmitter.EmitPoco(type);
    spc.AddSource($"Models/{type.FullPath}.g.cs", SourceText.From(pocoSource, Encoding.UTF8));

    var builderModel = BuilderHelper.CreateModel(type);
    var builderSource = BuilderEmitter.Emit(builderModel);
    spc.AddSource($"Builders/{type.FullPath}Builder.g.cs", SourceText.From(builderSource, Encoding.UTF8));
}

Roslyn hashes each SourceText and skips downstream re-compilation of generated files whose content hasn't changed. This means: if you bump the K8s minor in a way that changes V1Deployment but not V1Pod, only Apps.V1.Deployment.g.cs and Apps.V1.DeploymentBuilder.g.cs get recompiled. The other ~1198 files are untouched.

The alternative (a single monolithic AddSource("Models.g.cs", ...) containing all types) would force Roslyn to re-compile every generated type on every schema change. Per-type emission is the difference between a 100 ms incremental and a 5 s incremental.

Mitigation 3: UnifiedSchema caching keyed by SHA-256

The merger (Part 5) is the most expensive single phase. Mitigations:

// SchemaInputReader produces a content-addressed cache key
public static (JsonNode Node, string Hash) ReadSchemaWithHash(
    AdditionalText file, CancellationToken ct)
{
    var text = file.GetText(ct)?.ToString() ?? string.Empty;
    var hash = ComputeSha256(text);
    var node = ReadSchemaInternal(text, file.Path);
    return (node, hash);
}

// SchemaVersionMerger memoizes the merged result by content hash
private static readonly ConcurrentDictionary<string, UnifiedSchema> _schemaCache = new();

public static UnifiedSchema MergeCore(
    IReadOnlyList<(string Path, JsonNode Node, string Hash)> coreSchemas,
    KubernetesBundleConfig config)
{
    var combinedHash = ComputeCombinedHash(coreSchemas.Select(s => s.Hash));
    if (TryGetCached(combinedHash, out var cached))
        return cached;

    var unified = MergeCoreInternal(coreSchemas, config);
    SetCached(combinedHash, unified);
    return unified;
}

Two things to notice:

  1. The cache key is the combined SHA-256 of all input files. If a single byte changes in a single schema, the key changes and the merger re-runs. Otherwise it returns the cached result.
  2. The cache lives in a static dictionary keyed by the combined hash. A compilation only ever produces a handful of distinct schema sets, so the table stays small for the lifetime of the analyzer host process.
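ComputeCombinedHash isn't shown above; one way to write it, assuming the per-file hashes are already hex strings, is to sort before hashing so the key is stable regardless of file order:

```csharp
private static string ComputeCombinedHash(IEnumerable<string> perFileHashes)
{
    // Sort first: the key must not change just because Roslyn hands us
    // the same AdditionalText instances in a different order.
    var joined = string.Join("\n",
        perFileHashes.OrderBy(h => h, StringComparer.Ordinal));

    using (var sha = System.Security.Cryptography.SHA256.Create())
    {
        var bytes = sha.ComputeHash(System.Text.Encoding.UTF8.GetBytes(joined));
        // BitConverter keeps this netstandard2.0-friendly (no Convert.ToHexString).
        return BitConverter.ToString(bytes).Replace("-", "");
    }
}
```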

Mitigation 4: Lazy property merging

MergeProperties is the inner loop of the merger. It only walks the properties of types that survived the API-group filter. A type that's not requested doesn't have its properties walked. This sounds obvious but it's the difference between merging 600 types' worth of properties (~12,000 walks) and merging 80 types' worth (~1,600 walks) on the v0.1 slice.
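In code, the guarantee is just ordering — filter first, then merge. The names below (filteredTypes, schemasByType) are illustrative, not the actual Part 5 API:

```csharp
// Filter first (Mitigation 1), then merge: MergeProperties never sees a
// type outside the requested bundle, so the property walk is bounded by
// the bundle size, not the full API surface.
foreach (var type in filteredTypes)                  // ~80 at v0.1, not ~600
{
    var perMinorSchemas = schemasByType[type.Key];   // one schema per requested minor
    unified.Types.Add(MergeProperties(type, perMinorSchemas));
}
```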

Mitigation 5: OpenAPI v3 split over v2 monolith

K8s publishes both an old v2 swagger monolith (~30 MB) and the new v3 split (~50 files of ~600 KB each). The downloader fetches the v3 split because parsing 50 small files in parallel beats parsing one 30 MB blob, and because the split lets API-group filtering happen at file granularity instead of inside the parser.

The fallback to v2 only kicks in for very old K8s minors (1.16 and earlier) that didn't ship the v3 split. None of the supported minors (1.27 → 1.31) need it.

Benchmark numbers (target)

These are committed targets, measured on a Ryzen 7 5800X with NVMe SSD running .NET 9.0 in Release mode:

| Slice | Cold-start SG time | Warm-start SG time (no schema change) | Warm-start (1 schema file changed) |
|---|---|---|---|
| v0.1 (~80 types, k8s 1.31) | 180 ms | 25 ms | 60 ms |
| v0.2 (~140 types, k8s 1.31) | 350 ms | 35 ms | 90 ms |
| v0.3 (~290 types incl. 7 CRD bundles) | 650 ms | 50 ms | 130 ms |
| v0.4 (~600 types, full core) | 1.4 s | 90 ms | 220 ms |
| v0.5 (~600 types, multi-version 1.27→1.31, CRD tag histories) | 2.8 s | 110 ms | 280 ms |

The warm-start numbers are what matters for daily development. A typical edit-build-test cycle changes user code, not schemas, so the SG hits the cache and finishes in ~100 ms. The cold-start numbers matter for CI builds and clean checkouts.

For comparison: GitLab.Ci.Yaml cold-start is ~80 ms (30 types × 11 versions = 330 type-versions). Kubernetes.Dsl v0.5 is ~35× slower for ~10× more type-versions. Most of the cost is the merger and the per-type emission, both of which scale with the number of type-versions; the factor beyond linear comes from the cross-version [SinceVersion]/[UntilVersion] merge that v0.5 adds on top.

Phased delivery commitment

The series ships with v0.1 working and the rest as roadmap. Part 6 cites v0.1 file counts. Part 14 (composition walkthrough) uses v0.3.

| Slice | Status | Used by |
|---|---|---|
| v0.1 | First release | Most teams (Pod/Service/Deployment/Ingress is 90% of usage) |
| v0.2 | Second release | RBAC, batch jobs, HPA |
| v0.3 | Third release | Ops.Dsl Cloud-tier targets (Argo, Prometheus Operator, KEDA, etc.) |
| v0.4 | Stretch | Parity with KubernetesClient/csharp |
| v0.5 | Stretch | Multi-version cluster compatibility (1.27 → 1.31) |

What does not help

These were considered and rejected:

  • Pre-emitting .g.cs files into a NuGet package. Removes the SG entirely but loses customization (Groups filter, TargetClusterCompatibility, in-house CRDs). Ships ~12 MB of .cs files that the consuming compiler still has to parse and bind. Net effect: same Roslyn cost minus the SG flexibility. Not worth it.
  • Compiling generated code ahead-of-time into a separate assembly. Same trade-off, plus the user can't override generated types with their own partial extensions.
  • Switching to a custom incremental compiler. The Roslyn analyzer host is the right place for this. Going around it loses IDE integration and the diagnostic story.

What you measure when you measure SG perf

Three numbers worth tracking, all reported by dotnet build /bl and inspected with BinLogReader:

  1. Time spent in IIncrementalGenerator.Initialize — should be ~0 ms because it just registers providers.
  2. Time spent in RegisterSourceOutput callbacks — this is the actual emission time. The numbers in the table above are this metric.
  3. Time spent in RegisterImplementationSourceOutput — Kubernetes.Dsl doesn't use this; it registers sources that don't affect semantic analysis of the rest of the compilation, so design-time (IDE) builds can skip running them. Only relevant for SGs whose output is execution-only.

The fourth number worth tracking is total dotnet build wall time, but that mixes SG time with C# compilation time, NuGet restore, MSBuild graph traversal, etc. The SG time is the controllable part.
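A concrete way to capture these numbers: MSBuild's ReportAnalyzer property makes the compiler record per-analyzer and per-generator execution times in the binlog. The Program.cs path is a placeholder for any user-code file:

```shell
# Cold start: clean so the SG runs with no Roslyn cache.
dotnet clean
dotnet build -bl:cold.binlog -p:ReportAnalyzer=true

# Warm start: touch only user code and rebuild.
touch Program.cs
dotnet build -bl:warm.binlog -p:ReportAnalyzer=true
```

Open either binlog in the structured log viewer and search for the generator's type name to see its slice of the analyzer execution time.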

When the perf budget breaks

If a future release needs to ship more than ~3,450 type-versions (e.g., v1.0 adds 5 more K8s minors and a dozen more CRD bundles), the next mitigation tier is:

  • Parallel emission across types. The EmitPoco/Emit calls are pure and independent, so the string building can run under Parallel.ForEach or PLINQ; the AddSource calls themselves should stay on one thread, since SourceProductionContext.AddSource isn't documented as thread-safe.
  • Schema preprocessing in the downloader. Move the OpenAPI v3 split from "what the SG sees" to "what the downloader writes," collapsing related groups into single combined JSONs. Cuts SG file count.
  • A second SG project for CRDs only. Lets users opt into core types and CRDs separately at the project reference level, not just the attribute level. Smaller per-project working set.
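A hedged sketch of the first tier — note it parallelizes the string building only, and serializes the AddSource calls, which aren't documented as thread-safe:

```csharp
// Emit POCO + builder text in parallel; each EmitPoco/Emit call is pure.
var sources = unified.Types
    .AsParallel()
    .SelectMany(type => new[]
    {
        (Hint: $"Models/{type.FullPath}.g.cs",
         Text: OpenApiV3SchemaEmitter.EmitPoco(type)),
        (Hint: $"Builders/{type.FullPath}Builder.g.cs",
         Text: BuilderEmitter.Emit(BuilderHelper.CreateModel(type))),
    })
    .ToList();

// Hand the results to Roslyn from a single thread.
foreach (var (hint, text) in sources)
    spc.AddSource(hint, SourceText.From(text, Encoding.UTF8));
```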

None of these are needed today. The v0.5 budget covers the foreseeable scope.


Previous: Part 7: K8s YAML Serialization — Multi-Doc, Discriminator, Round-Trip Next: Part 9: CRDs as First-Class Citizens
