
Ops.Deployment -- Orchestration as a DAG

"If the deployment order is in a wiki, it is already wrong."


The Problem

Every team has a deployment checklist. It lives in Confluence, a Slack pinned message, or a README that was last updated three quarters ago. It says things like:

## Order Service v2.4 Deployment

1. Make sure Payment Service v1.2 is already deployed
2. Run migration 47 (see migration doc)
3. Deploy order-service (3 replicas, rolling update)
4. Deploy order-worker (1 replica)
5. Wait 10 minutes, check Grafana
6. If errors > 5%, kubectl rollout undo

The problems:

  • No validation. Step 1 says "make sure" -- make sure how? Nobody checks.
  • No ordering guarantee. Step 2 before step 3 is convention, not enforcement.
  • No dependency tracking. Payment Service v1.2 might itself depend on SharedInfra v8.0, but nobody wrote that down.
  • No cross-referencing. The required configs (PaymentGateway:ApiKey, ConnectionStrings:OrderDb) are not verified against what actually exists in the Configuration DSL.
  • No strategy enforcement. "Rolling update" is a kubectl flag someone types manually. If they type Recreate by accident, the service goes down.

What we need: a typed, compile-time validated directed acyclic graph of deployment steps. An attribute on a class. A source generator that builds the DAG, verifies ordering, and emits infrastructure artifacts. An analyzer that catches errors before dotnet build succeeds.


The Meta-Primitives

Before diving into the Deployment-specific attributes, recall the 8 shared meta-primitives that all 22 Ops sub-DSLs inherit from:

Primitive            Role in Deployment
OpsTarget            What gets deployed (app, service, infrastructure)
OpsProbe             Health check endpoints referenced by deployed apps
OpsThreshold         Error rate thresholds that trigger rollback
OpsPolicy            Deployment policies (approval gates, change windows)
OpsEnvironment       Target environments (staging, production, dr-region)
OpsSchedule          Maintenance windows, deployment blackout periods
OpsExecutionTier     InProcess / Container / Cloud
OpsRequirementLink   Traceability back to the requirement that motivated this deployment

The Deployment DSL builds on top of these. DeploymentApp extends OpsTarget. DeploymentOrchestrator composes OpsPolicy and OpsSchedule. The tier determines which artifacts the source generator emits.


Attribute Definitions

// =================================================================
// Ops.Deployment.Lib -- Deployment Orchestration DSL
// =================================================================

/// The top-level container for a versioned deployment procedure.
/// One class = one deployment. The class name becomes the DAG node ID.
[AttributeUsage(AttributeTargets.Class)]
public sealed class DeploymentOrchestratorAttribute : Attribute
{
    public string Version { get; }
    public string Description { get; init; } = "";
    public DeploymentStrategy Strategy { get; init; } = DeploymentStrategy.Rolling;
    public string[] TargetEnvironments { get; init; } = ["staging", "production"];
    public string ChangeWindow { get; init; } = "";         // cron expression
    public bool RequiresApproval { get; init; } = false;
    public int MaxParallelApps { get; init; } = int.MaxValue;

    public DeploymentOrchestratorAttribute(string version) => Version = version;
}

public enum DeploymentStrategy
{
    Rolling,       // replace pods one at a time, zero downtime
    BlueGreen,     // provision new env, swap traffic atomically
    Canary,        // gradual traffic shift with metric gates
    Recreate       // stop all old, start all new (accepts downtime)
}

/// Declare an application that participates in this deployment.
/// Each app maps to a container, a process, or a cloud resource.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class DeploymentAppAttribute : Attribute
{
    public string AppName { get; }
    public string Runtime { get; init; } = "dotnet";
    public int Replicas { get; init; } = 2;
    public string[] RequiredConfigs { get; init; } = [];
    public string Image { get; init; } = "";               // container image
    public string CpuRequest { get; init; } = "100m";
    public string MemoryRequest { get; init; } = "256Mi";
    public string CpuLimit { get; init; } = "500m";
    public string MemoryLimit { get; init; } = "512Mi";
    public string HealthCheckEndpoint { get; init; } = "/health";
    public int ReadinessDelaySeconds { get; init; } = 10;

    public DeploymentAppAttribute(string appName) => AppName = appName;
}

/// Declare a dependency between deployment orchestrators.
/// The Source Generator uses these edges to build the DAG.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class DeploymentDependencyAttribute : Attribute
{
    public Type DependsOn { get; }
    public DependencyKind Kind { get; init; } = DependencyKind.MustCompleteBefore;

    public DeploymentDependencyAttribute(Type dependsOn) => DependsOn = dependsOn;
}

public enum DependencyKind
{
    MustCompleteBefore,   // hard ordering -- wait for completion before starting
    CanRunInParallel,     // no ordering constraint, can overlap
    SoftDependency        // preferred ordering, not enforced -- warning if violated
}

/// Declare a pre-deployment gate that must pass before this orchestrator runs.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class DeploymentGateAttribute : Attribute
{
    public string Name { get; }
    public GateKind Kind { get; init; } = GateKind.HealthCheck;
    public string Target { get; init; } = "";
    public string Timeout { get; init; } = "00:05:00";

    public DeploymentGateAttribute(string name) => Name = name;
}

public enum GateKind
{
    HealthCheck,        // upstream service must be healthy
    MigrationComplete,  // referenced migration must have run
    ManualApproval,     // human clicks "approve"
    MetricThreshold     // a metric must be below threshold
}

Usage Example

A real deployment: Order Service v2.4 with three apps, two upstream dependencies, a canary strategy, and gates.

// -- OrderServiceV24Deployment.cs -----------------------------------

[DeploymentOrchestrator("2.4",
    Description = "Order service v2.4 -- adds payment status tracking",
    Strategy = DeploymentStrategy.Canary,
    TargetEnvironments = ["staging", "production"],
    ChangeWindow = "0 2 * * 1-5",   // weekdays 2 AM
    RequiresApproval = true)]
[DeploymentApp("order-api",
    Runtime = "dotnet",
    Replicas = 3,
    Image = "registry.internal/order-api:2.4.0",
    RequiredConfigs = ["ConnectionStrings:OrderDb", "PaymentGateway:ApiKey"],
    CpuRequest = "200m", MemoryRequest = "512Mi",
    CpuLimit = "1000m", MemoryLimit = "1Gi",
    HealthCheckEndpoint = "/health/ready",
    ReadinessDelaySeconds = 15)]
[DeploymentApp("order-worker",
    Runtime = "dotnet",
    Replicas = 1,
    Image = "registry.internal/order-worker:2.4.0",
    RequiredConfigs = ["ConnectionStrings:OrderDb", "RabbitMq:ConnectionString"])]
[DeploymentApp("order-migrator",
    Runtime = "dotnet",
    Replicas = 1,
    Image = "registry.internal/order-migrator:2.4.0",
    RequiredConfigs = ["ConnectionStrings:OrderDb"])]
[DeploymentDependency(typeof(PaymentServiceV12Deployment),
    Kind = DependencyKind.MustCompleteBefore)]
[DeploymentDependency(typeof(SharedInfraV8Deployment),
    Kind = DependencyKind.CanRunInParallel)]
[DeploymentGate("payment-health",
    Kind = GateKind.HealthCheck,
    Target = "payment-service:/health")]
[DeploymentGate("migration-47",
    Kind = GateKind.MigrationComplete,
    Target = "OrderDb:47")]
public sealed class OrderServiceV24Deployment { }

[DeploymentOrchestrator("1.2",
    Description = "Payment service v1.2 -- adds refund support",
    Strategy = DeploymentStrategy.BlueGreen)]
[DeploymentApp("payment-service", Replicas = 2)]
public sealed class PaymentServiceV12Deployment { }

[DeploymentOrchestrator("8.0",
    Description = "Shared infrastructure -- message broker + cache")]
[DeploymentApp("rabbitmq", Runtime = "docker", Replicas = 3)]
[DeploymentApp("redis", Runtime = "docker", Replicas = 2)]
public sealed class SharedInfraV8Deployment { }

That is the entire deployment specification. Thirteen attributes on three classes. The wiki page with its 27 steps, outdated screenshots, and broken links is replaced by code that compiles.


InProcess Tier: DeploymentDag.g.cs

The source generator reads every [DeploymentOrchestrator] in the compilation, builds the dependency graph, performs a topological sort, and emits a static class with the validated execution plan.

// <auto-generated by Ops.Deployment.Generators />
namespace Ops.Deployment.Generated;

public static class DeploymentDag
{
    public static readonly IReadOnlyList<DeploymentNode> Nodes =
    [
        new("OrderServiceV24Deployment", "2.4", DeploymentStrategy.Canary,
            Apps:
            [
                new("order-api", "dotnet", 3, ["ConnectionStrings:OrderDb", "PaymentGateway:ApiKey"]),
                new("order-worker", "dotnet", 1, ["ConnectionStrings:OrderDb", "RabbitMq:ConnectionString"]),
                new("order-migrator", "dotnet", 1, ["ConnectionStrings:OrderDb"]),
            ],
            DependsOn:
            [
                new("PaymentServiceV12Deployment", DependencyKind.MustCompleteBefore),
                new("SharedInfraV8Deployment", DependencyKind.CanRunInParallel),
            ],
            Gates:
            [
                new("payment-health", GateKind.HealthCheck, "payment-service:/health"),
                new("migration-47", GateKind.MigrationComplete, "OrderDb:47"),
            ]),

        new("PaymentServiceV12Deployment", "1.2", DeploymentStrategy.BlueGreen,
            Apps: [new("payment-service", "dotnet", 2, [])],
            DependsOn: [], Gates: []),

        new("SharedInfraV8Deployment", "8.0", DeploymentStrategy.Rolling,
            Apps:
            [
                new("rabbitmq", "docker", 3, []),
                new("redis", "docker", 2, []),
            ],
            DependsOn: [], Gates: []),
    ];

    /// Topological execution order -- deploy in this sequence.
    /// Nodes at the same level can run in parallel.
    public static readonly IReadOnlyList<ExecutionWave> ExecutionPlan =
    [
        new(Level: 0, Nodes: ["SharedInfraV8Deployment", "PaymentServiceV12Deployment"]),
        new(Level: 1, Nodes: ["OrderServiceV24Deployment"]),
    ];

    /// Validate the DAG at startup (optional runtime check).
    public static ValidationResult Validate()
    {
        var errors = new List<string>();

        // 1. Verify no circular dependencies
        // 2. Verify all DependsOn targets resolve to known orchestrators
        // 3. Verify all RequiredConfigs are satisfied by Configuration DSL
        // 4. Verify all gates reference reachable targets
        // 5. Verify health check endpoints exist in Observability DSL

        return new(errors.Count == 0, errors);
    }
}

public sealed record DeploymentNode(
    string Name, string Version, DeploymentStrategy Strategy,
    IReadOnlyList<AppInfo> Apps,
    IReadOnlyList<DependencyEdge> DependsOn,
    IReadOnlyList<GateInfo> Gates);

public sealed record AppInfo(
    string AppName, string Runtime, int Replicas,
    IReadOnlyList<string> RequiredConfigs);

public sealed record DependencyEdge(string Target, DependencyKind Kind);
public sealed record GateInfo(string Name, GateKind Kind, string Target);
public sealed record ExecutionWave(int Level, IReadOnlyList<string> Nodes);
public sealed record ValidationResult(bool IsValid, IReadOnlyList<string> Errors);
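
The ExecutionPlan above groups nodes into levels. One way to compute such levels is a wave-by-wave variant of Kahn's algorithm; the sketch below is an illustration under that assumption, not the generator's actual code. WavePlanner and PlanWaves are hypothetical names, and the input is assumed to contain only MustCompleteBefore edges.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Ops.Deployment.Generators.
public static class WavePlanner
{
    // hardDependencies maps each node to the nodes it must wait for.
    // Every node, including those without dependencies, must appear as a key.
    public static List<List<string>> PlanWaves(
        IReadOnlyDictionary<string, IReadOnlyList<string>> hardDependencies)
    {
        foreach (var (node, deps) in hardDependencies)
            foreach (var dep in deps)
                if (!hardDependencies.ContainsKey(dep))
                    throw new ArgumentException($"'{node}' depends on unknown node '{dep}'.");

        // Unmet dependencies per node; a node is ready once its set is empty.
        var pending = hardDependencies.ToDictionary(
            kv => kv.Key,
            kv => new HashSet<string>(kv.Value));

        var waves = new List<List<string>>();
        while (pending.Count > 0)
        {
            var ready = pending
                .Where(kv => kv.Value.Count == 0)
                .Select(kv => kv.Key)
                .OrderBy(name => name, StringComparer.Ordinal) // deterministic output
                .ToList();

            // Nodes remain but none is ready: the remaining nodes contain a cycle.
            if (ready.Count == 0)
                throw new InvalidOperationException(
                    "Cycle among: " + string.Join(", ", pending.Keys.OrderBy(k => k, StringComparer.Ordinal)));

            foreach (var name in ready) pending.Remove(name);
            foreach (var deps in pending.Values) deps.ExceptWith(ready);

            waves.Add(ready);
        }
        return waves;
    }
}
```

Fed the three orchestrators from the usage example, with OrderServiceV24Deployment waiting only on PaymentServiceV12Deployment, this produces two waves: PaymentServiceV12Deployment and SharedInfraV8Deployment first, then OrderServiceV24Deployment. The order of names inside a wave carries no meaning; the sketch sorts them only so repeated builds emit identical output.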

The InProcess tier also emits a DI registration extension so you can inject IDeploymentDag in integration tests and verify deployment ordering programmatically:

// <auto-generated by Ops.Deployment.Generators />
using Microsoft.Extensions.DependencyInjection;

namespace Ops.Deployment.Generated;

public static class DeploymentServiceCollectionExtensions
{
    public static IServiceCollection AddDeploymentDag(this IServiceCollection services)
    {
        // IDeploymentDag and DeploymentDag.Instance are not part of the listing above;
        // they are assumed to be emitted alongside it by the generator.
        services.AddSingleton<IDeploymentDag>(DeploymentDag.Instance);
        return services;
    }
}

Container Tier: docker-compose.deploy.yaml

When the execution tier is Container, the generator emits a Docker Compose file that mirrors the deployment topology:

# <auto-generated by Ops.Deployment.Generators />
# Deployment: OrderServiceV24Deployment v2.4

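# No top-level "version" key: the Compose Specification treats it as obsolete,
# and the depends_on conditions below are not part of the legacy 3.x file format.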

services:
  order-api:
    image: registry.internal/order-api:2.4.0
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "1.0"
          memory: 1G
        reservations:
          cpus: "0.2"
          memory: 512M
      restart_policy:
        condition: on-failure
        max_attempts: 3
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/health/ready"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
    environment:
      - ConnectionStrings__OrderDb=${ORDERDB_CONNECTION_STRING}
      - PaymentGateway__ApiKey=${PAYMENT_GATEWAY_API_KEY}
    depends_on:
      order-migrator:
        condition: service_completed_successfully
    networks:
      - order-network

  order-worker:
    image: registry.internal/order-worker:2.4.0
    deploy:
      replicas: 1
    environment:
      - ConnectionStrings__OrderDb=${ORDERDB_CONNECTION_STRING}
      - RabbitMq__ConnectionString=${RABBITMQ_CONNECTION_STRING}
    depends_on:
      order-migrator:
        condition: service_completed_successfully
    networks:
      - order-network

  order-migrator:
    image: registry.internal/order-migrator:2.4.0
    deploy:
      replicas: 1
      restart_policy:
        condition: none
    environment:
      - ConnectionStrings__OrderDb=${ORDERDB_CONNECTION_STRING}
    networks:
      - order-network

networks:
  order-network:
    driver: bridge
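
The Compose file expresses CPU as fractional cores ("1.0", "0.2"), while the attributes use Kubernetes-style quantities ("1000m", "200m"). A minimal sketch of that conversion follows; CpuQuantity and ToComposeCpus are hypothetical names, and the generator may well do this differently.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Ops.Deployment.Generators.
public static class CpuQuantity
{
    // Converts a Kubernetes CPU quantity ("200m" or "2") to a Compose cpus value.
    public static string ToComposeCpus(string kubernetesCpu)
    {
        var text = kubernetesCpu.Trim();

        // A trailing "m" means millicores: 1000m equals one core.
        decimal cores = text.EndsWith("m", StringComparison.Ordinal)
            ? decimal.Parse(text[..^1], NumberStyles.Number, CultureInfo.InvariantCulture) / 1000m
            : decimal.Parse(text, NumberStyles.Number, CultureInfo.InvariantCulture);

        // Always keep one decimal place so whole cores render as "1.0".
        return cores.ToString("0.0##", CultureInfo.InvariantCulture);
    }
}
```

With the order-api values from the usage example, "1000m" becomes "1.0" and "200m" becomes "0.2", matching the limits and reservations in the file above.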

Container Tier: k8s-deployment.yaml

The same attributes also produce Kubernetes manifests:

# <auto-generated by Ops.Deployment.Generators />
# Deployment: OrderServiceV24Deployment v2.4 (Canary)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-api
  labels:
    app: order-api
    version: "2.4"
    ops.deployment/orchestrator: OrderServiceV24Deployment
    ops.deployment/strategy: canary
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-api
  template:
    metadata:
      labels:
        app: order-api
        version: "2.4"
    spec:
      containers:
        - name: order-api
          image: registry.internal/order-api:2.4.0
          resources:
            requests:
              cpu: 200m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health/live
              port: 80
            initialDelaySeconds: 30
            periodSeconds: 15
          envFrom:
            - configMapRef:
                name: order-api-config
            - secretRef:
                name: order-api-secrets
---
apiVersion: v1
kind: Service
metadata:
  name: order-api
spec:
  selector:
    app: order-api
  ports:
    - port: 80
      targetPort: 80

Cloud Tier: terraform/deployment/main.tf

At the Cloud tier, the generator emits Terraform with provider-specific resources:

# <auto-generated by Ops.Deployment.Generators />
# Deployment: OrderServiceV24Deployment v2.4

resource "kubernetes_deployment" "order_api" {
  metadata {
    name      = "order-api"
    namespace = var.namespace
    labels = {
      app     = "order-api"
      version = "2.4"
    }
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "order-api"
      }
    }

    strategy {
      type = "RollingUpdate"
      rolling_update {
        max_surge       = "25%"
        max_unavailable = "0"
      }
    }

    template {
      metadata {
        labels = {
          app     = "order-api"
          version = "2.4"
        }
      }

      spec {
        container {
          name  = "order-api"
          image = "registry.internal/order-api:2.4.0"

          resources {
            requests = {
              cpu    = "200m"
              memory = "512Mi"
            }
            limits = {
              cpu    = "1000m"
              memory = "1Gi"
            }
          }

          readiness_probe {
            http_get {
              path = "/health/ready"
              port = 80
            }
            initial_delay_seconds = 15
            period_seconds        = 10
          }
        }
      }
    }
  }
}

# Argo Rollout for canary strategy
resource "kubectl_manifest" "order_api_rollout" {
  yaml_body = yamlencode({
    apiVersion = "argoproj.io/v1alpha1"
    kind       = "Rollout"
    metadata = {
      name      = "order-api-canary"
      namespace = var.namespace
    }
    spec = {
      replicas = 3
      strategy = {
        canary = {
          steps = [
            { setWeight = 10 },
            { pause = { duration = "5m" } },
            { setWeight = 30 },
            { pause = { duration = "5m" } },
            { setWeight = 60 },
            { pause = { duration = "5m" } },
            { setWeight = 100 },
          ]
          canaryService  = "order-api-canary"
          stableService  = "order-api-stable"
        }
      }
      selector = {
        matchLabels = {
          app = "order-api"
        }
      }
      template = {
        metadata = {
          labels = {
            app     = "order-api"
            version = "2.4"
          }
        }
        spec = {
          containers = [{
            name  = "order-api"
            image = "registry.internal/order-api:2.4.0"
          }]
        }
      }
    }
  })
}

Analyzer Diagnostics

The analyzers run at compile time. They see the full compilation context -- every [DeploymentOrchestrator], every [DeploymentApp], every [DeploymentDependency] -- and cross-reference against other Ops DSLs.

OPS001: Circular Deployment Dependency

[DeploymentOrchestrator("1.0")]
[DeploymentDependency(typeof(ServiceBDeployment))]
public sealed class ServiceADeployment { }

[DeploymentOrchestrator("1.0")]
[DeploymentDependency(typeof(ServiceADeployment))]
public sealed class ServiceBDeployment { }

// error OPS001: Circular deployment dependency detected:
//   ServiceADeployment -> ServiceBDeployment -> ServiceADeployment
//   Break the cycle by changing one dependency to SoftDependency
//   or CanRunInParallel.

The analyzer runs full cycle detection on the dependency graph. For longer cycles (A -> B -> C -> D -> A), it reports the entire path so you can identify the weakest link.
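
Reporting the whole path, as OPS001 does, is commonly done with a depth-first search that remembers the nodes on the current path. The sketch below assumes that approach; CycleFinder and FindCycle are hypothetical names, and the real analyzer works on Roslyn symbols rather than strings.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper, not part of the Ops.Deployment analyzers.
public static class CycleFinder
{
    // Returns the first cycle found as a closed path (A, B, A), or null if there is none.
    public static IReadOnlyList<string>? FindCycle(
        IReadOnlyDictionary<string, IReadOnlyList<string>> dependsOn)
    {
        var done = new HashSet<string>();    // fully explored, known cycle-free
        var onPath = new HashSet<string>();  // nodes on the current DFS path
        var path = new List<string>();       // the same nodes, in visiting order

        foreach (var start in dependsOn.Keys)
        {
            var found = Visit(start);
            if (found is not null) return found;
        }
        return null;

        List<string>? Visit(string node)
        {
            if (done.Contains(node)) return null;
            if (onPath.Contains(node))
            {
                // Back edge: the cycle is the path suffix that starts at 'node'.
                var index = path.IndexOf(node);
                var loop = path.GetRange(index, path.Count - index);
                loop.Add(node);
                return loop;
            }

            onPath.Add(node);
            path.Add(node);
            if (dependsOn.TryGetValue(node, out var targets))
            {
                foreach (var target in targets)
                {
                    var inner = Visit(target);
                    if (inner is not null) return inner;
                }
            }
            path.RemoveAt(path.Count - 1);
            onPath.Remove(node);
            done.Add(node);
            return null;
        }
    }
}
```

Joining the returned list with " -> " gives a message in the same shape as the diagnostic text above. Which node the path starts from depends on which node the search visits first.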

OPS002: App Without Health Check

[DeploymentOrchestrator("3.0")]
[DeploymentApp("inventory-service", Replicas = 2)]
public sealed class InventoryV3Deployment { }

// warning OPS002: DeploymentApp 'inventory-service' has no corresponding
//   [HealthCheck] in the Observability DSL. Deployment cannot verify
//   app readiness after rollout. Add [HealthCheck("inventory-service", ...)]
//   to enable readiness verification.

This is a cross-DSL diagnostic. The Deployment analyzer scans the compilation for [HealthCheck] attributes (from Ops.Observability.Lib) and matches them to [DeploymentApp] names. If an app has no health check, the deployment has no way to verify that the new version started correctly.

OPS003: RequiredConfigs Not Satisfied

[DeploymentOrchestrator("2.0")]
[DeploymentApp("billing-service",
    RequiredConfigs = ["Stripe:SecretKey", "Stripe:WebhookUrl", "Database:BillingDb"])]
public sealed class BillingV2Deployment { }

// Assume Configuration DSL only has transforms for:
//   "Stripe:SecretKey" and "Database:BillingDb"

// error OPS003: DeploymentApp 'billing-service' requires config
//   'Stripe:WebhookUrl' but no [ConfigTransform] for this key exists
//   in the compilation. Add [ConfigTransform("Stripe:WebhookUrl", ...)]
//   or remove the config from RequiredConfigs.

The analyzer checks every entry in RequiredConfigs against [ConfigTransform] and [Secret] attributes from the Configuration DSL. If a config is declared as required but has no transform or secret definition, the build fails. This catches the "works in staging, crashes in production" scenario where someone forgot to add the new config key to the environment matrix.
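
At its core, the OPS003 check is a set difference between required keys and declared keys. A minimal sketch, with ConfigCoverage and MissingKeys as hypothetical names; the case-insensitive comparison is an assumption borrowed from how .NET configuration treats keys, not something the Configuration DSL is known to do.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Ops.Deployment analyzers.
public static class ConfigCoverage
{
    // Returns the required keys that no [ConfigTransform] or [Secret] declares.
    public static IReadOnlyList<string> MissingKeys(
        IEnumerable<string> requiredConfigs,
        IEnumerable<string> declaredKeys)
    {
        // Assumption: keys compare case-insensitively, as in .NET configuration.
        var declared = new HashSet<string>(declaredKeys, StringComparer.OrdinalIgnoreCase);
        return requiredConfigs.Where(key => !declared.Contains(key)).ToList();
    }
}
```

For the billing-service example, passing the three required keys and the two declared ones returns a single entry, "Stripe:WebhookUrl", which is the key the diagnostic names.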


Cross-DSL Integration

The Deployment DSL does not operate in isolation. It is one node in a network of 22 sub-DSLs that share meta-primitives and cross-reference each other.

Deployment to Observability

Every [DeploymentApp] should have a matching [HealthCheck] in the Observability DSL. The Deployment generator looks for health checks and wires them into the generated Kubernetes readiness probes and Docker health checks. If the Observability DSL declares a [Metric("order_api_request_latency_seconds", ...)], the Canary strategy references it as a promotion gate.

// In the Observability DSL
[HealthCheck("order-api",
    Endpoint = "/health/ready",
    Timeout = "00:00:05",
    Retries = 3)]
public void OrderApiHealthCheck() { }

// The Deployment generator discovers this and wires it into:
// - k8s readinessProbe (Container tier, Kubernetes manifests)
// - Docker healthcheck (Container tier, Docker Compose)
// - Argo Rollout analysis template (Cloud tier)

Deployment to Configuration

RequiredConfigs on [DeploymentApp] are verified against [ConfigTransform] attributes. The generated Docker Compose maps them to environment variables. The Kubernetes manifests reference ConfigMaps and Secrets. The Terraform output creates SSM parameters or Key Vault secrets.
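
The generated Compose file uses variable names such as ConnectionStrings__OrderDb. That follows the .NET environment variable convention, in which a double underscore stands for the ":" section separator. A sketch of the mapping, with EnvVarNames and FromConfigKey as hypothetical names:

```csharp
using System;

// Hypothetical helper, not part of Ops.Deployment.Generators.
public static class EnvVarNames
{
    // "ConnectionStrings:OrderDb" -> "ConnectionStrings__OrderDb"
    public static string FromConfigKey(string configKey)
    {
        if (string.IsNullOrWhiteSpace(configKey))
            throw new ArgumentException("Config key must not be empty.", nameof(configKey));

        // The .NET environment variable configuration provider reads "__" back as ":".
        return configKey.Replace(":", "__");
    }
}
```

The double underscore is used because ":" is not a valid character in environment variable names on every platform.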

Deployment to Capacity

When a Capacity DSL (one of the 22 sub-DSLs) declares autoscale rules, the Deployment generator incorporates them:

// In the Capacity DSL
[AutoScale("order-api",
    MinReplicas = 2, MaxReplicas = 10,
    CpuThresholdPercent = 70,
    ScaleUpCooldown = "00:03:00")]
public void OrderApiScaling() { }

// The Deployment generator merges this into:
// - HPA resource in Kubernetes manifests
// - aws_appautoscaling_target in Terraform
// - Replicas field becomes the initial count, not the fixed count

Deployment to Migration

The [DeploymentGate] with Kind = GateKind.MigrationComplete references a specific migration step. The generator verifies that the referenced migration exists in the Migration DSL and that its order is consistent with the deployment DAG.


What Changed

Before: a wiki page that nobody reads, a kubectl command that somebody types, a Slack message that says "done."

After: dotnet build produces the DAG, validates ordering, cross-references configs, emits Kubernetes manifests, generates Argo Rollout canary steps, and fails the build if anything is inconsistent.

The deployment specification is code. It compiles. It is versioned. It is reviewed in pull requests. It cannot drift from reality because reality is generated from it.

That is what "orchestration as a DAG" means. Not a picture in a wiki. A topological sort in a source generator.
