# Comparison -- Terraform, Helm, Pulumi, and Ops.Dsl
This is not a hit piece on Terraform. Terraform is excellent at what it does. The question is what it does versus what the Ops DSL approach does, and where each belongs.
## The Comparison Matrix
| Dimension | Terraform | Helm | Pulumi | Ops.Dsl |
|---|---|---|---|---|
| Language | HCL | Go templates + YAML | TypeScript / Python / Go / C# | C# attributes |
| Primary purpose | Infrastructure provisioning | Kubernetes manifest templating | Infrastructure as real code | Operational knowledge specification |
| State management | `terraform.tfstate` (remote backend) | Helm release history (etcd) | State backend (S3, Azure Blob, etc.) | Compiler (stateless) |
| Validation | `terraform plan` | `helm lint` + `helm template` | Type checker (TypeScript, etc.) | Roslyn analyzers (compile-time) |
| Cross-resource validation | Limited (`depends_on`, `count`) | None (templates are independent) | Limited (within same stack) | Full (cross-DSL graph across 22 DSLs) |
| Operational knowledge | No | No | No | Yes (SLOs, chaos, compliance, incident, lifecycle...) |
| Generated from domain types | No | No | No | Yes (domain model to deployment) |
| Feedback loop | `terraform plan` (seconds to minutes) | `helm template` (fast) | Compile + plan (seconds to minutes) | `dotnet build` (milliseconds to seconds) |
| Requires cloud credentials | Yes (to plan and apply) | Yes (to install, needs kubeconfig) | Yes (to preview and up) | No (generates files, does not apply them) |
| Learning curve for a C# dev | New language (HCL) | New templating (Go templates + YAML) | Familiar if using C# provider | Zero new syntax (C# attributes) |
## Terraform: Infrastructure Provisioning

Terraform talks to cloud provider APIs. It creates VPCs, databases, load balancers, DNS records, and IAM policies. Its state file tracks what exists in the cloud. `terraform plan` shows what will change. `terraform apply` makes it happen.
```hcl
resource "aws_ecs_service" "order_service" {
  name            = "order-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.order.arn
  desired_count   = 3

  load_balancer {
    target_group_arn = aws_lb_target_group.order.arn
    container_name   = "order-api"
    container_port   = 8080
  }
}
```

This is infrastructure provisioning. It answers: "What cloud resources exist?"
## Helm: Kubernetes Templating
Helm takes Go templates and produces Kubernetes YAML. It manages releases, handles upgrades and rollbacks, and supports values files for environment-specific configuration.
```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.service.name }}
spec:
  replicas: {{ .Values.replicas }}
  template:
    spec:
      containers:
        - name: {{ .Values.service.name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.service.port }}
```

This is manifest templating. It answers: "What should Kubernetes run?"
## Pulumi: Infrastructure as Real Code

Pulumi defines infrastructure in real programming languages: TypeScript, Python, Go, or C#. You get type checking, IDE support, and the full power of a general-purpose language.
```typescript
const service = new aws.ecs.Service("order-service", {
    cluster: cluster.id,
    taskDefinition: taskDef.arn,
    desiredCount: 3,
    loadBalancers: [{
        targetGroupArn: targetGroup.arn,
        containerName: "order-api",
        containerPort: 8080,
    }],
});
```

This is infrastructure as real code. It answers the same question as Terraform ("What cloud resources exist?") but in a language you already know.
## Ops.Dsl: Operational Knowledge Specification
The Ops DSL does not talk to cloud APIs. It does not create resources. It does not manage state. It declares operational knowledge and generates artifacts.
```csharp
[DeploymentApp("order-service",
    Image = "order-api",
    Port = 8080,
    Replicas = 3)]
[CircuitBreaker("payment-gateway",
    FailureThreshold = 5,
    BreakDuration = "60s")]
[ServiceLevelObjective("order-api-latency",
    Target = 99.5,
    Window = "30d",
    Threshold = "500ms")]
[ChaosExperiment("payment-timeout",
    FaultKind = FaultKind.Timeout,
    TargetService = "payment-gateway")]
```

This is operational knowledge. It answers: "How should this service behave, and what happens when things go wrong?"
## The Key Distinction
Terraform, Helm, and Pulumi provision infrastructure. They create the resources that your service runs on.
Ops.Dsl specifies operational behavior. It declares how the service should be deployed, monitored, tested, secured, scaled, and recovered. It then generates the artifacts that Terraform, Helm, and Pulumi consume.
They are complementary, not competitive. Ops.Dsl sits above them in the abstraction stack:
```
Layer 4: Ops.Dsl (operational knowledge in C# attributes)
        |
        | generates
        v
Layer 3: Terraform modules, Helm values, Pulumi programs
        |
        | provisions / templates
        v
Layer 2: Cloud APIs, Kubernetes API
        |
        | creates
        v
Layer 1: Running infrastructure (VMs, containers, databases)
```

When you write `[ContainerSpec("order-api", CpuRequest = "250m", CpuLimit = "1000m")]`, the source generator emits a Terraform `aws_ecs_task_definition` resource with the matching CPU limits, a Kubernetes Deployment manifest with the matching resource requests, and a Docker Compose service with the matching resource constraints. One attribute, three output formats, three tiers.
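The one-attribute-to-three-formats fan-out can be sketched in miniature. This is not the actual source generator (the series describes a Roslyn-based one); it is an illustrative model, and the `ContainerSpec` shape and the simplified output structures are assumptions made for the sketch:

```typescript
// Illustrative fan-out: one container spec, three deployment formats.
// The shapes below are simplified stand-ins, not the real generated artifacts.
interface ContainerSpec {
  name: string;
  cpuRequest: string; // Kubernetes millicore notation, e.g. "250m"
  cpuLimit: string;   // e.g. "1000m"
}

// "250m" -> 250 millicores
const millicores = (s: string): number => parseInt(s, 10);

function toEcsTaskDefinition(spec: ContainerSpec) {
  // ECS expresses CPU in units of 1/1024 vCPU, so 1000m maps to 1024 units.
  return {
    family: spec.name,
    cpu: String(Math.round((millicores(spec.cpuLimit) / 1000) * 1024)),
  };
}

function toKubernetesDeployment(spec: ContainerSpec) {
  // Kubernetes keeps the millicore strings as-is.
  return {
    kind: "Deployment",
    resources: {
      requests: { cpu: spec.cpuRequest },
      limits: { cpu: spec.cpuLimit },
    },
  };
}

function toComposeService(spec: ContainerSpec) {
  // Compose deploy.resources.limits.cpus takes fractional CPUs.
  return {
    deploy: { resources: { limits: { cpus: millicores(spec.cpuLimit) / 1000 } } },
  };
}

const spec: ContainerSpec = { name: "order-api", cpuRequest: "250m", cpuLimit: "1000m" };
const ecs = toEcsTaskDefinition(spec);       // cpu: "1024"
const k8s = toKubernetesDeployment(spec);    // limits.cpu: "1000m"
const compose = toComposeService(spec);      // cpus: 1
```

The point of the sketch is that the three outputs stay consistent by construction: there is one source of truth (the spec) and three pure mappings, so the CPU limit cannot drift between formats.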
## What Terraform Cannot Do
Terraform is excellent at provisioning. It is not designed for operational knowledge. Consider what Terraform cannot express:
**Cross-resource semantic validation.** Terraform can check that a security group exists before an EC2 instance references it. It cannot check that a service with a circuit breaker has a corresponding chaos experiment. It cannot check that an SLO threshold is consistent with the performance budget for each endpoint. It cannot check that the autoscale max replicas times the container CPU limit stays within the cost budget.

These are not infrastructure checks. They are operational knowledge checks. They require understanding the relationships between concepts from different domains: resilience, performance, cost, chaos engineering. Terraform's dependency graph tracks resource creation order, not operational semantics.

**Domain-aware generation.** Terraform does not know what an `[AggregateRoot]` is. It cannot generate a Kubernetes Deployment from a DDD attribute. It cannot link a domain event to a Prometheus metric to an alert rule to an escalation policy. These cross-domain links are the core value of the Ops DSL approach.

**InProcess tier.** Terraform operates at the cloud tier. It creates infrastructure. It does not generate C# decorators, DI registrations, middleware, or health check implementations. The InProcess tier -- where a developer runs `dotnet test` and gets chaos injection, circuit breakers, and performance budgets without Docker or Kubernetes -- does not exist in Terraform's world.

**Compile-time feedback.** `terraform plan` requires cloud credentials and network access. It takes seconds to minutes. Roslyn analyzers run during `dotnet build`, require nothing external, and take milliseconds. The feedback loop difference matters: a developer who gets an analyzer warning while typing is more likely to fix it than a developer who discovers the issue during a CI pipeline ten minutes later.
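The scaling-versus-budget check mentioned above can be written as ordinary code once both concepts live in one model. A hedged sketch of the rule only: the real validation is described as running inside Roslyn analyzers over attribute metadata, and the type names and vCPU arithmetic here are assumptions made for illustration:

```typescript
// Illustrative cross-domain check: autoscaling x resource limits vs. cost budget.
// Terraform's dependency graph cannot express this rule because it relates
// concepts from two different domains (scaling and cost), not resource ordering.
interface AutoscalePolicy { maxReplicas: number }
interface ContainerLimits { cpuLimitMillicores: number }
interface CostBudget { maxVcpus: number }

function validateScalingWithinBudget(
  scale: AutoscalePolicy,
  limits: ContainerLimits,
  budget: CostBudget,
): string[] {
  const errors: string[] = [];
  // Worst case: every replica pegged at its CPU limit.
  const worstCaseVcpus = (scale.maxReplicas * limits.cpuLimitMillicores) / 1000;
  if (worstCaseVcpus > budget.maxVcpus) {
    errors.push(
      `autoscale can reach ${worstCaseVcpus} vCPUs, exceeding the budget of ${budget.maxVcpus}`,
    );
  }
  return errors;
}

// 10 replicas x 1000m = 10 vCPUs against an 8-vCPU budget: flagged.
const violations = validateScalingWithinBudget(
  { maxReplicas: 10 },
  { cpuLimitMillicores: 1000 },
  { maxVcpus: 8 },
);
```

The rule itself is trivial; what matters is where it runs. Encoded as an analyzer, it fires at compile time, before any plan or apply touches a cloud API.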
## What Ops.Dsl Cannot Do

In fairness:
**Ops.Dsl does not provision infrastructure.** It generates Terraform modules, but it does not run `terraform apply`. It generates Kubernetes manifests, but it does not run `kubectl apply`. The deployment pipeline still needs Terraform, Helm, or Pulumi (or plain `kubectl`) to apply the generated artifacts to real infrastructure.

**Ops.Dsl does not manage state.** Terraform's state file tracks what exists. If someone manually deletes a resource, `terraform plan` detects the drift. Ops.Dsl has no state file because it does not interact with cloud APIs. Drift detection between the generated artifacts and the actual infrastructure is the responsibility of the deployment pipeline, not the DSL.

**Ops.Dsl does not replace cloud-specific features.** Terraform providers support thousands of resource types across dozens of cloud providers. Ops.Dsl generates a subset of those resource types -- the ones that map to operational concerns. If you need a specific AWS Glue job configuration or a particular Azure Cognitive Services setup, you write Terraform (or Pulumi) for that. Ops.Dsl covers the operational cross-cutting concerns, not every possible cloud resource.
## Why Terraform Does Not Belong on a Dev Machine
This is an opinion, stated clearly.
Terraform talks to cloud APIs. On a developer's workstation, cloud API calls mean:
- **Cloud credentials on the developer machine.** A security concern. Every developer has access to create and destroy infrastructure.
- **Cost per developer per experiment.** Every `terraform apply` creates real resources that cost real money. A developer experimenting with autoscale configurations creates real autoscalers.
- **Shared state conflicts.** Two developers running `terraform apply` against the same state file corrupt it. Remote state with locking mitigates this but adds complexity.
- **Network dependency.** `terraform plan` requires internet access to the cloud provider API. On an airplane, in a coffee shop with bad WiFi, on a VPN that routes cloud traffic through a corporate proxy -- Terraform does not work.
The Ops DSL three-tier model addresses this directly:
- **InProcess tier (developer machine):** DI decorators, health checks, policies. `dotnet build` and `dotnet test`. No Docker, no cloud, no credentials.
- **Container tier (CI or local Docker):** Docker Compose, Prometheus, Toxiproxy. `docker compose up`. No cloud credentials.
- **Cloud tier (deployment pipeline):** Terraform modules, Kubernetes manifests. Applied by the pipeline with pipeline credentials. Not by developers.
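What "InProcess tier" means concretely: the generated artifact is ordinary application code that runs under plain unit tests. A hand-written sketch of the idea (the actual shape of the generated code is an assumption; here a minimal circuit breaker stands in for it):

```typescript
// Minimal in-process circuit breaker: after `failureThreshold` consecutive
// failures, the circuit opens and further calls fail fast without touching
// the dependency. Runs anywhere `dotnet test` (or here, a TS test) runs:
// no Docker, no cloud, no credentials.
class CircuitBreaker {
  private consecutiveFailures = 0;

  constructor(private readonly failureThreshold: number) {}

  execute<T>(action: () => T): T {
    // Open circuit: reject immediately instead of calling the dependency.
    if (this.consecutiveFailures >= this.failureThreshold) {
      throw new Error("circuit open: failing fast");
    }
    try {
      const result = action();
      this.consecutiveFailures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.consecutiveFailures += 1;
      throw err;
    }
  }
}

// Simulate a dependency that is down: with a threshold of 2, the first two
// calls reach the dependency and fail; the third fails fast.
const breaker = new CircuitBreaker(2);
const flaky = () => { throw new Error("payment-gateway timeout"); };
let failFast = 0;
for (let i = 0; i < 3; i++) {
  try {
    breaker.execute(flaky);
  } catch (e) {
    if ((e as Error).message.startsWith("circuit open")) failFast += 1;
  }
}
```

In the Ops DSL model a wrapper like this would be emitted from the `[CircuitBreaker]` attribute shown earlier; it is hand-written here only to show that the InProcess tier is plain code, exercisable in a unit test with no infrastructure at all.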
The developer works at the InProcess and Container tiers. The deployment pipeline works at the Cloud tier. The Ops DSL generates artifacts for all three. The developer never needs `terraform` on their PATH.
## The Convergence: Pulumi and Ops.Dsl

Pulumi is the closest existing tool to the Ops DSL approach. It uses real programming languages. It has type checking. The C# provider lets you define infrastructure in C#. Of the three tools, the gap to Ops.Dsl is narrowest here.
But the gap still exists:
**Pulumi defines infrastructure; Ops.Dsl defines operational behavior.** Pulumi's C# program creates an ECS task definition. Ops.Dsl's C# attributes declare an SLO, a chaos experiment, a circuit breaker, and an escalation policy -- and also generate the ECS task definition as one of many outputs.

**Pulumi does not cross-validate operational concerns.** A Pulumi program can create both an autoscaler and a budget alert, but it does not validate that the autoscaler's max replicas times the container CPU limit stays within the budget. That validation requires understanding the semantic relationship between scaling and cost -- which is a domain-specific concern that Pulumi's type system does not encode.

**Pulumi does not integrate with domain models.** A Pulumi program does not know about `[AggregateRoot("Order")]` or `[Feature("OrderCancellation")]`. The domain model and the infrastructure definition are separate codebases. The Ops DSL approach puts them in the same compilation unit so the source generator can cross-reference them.
The convergence direction is clear. Pulumi brought real languages to infrastructure. The Ops DSL approach adds the missing pieces:
- **Domain awareness.** Cross-references to DDD entities, content types, workflow states, and requirement features.
- **Operational knowledge.** SLOs, chaos experiments, compliance frameworks, incident management -- concepts that are not infrastructure but are essential to operating infrastructure.
- **Compile-time validation.** Roslyn analyzers that validate relationships between operational concerns at the speed of typing, not at the speed of `pulumi preview`.
## When to Use What
**Use Terraform when:**
- You have a dedicated infrastructure team that manages cloud resources
- Your infrastructure is provisioned independently of application code
- You need multi-cloud provisioning with a single tool
- Your operational knowledge lives in wikis, runbooks, and tribal memory (the status quo)
**Use Helm when:**
- You deploy to Kubernetes and need templated manifests
- Your deployment pipeline expects Helm charts
- You need Helm's release management (upgrade, rollback, history)
- Your teams already know Helm's templating
**Use Pulumi when:**
- Your development team provisions infrastructure (dev teams owning their infra)
- You want type safety in infrastructure definitions
- You want to use the same language for application code and infrastructure
- You are comfortable with Pulumi's state management
**Use Ops.Dsl when:**
- You want operational knowledge compiled into the application, not documented beside it
- You want cross-DSL validation (resilience + chaos + performance + cost + compliance)
- You want three-tier progression (develop locally with InProcess, test with Container, deploy with Cloud)
- You want to generate Terraform, Helm, and Kubernetes manifests from typed attributes
- You want domain-aware operations (DDD to deployment, requirements to chaos, content to GDPR)
**Use Ops.Dsl WITH Terraform/Helm/Pulumi when:**
- Ops.Dsl generates the Terraform modules, Helm values, or Pulumi programs
- The deployment pipeline applies the generated artifacts using Terraform, Helm, or Pulumi
- The developer works at InProcess and Container tiers
- The pipeline works at Cloud tier
- Nobody writes Terraform by hand for operational cross-cutting concerns -- the DSL generates it
## The Honest Assessment
Terraform has a decade of ecosystem maturity. Thousands of providers. Hundreds of thousands of modules. A massive community. Production battle-testing at every scale.
Helm is the de facto standard for Kubernetes packaging. Every CNCF project ships a Helm chart. Every Kubernetes tutorial uses Helm.
Pulumi has real-language infrastructure with a growing ecosystem and enterprise adoption.
Ops.Dsl has none of that. It is an architecture. It is a design. It is a series of blog posts describing what the attributes, generators, and analyzers look like. The ecosystem does not exist yet in the way that Terraform's ecosystem exists.
The question is not "which is better today." The question is "which abstraction is correct." Terraform's abstraction is cloud resources. Helm's abstraction is Kubernetes templates. Pulumi's abstraction is infrastructure as code.
The Ops DSL abstraction is operational knowledge as types.
If operational knowledge -- SLOs, chaos experiments, escalation policies, compliance controls, data governance rules, cost budgets, capacity plans, incident management procedures -- belongs in the type system, then the Ops DSL approach is the correct abstraction. Terraform, Helm, and Pulumi become the output layer, not the specification layer.
The specification layer is where errors are cheapest to catch. An analyzer warning during `dotnet build` costs zero. A `terraform plan` failure during CI costs minutes. A production incident caused by an unconfigured escalation policy costs hours, money, and customer trust.
The further left you move the specification, the cheaper the errors. The Ops DSL moves operational specifications all the way to the compiler. That is the thesis. Everything in this series is evidence for it.