Ops.LoadTesting -- Scenarios as Types, Across Three Tiers

The Problem

Load testing has a shelf life of approximately one sprint.

Someone writes a k6 script in tests/load/order-load.js. It targets POST /api/orders with a hardcoded JSON body from March.
In April, the API changes: shippingAddress becomes a nested object. The load test script still sends the old flat structure. The API returns 400. The load test "passes" because nobody checks the error rate.
In June, the performance budget for order creation changes from 200ms to 150ms at P95. The k6 threshold still says p(95)<200. The test passes. Production is slow.
In September, the team decides to run a spike test. Nobody remembers how to configure k6 for spike patterns. Someone copies a script from Stack Overflow and changes the URL.

The root causes:

Load test scripts are disconnected from the code they test. When the API changes, the scripts do not update.
Thresholds are manually maintained. They drift from the performance budgets declared elsewhere.
There is no tier progression. Developers cannot run a fast micro-benchmark locally and then scale to 500 concurrent users in the cloud from the same definition.
Traffic patterns are ad hoc. There is no vocabulary for "spike" or "soak" or "staircase" — each script reinvents its own ramp-up logic.

Attribute Definitions

// =================================================================
// Ops.LoadTesting.Lib -- Load Testing DSL Attributes
// =================================================================

/// Declares a load test scenario.
/// The Tier determines what gets generated: BenchmarkDotNet, k6, or distributed k6.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class LoadTestAttribute : Attribute
{
    public string Name { get; }
    public OpsExecutionTier Tier { get; init; } = OpsExecutionTier.Container;
    public string Description { get; init; } = "";
    public string[] Tags { get; init; } = [];

    public LoadTestAttribute(string name) => Name = name;
}

/// Traffic shape for load generation.
public enum TrafficShape
{
    Constant,     // steady load for the entire duration
    Ramp,         // linear ramp from 0 to ConcurrentUsers
    Spike,        // steady baseline, spike to PeakMultiplier, return to baseline
    StairStep,    // incremental steps (10, 20, 30, ...) up to ConcurrentUsers
    Soak          // moderate load over extended duration (hours)
}

/// Defines how many users, for how long, and the ramp-up profile.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class LoadProfileAttribute : Attribute
{
    public int ConcurrentUsers { get; init; } = 10;
    public int RampUpSeconds { get; init; } = 30;
    public int DurationSeconds { get; init; } = 300;
    public int SteadyStateSeconds { get; init; } = 240;

    public LoadProfileAttribute() { }
}

/// Shape of the traffic and peak behavior.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class TrafficPatternAttribute : Attribute
{
    public TrafficShape Shape { get; }
    public double PeakMultiplier { get; init; } = 3.0;
    public int PeakDurationSeconds { get; init; } = 60;
    public int StepCount { get; init; } = 5;
    public int StepDurationSeconds { get; init; } = 60;

    public TrafficPatternAttribute(TrafficShape shape) => Shape = shape;
}

/// An endpoint targeted by the load test.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class LoadTestEndpointAttribute : Attribute
{
    public string HttpMethod { get; }
    public string Path { get; }
    public string PayloadGenerator { get; init; } = "";
    public string[] Headers { get; init; } = [];
    public double WeightPercent { get; init; } = 100.0;

    public LoadTestEndpointAttribute(string httpMethod, string path)
    {
        HttpMethod = httpMethod;
        Path = path;
    }
}

/// Pass/fail criteria for the load test.
/// If omitted, the generator reads from [PerformanceBudget] on the target endpoint.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public sealed class LoadTestThresholdAttribute : Attribute
{
    public string Endpoint { get; init; } = "";
    public int P95Ms { get; init; }
    public int P99Ms { get; init; }
    public double MaxErrorRate { get; init; } = 0.01;
    public int MaxP50Ms { get; init; }

    public LoadTestThresholdAttribute() { }
}
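
The traffic shapes above map onto k6 stage lists. The following is a minimal JavaScript sketch of that expansion, not the generator's actual code. The function name buildStages and the options transitionSeconds and spikeAtSeconds are assumptions introduced for illustration; they do not appear in the attributes.

```javascript
// Hypothetical helper, not the real generator: expands a load profile and a
// traffic pattern into a k6 'ramping-vus' style { startVUs, stages } object.
const stage = (seconds, target) => ({ duration: `${seconds}s`, target });

function buildStages(profile, pattern, opts = {}) {
  const { concurrentUsers: users, rampUpSeconds: ramp, durationSeconds: total } = profile;
  // Assumed knobs (not part of the attributes):
  const transition = opts.transitionSeconds ?? 10;              // time to move between load levels
  const spikeAt = opts.spikeAtSeconds ?? Math.floor(total / 2); // when the spike begins

  switch (pattern.shape) {
    case 'Constant':
      // Start at full load and hold it for the whole run.
      return { startVUs: users, stages: [stage(total, users)] };

    case 'Ramp':
    case 'Soak':
      // Linear ramp, then hold for the rest of the run (Soak simply runs longer).
      return { startVUs: 0, stages: [stage(ramp, users), stage(total - ramp, users)] };

    case 'Spike': {
      const peak = Math.round(users * pattern.peakMultiplier);
      const hold = pattern.peakDurationSeconds;
      const after = total - spikeAt - 2 * transition - hold;
      return {
        startVUs: 0,
        stages: [
          stage(ramp, users),           // ramp to baseline
          stage(spikeAt - ramp, users), // steady state before the spike
          stage(transition, peak),      // spike up
          stage(hold, peak),            // hold the peak
          stage(transition, users),     // return to baseline
          stage(after, users),          // steady state after the spike
        ],
      };
    }

    case 'StairStep': {
      const stages = [];
      for (let i = 1; i <= pattern.stepCount; i++) {
        const target = Math.round((users * i) / pattern.stepCount);
        stages.push(stage(transition, target));                               // climb to the step
        stages.push(stage(pattern.stepDurationSeconds - transition, target)); // hold the step
      }
      return { startVUs: 0, stages };
    }

    default:
      throw new Error(`Unknown traffic shape: ${pattern.shape}`);
  }
}

// Example: a 50-user spike profile, with the spike assumed to start at 150s.
const containerSpike = buildStages(
  { concurrentUsers: 50, rampUpSeconds: 30, durationSeconds: 300 },
  { shape: 'Spike', peakMultiplier: 3.0, peakDurationSeconds: 60 },
  { spikeAtSeconds: 150 },
);
console.log(containerSpike.stages);
```

Keeping the expansion in one function means every scenario that declares Spike or StairStep gets the same ramp logic instead of reinventing it per script.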

Usage: OrderService Peak Traffic -- Three Tiers

Tier 1: InProcess Micro-Benchmark

For local development. Tests individual methods, not HTTP endpoints. Each benchmarked call has a budget of a few milliseconds.

[LoadTest("order-price-benchmark", Tier = OpsExecutionTier.InProcess,
    Description = "Micro-benchmark for order price calculation hot path")]
[LoadProfile(ConcurrentUsers = 1, DurationSeconds = 0)]  // single-threaded benchmark
public class OrderPriceBenchmark
{
    [BenchmarkTarget(nameof(OrderPriceCalculator.CalculateTotal),
        MaxDurationMs = 5, MaxAllocationsBytes = 1024,
        RegressionThresholdPercent = 15.0)]
    public void PriceCalculation() { }

    [BenchmarkTarget(nameof(OrderValidator.Validate),
        MaxDurationMs = 2, MaxAllocationsBytes = 512)]
    public void OrderValidation() { }
}

Tier 2: Container Load Test

For CI. Tests the full HTTP stack running in Docker. 50 concurrent users, spike pattern.

[LoadTest("order-api-container-load", Tier = OpsExecutionTier.Container,
    Description = "Load test against containerized OrderService API")]
[LoadProfile(
    ConcurrentUsers = 50,
    RampUpSeconds = 30,
    DurationSeconds = 300,
    SteadyStateSeconds = 240)]
[TrafficPattern(TrafficShape.Spike, PeakMultiplier = 3.0, PeakDurationSeconds = 60)]

// Endpoints with traffic distribution
[LoadTestEndpoint("POST", "/api/orders",
    PayloadGenerator = nameof(OrderLoadData.CreateOrderPayload),
    WeightPercent = 30)]
[LoadTestEndpoint("GET", "/api/orders/{id}",
    WeightPercent = 50)]
[LoadTestEndpoint("GET", "/api/orders?status=pending",
    WeightPercent = 20)]

// Thresholds — or omit and inherit from [PerformanceBudget]
[LoadTestThreshold(Endpoint = "POST /api/orders", P95Ms = 150, P99Ms = 300, MaxErrorRate = 0.01)]
[LoadTestThreshold(Endpoint = "GET /api/orders/{id}", P95Ms = 50, P99Ms = 100, MaxErrorRate = 0.001)]
[LoadTestThreshold(Endpoint = "GET /api/orders?status=pending", P95Ms = 200, P99Ms = 400, MaxErrorRate = 0.01)]

public class OrderApiContainerLoad { }
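
The WeightPercent values describe a traffic mix: 30% creates, 50% single-order reads, 20% list queries. A minimal JavaScript sketch of how such weights could be turned into a selector follows. The name buildSelector and the rule that weights must sum to 100 are assumptions for illustration; the source does not say how the generator validates weights.

```javascript
// Hypothetical sketch: validate WeightPercent values and pick an endpoint
// from a roll in [0, 100), using cumulative weights.
function buildSelector(endpoints) {
  const total = endpoints.reduce((sum, e) => sum + e.weightPercent, 0);
  if (Math.abs(total - 100) > 1e-9) {
    throw new Error(`WeightPercent values sum to ${total}, expected 100`);
  }

  // Upper bound of each endpoint's slice of the roll range: 30, 80, 100.
  let upper = 0;
  const slices = endpoints.map((e) => ({ endpoint: e, upper: (upper += e.weightPercent) }));

  // The first slice whose upper bound exceeds the roll wins; fall back to the
  // last slice to absorb any floating-point edge case at the top of the range.
  return (roll) => (slices.find((s) => roll < s.upper) ?? slices[slices.length - 1]).endpoint;
}

const pick = buildSelector([
  { method: 'POST', path: '/api/orders', weightPercent: 30 },
  { method: 'GET', path: '/api/orders/{id}', weightPercent: 50 },
  { method: 'GET', path: '/api/orders?status=pending', weightPercent: 20 },
]);

const chosen = pick(Math.random() * 100);
console.log(`${chosen.method} ${chosen.path}`);
```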

Tier 3: Cloud Distributed Load Test

For pre-release validation. 500 concurrent users across 10 VMs. Same endpoints, same thresholds.

[LoadTest("order-api-cloud-load", Tier = OpsExecutionTier.Cloud,
    Description = "Distributed load test against staging environment")]
[LoadProfile(
    ConcurrentUsers = 500,
    RampUpSeconds = 120,
    DurationSeconds = 1800,
    SteadyStateSeconds = 1500)]
[TrafficPattern(TrafficShape.StairStep,
    StepCount = 10,
    StepDurationSeconds = 180)]

[LoadTestEndpoint("POST", "/api/orders",
    PayloadGenerator = nameof(OrderLoadData.CreateOrderPayload),
    WeightPercent = 30)]
[LoadTestEndpoint("GET", "/api/orders/{id}", WeightPercent = 50)]
[LoadTestEndpoint("GET", "/api/orders?status=pending", WeightPercent = 20)]

// Same thresholds as Container — the performance contract does not change with scale
[LoadTestThreshold(Endpoint = "POST /api/orders", P95Ms = 150, P99Ms = 300)]
[LoadTestThreshold(Endpoint = "GET /api/orders/{id}", P95Ms = 50, P99Ms = 100)]

public class OrderApiCloudLoad { }

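Three classes. The Container and Cloud tiers target the same endpoints and declare the same latency thresholds for order creation and retrieval; they differ mainly in concurrency, duration, and traffic shape. The InProcess tier benchmarks methods rather than endpoints. The generator produces completely different artifacts for each tier.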

Generated Artifacts

Tier 1: InProcess -- OrderBenchmarks.g.cs

// <auto-generated by Ops.LoadTesting.Generator />
using System;
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Reports;

[MemoryDiagnoser]
[SimpleJob(warmupCount: 3, iterationCount: 10)]
[RegressionValidator(baselinePath: "benchmarks/order-price-baseline.json")]
public class OrderPriceBenchmarks
{
    private OrderPriceCalculator _calculator = null!;
    private OrderValidator _validator = null!;
    private Order _testOrder = null!;

    [GlobalSetup]
    public void Setup()
    {
        _calculator = new OrderPriceCalculator();
        _validator = new OrderValidator();
        _testOrder = OrderTestData.CreateTypicalOrder();
    }

    [Benchmark]
    public decimal CalculateTotal()
    {
        return _calculator.CalculateTotal(_testOrder);
    }

    [Benchmark]
    public ValidationResult Validate()
    {
        return _validator.Validate(_testOrder);
    }
}

public static class OrderPriceBenchmarkRunner
{
    public static int Run(string[] args)
    {
        var summary = BenchmarkRunner.Run<OrderPriceBenchmarks>();
        return ValidateThresholds(summary) ? 0 : 1;
    }

    private static bool ValidateThresholds(Summary summary)
    {
        var violations = new List<string>();

        foreach (var report in summary.Reports)
        {
            var name = report.BenchmarkCase.Descriptor.WorkloadMethodDisplayInfo;

            if (name == "CalculateTotal")
            {
                if (report.ResultStatistics?.Mean > 5_000_000) // 5ms in ns
                    violations.Add($"CalculateTotal: {report.ResultStatistics.Mean / 1_000_000:F2}ms > 5ms budget");
                if (report.GcStats.GetTotalAllocatedBytes(excludeAllocationQuantumSideEffects: true) > 1024)
                    violations.Add($"CalculateTotal: allocations exceed 1024 bytes");
            }

            if (name == "Validate")
            {
                if (report.ResultStatistics?.Mean > 2_000_000) // 2ms in ns
                    violations.Add($"Validate: {report.ResultStatistics.Mean / 1_000_000:F2}ms > 2ms budget");
                if (report.GcStats.GetTotalAllocatedBytes(excludeAllocationQuantumSideEffects: true) > 512)
                    violations.Add($"Validate: allocations exceed 512 bytes");
            }
        }

        foreach (var v in violations) Console.Error.WriteLine($"THRESHOLD VIOLATION: {v}");
        return violations.Count == 0;
    }
}
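
The RegressionThresholdPercent = 15.0 setting on the benchmark target implies a comparison against a stored baseline. The arithmetic can be sketched in JavaScript as follows. The function name checkRegression is hypothetical, and the format of the baseline file is not shown in the source, so the sketch takes plain numbers.

```javascript
// Hypothetical regression check: compares a current mean against a baseline
// mean (both in nanoseconds) and flags slowdowns beyond thresholdPercent.
function checkRegression(baselineMeanNs, currentMeanNs, thresholdPercent) {
  if (baselineMeanNs <= 0) {
    throw new Error('baseline mean must be positive');
  }
  // Multiply before dividing to keep the percentage exact for round inputs.
  const changePercent = ((currentMeanNs - baselineMeanNs) * 100) / baselineMeanNs;
  return { changePercent, regressed: changePercent > thresholdPercent };
}

// A 1.0ms baseline and a 1.2ms current mean, checked against a 15% threshold.
const result = checkRegression(1_000_000, 1_200_000, 15.0);
console.log(result.regressed ? 'regression detected' : 'within threshold');
```

A relative check like this complements the absolute MaxDurationMs budget: a method can stay under 5ms and still have slowed noticeably since the last baseline.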

Tier 2: Container -- k6-order-load.js

The complete generated k6 script. This is the full output, not a sketch.

// Auto-generated by Ops.LoadTesting.Generator
// Source: OrderApiContainerLoad
// Tier: Container
// Traffic: Spike (3.0x peak for 60s)

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';
import { textSummary } from 'https://jslib.k6.io/k6-summary/0.0.1/index.js';

// ── Custom Metrics ────────────────────────────────────────────
const orderCreateDuration = new Trend('order_create_duration', true);
const orderGetDuration = new Trend('order_get_duration', true);
const orderListDuration = new Trend('order_list_duration', true);
const errorRate = new Rate('errors');

// ── Options ───────────────────────────────────────────────────
export const options = {
  scenarios: {
    spike_test: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '30s',  target: 50 },   // ramp to steady state
        { duration: '120s', target: 50 },   // steady state
        { duration: '10s',  target: 150 },  // spike to 3x
        { duration: '60s',  target: 150 },  // hold spike
        { duration: '10s',  target: 50 },   // return to baseline
        { duration: '70s',  target: 50 },   // steady state
      ],
    },
  },

  thresholds: {
    'order_create_duration': [
      { threshold: 'p(95)<150', abortOnFail: true },
      { threshold: 'p(99)<300', abortOnFail: false },
    ],
    'order_get_duration': [
      { threshold: 'p(95)<50',  abortOnFail: true },
      { threshold: 'p(99)<100', abortOnFail: false },
    ],
    'order_list_duration': [
      { threshold: 'p(95)<200', abortOnFail: true },
      { threshold: 'p(99)<400', abortOnFail: false },
    ],
    'errors': [
      { threshold: 'rate<0.01', abortOnFail: true },
    ],
  },
};

// ── Payload Generator ─────────────────────────────────────────
function createOrderPayload() {
  return JSON.stringify({
    customerId: `cust-${Math.floor(Math.random() * 10000)}`,
    items: [
      {
        productId: `prod-${Math.floor(Math.random() * 500)}`,
        quantity: Math.floor(Math.random() * 5) + 1,
        unitPrice: Number((Math.random() * 100 + 5).toFixed(2)), // send a number, not a string
      },
    ],
    shippingAddress: {
      street: '123 Load Test Ave',
      city: 'Testville',
      postalCode: '12345',
      country: 'US',
    },
  });
}

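// ── Per-VU State ──────────────────────────────────────────────
const BASE_URL = __ENV.BASE_URL || 'http://localhost:5000';
// Each k6 VU runs in its own JS runtime, so this array holds only the orders
// created by the current VU; it is not shared across VUs.
const createdOrderIds = [];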

// ── Default Function ──────────────────────────────────────────
export default function () {
  const roll = Math.random() * 100;

  if (roll < 30) {
    // POST /api/orders (30% of traffic)
    const payload = createOrderPayload();
    const res = http.post(`${BASE_URL}/api/orders`, payload, {
      headers: { 'Content-Type': 'application/json' },
    });

    orderCreateDuration.add(res.timings.duration);
    errorRate.add(res.status >= 400);

    const success = check(res, {
      'order created: status 201': (r) => r.status === 201,
      'order created: has id':     (r) => r.json('id') !== undefined,
    });

    if (success && res.json('id')) {
      createdOrderIds.push(res.json('id'));
    }
  } else if (roll < 80) {
    // GET /api/orders/{id} (50% of traffic)
    const orderId = createdOrderIds.length > 0
      ? createdOrderIds[Math.floor(Math.random() * createdOrderIds.length)]
      : 'order-seed-001';

    const res = http.get(`${BASE_URL}/api/orders/${orderId}`);

    orderGetDuration.add(res.timings.duration);
    errorRate.add(res.status >= 400);

    check(res, {
      'order get: status 200': (r) => r.status === 200,
    });
  } else {
    // GET /api/orders?status=pending (20% of traffic)
    const res = http.get(`${BASE_URL}/api/orders?status=pending`);

    orderListDuration.add(res.timings.duration);
    errorRate.add(res.status >= 400);

    check(res, {
      'order list: status 200':   (r) => r.status === 200,
      'order list: is array':     (r) => Array.isArray(r.json()),
    });
  }

  sleep(Math.random() * 2 + 0.5); // think time: 0.5-2.5s
}

// ── Summary ───────────────────────────────────────────────────
export function handleSummary(data) {
  return {
    'stdout': textSummary(data, { indent: '  ', enableColors: true }),
    'results/order-load-summary.json': JSON.stringify(data, null, 2),
  };
}

Tier 3: Cloud -- Distributed k6 on Terraform

terraform/load-test/main.tf

# Auto-generated by Ops.LoadTesting.Generator
# Source: OrderApiCloudLoad
# Tier: Cloud (500 users, 10 VMs, StairStep)

variable "target_url" {
  description = "Staging environment base URL"
  type        = string
}

variable "resource_group_name" {
  type = string
}

variable "location" {
  type    = string
  default = "westeurope"
}

resource "azurerm_container_group" "k6_runners" {
  count               = 10
  name                = "k6-order-load-${count.index}"
  location            = var.location
  resource_group_name = var.resource_group_name
  os_type             = "Linux"
  restart_policy      = "Never"

  container {
    name   = "k6"
    image  = "grafana/k6:0.49.0"
    cpu    = "2"
    memory = "4"

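    commands = [
      "k6", "run",
      # Container groups do not share DNS names, so address the collector by its IP.
      "--out", "influxdb=http://${azurerm_container_group.influx_collector.ip_address}:8086/k6",
      "--tag", "runner=${count.index}",
      "--env", "BASE_URL=${var.target_url}",
      "/scripts/k6-order-load.js"
    ]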

    volume {
      name       = "scripts"
      mount_path = "/scripts"
      secret = {
        "k6-order-load.js" = base64encode(file("${path.module}/k6-order-load-cloud.js"))
      }
    }
  }
}

resource "azurerm_container_group" "influx_collector" {
  name                = "k6-influx-collector"
  location            = var.location
  resource_group_name = var.resource_group_name
  os_type             = "Linux"
  restart_policy      = "Never"

  container {
    name   = "influxdb"
    image  = "influxdb:1.8"  # k6's built-in influxdb output targets the InfluxDB v1 API
    cpu    = "2"
    memory = "8"

    ports {
      port     = 8086
      protocol = "TCP"
    }
  }
}

k6-order-load-cloud.js (StairStep variant)

The Cloud tier generates a different k6 script optimized for distributed execution:

// Auto-generated by Ops.LoadTesting.Generator
// Source: OrderApiCloudLoad -- Cloud tier (distributed across 10 VMs)

export const options = {
  scenarios: {
    stair_step: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
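        // 10 steps x 180s each (10s climb + 170s hold), distributed across 10 runners.
        // ramping-vus interpolates linearly between targets, so each step needs an explicit hold.
        // Each runner gets 1/10 of the VUs
        { duration: '10s', target: 5 },  { duration: '170s', target: 5 },   // step 1: 50 total
        { duration: '10s', target: 10 }, { duration: '170s', target: 10 },  // step 2: 100 total
        { duration: '10s', target: 15 }, { duration: '170s', target: 15 },  // step 3: 150 total
        { duration: '10s', target: 20 }, { duration: '170s', target: 20 },  // step 4: 200 total
        { duration: '10s', target: 25 }, { duration: '170s', target: 25 },  // step 5: 250 total
        { duration: '10s', target: 30 }, { duration: '170s', target: 30 },  // step 6: 300 total
        { duration: '10s', target: 35 }, { duration: '170s', target: 35 },  // step 7: 350 total
        { duration: '10s', target: 40 }, { duration: '170s', target: 40 },  // step 8: 400 total
        { duration: '10s', target: 45 }, { duration: '170s', target: 45 },  // step 9: 450 total
        { duration: '10s', target: 50 }, { duration: '170s', target: 50 },  // step 10: 500 total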
      ],
    },
  },

  thresholds: {
    'order_create_duration': ['p(95)<150', 'p(99)<300'],
    'order_get_duration':    ['p(95)<50',  'p(99)<100'],
    'errors':                ['rate<0.01'],
  },
};

// ... (same endpoint logic as Container tier, reads BASE_URL from env)
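
The per-runner targets in the stages above are the 500-user total divided evenly across 10 runners. When the numbers do not divide evenly, the remainder has to go somewhere. The JavaScript sketch below shows one possible split; the function name perRunnerStepTargets and the remainder policy are assumptions, since the source only shows the evenly divisible case.

```javascript
// Hypothetical sketch: per-runner VU targets for a stair-step profile.
// Each step's total is split across runners; any remainder goes to the
// lowest-indexed runners so the per-step totals stay exact.
function perRunnerStepTargets(totalUsers, stepCount, runnerCount, runnerIndex) {
  const targets = [];
  for (let step = 1; step <= stepCount; step++) {
    const stepTotal = Math.round((totalUsers * step) / stepCount);
    const base = Math.floor(stepTotal / runnerCount);
    const extra = runnerIndex < stepTotal % runnerCount ? 1 : 0;
    targets.push(base + extra);
  }
  return targets;
}

// 500 users, 10 steps, 10 runners: every runner climbs 5, 10, ... 50.
console.log(perRunnerStepTargets(500, 10, 10, 0));
```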

Analyzer Diagnostics

ID	Severity	Rule	Example
LDT001	Error	Load test references nonexistent endpoint	`[LoadTestEndpoint("GET", "/api/products")]` but no controller has `[HttpGet("/api/products")]`
LDT002	Warning	Load profile without matching PerformanceBudget	`[LoadTestEndpoint("POST", "/api/orders")]` but no `[PerformanceBudget("POST /api/orders")]` exists. Thresholds cannot be inherited.
LDT003	Warning	Cloud-tier test without resource budget	`[LoadTest("x", Tier = Cloud)]` exists but no `[CostBudget]` from the Cost DSL covers the load test infrastructure
LDT004	Error	Threshold exceeds performance budget	`[LoadTestThreshold(P95Ms = 200)]` but the `[PerformanceBudget]` for that endpoint says P95 = 150ms. The load test would pass while the budget fails.
LDT005	Info	Soak test duration under 1 hour	`[TrafficPattern(TrafficShape.Soak)]` with `DurationSeconds = 600`. Soak tests under 1 hour rarely surface memory leaks.

LDT004 is critical. It prevents the most common load testing lie: "the load test passed" when the thresholds in the test are looser than the actual performance contract.
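
The LDT004 rule reduces to a comparison between two numbers per percentile. The real analyzer would run at compile time over the C# attributes; the JavaScript model below only illustrates the logic. The function name checkLdt004 and the object shapes are hypothetical.

```javascript
// Hypothetical model of the LDT004 rule: a declared threshold must not be
// looser than the performance budget for the same endpoint.
function checkLdt004(threshold, budget) {
  const diagnostics = [];
  for (const key of ['p95Ms', 'p99Ms']) {
    const declared = threshold[key];
    const budgeted = budget[key];
    // 0 or undefined means "not set" on either side, so there is nothing to compare.
    if (!declared || !budgeted) continue;
    if (declared > budgeted) {
      diagnostics.push({
        id: 'LDT004',
        severity: 'Error',
        message: `${threshold.endpoint}: ${key} threshold ${declared}ms is looser than budget ${budgeted}ms`,
      });
    }
  }
  return diagnostics;
}

// The example from the table: threshold P95 = 200ms against a 150ms budget.
const findings = checkLdt004(
  { endpoint: 'POST /api/orders', p95Ms: 200 },
  { endpoint: 'POST /api/orders', p95Ms: 150 },
);
findings.forEach((d) => console.log(`${d.id} ${d.severity}: ${d.message}`));
```

A threshold tighter than the budget is allowed in this model; only a looser one is flagged, because that is the case where the test can pass while the contract is broken.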

Cross-DSL Integration

LoadTesting to Performance

The generator reads [PerformanceBudget] attributes from the Performance DSL to set default thresholds. If a [LoadTestEndpoint] targets POST /api/orders and a [PerformanceBudget("POST /api/orders", P95Ms = 150)] exists, the k6 threshold defaults to p(95)<150, even if [LoadTestThreshold] is omitted.
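
The defaulting rule can be sketched as a small resolution function in JavaScript. The name resolveThresholds is hypothetical, and the per-field fallback (an explicit P95 combined with a budgeted P99) is an assumption; the source only states the case where the threshold attribute is omitted entirely.

```javascript
// Hypothetical sketch: explicit threshold values win; anything left unset
// falls back to the performance budget declared for the same endpoint.
function resolveThresholds(endpoint, explicitThresholds, budgets) {
  const explicit = explicitThresholds.find((t) => t.endpoint === endpoint) ?? {};
  const budget = budgets.find((b) => b.endpoint === endpoint) ?? {};

  // A value of 0 is treated as "not set", mirroring an unassigned int property.
  const p95Ms = explicit.p95Ms || budget.p95Ms;
  const p99Ms = explicit.p99Ms || budget.p99Ms;

  const expressions = [];
  if (p95Ms) expressions.push(`p(95)<${p95Ms}`);
  if (p99Ms) expressions.push(`p(99)<${p99Ms}`);
  return expressions;
}

// No explicit threshold: the budget supplies the k6 expression.
const inherited = resolveThresholds(
  'POST /api/orders',
  [],
  [{ endpoint: 'POST /api/orders', p95Ms: 150 }],
);
console.log(inherited);
```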

LoadTesting to Chaos

Load tests and chaos experiments compose. A spike load test running simultaneously with a latency fault injection tests whether the circuit breaker trips before the error budget is consumed:

[LoadTest("order-chaos-spike", Tier = OpsExecutionTier.Container)]
[LoadProfile(ConcurrentUsers = 50)]
[TrafficPattern(TrafficShape.Spike, PeakMultiplier = 5.0)]
[ChaosExperiment("payment-timeout-during-spike",
    Hypothesis = "Circuit breaker trips within 10s, error rate stays below 5%")]
public class OrderChaosSpikeLoad { }  // class name is illustrative

LoadTesting to Observability

During load tests, the generator enables additional metric collection:

Request rate per endpoint (matches [LoadTestEndpoint] distribution)
Error rate by status code
Resource utilization (CPU, memory, connections)

All metrics flow into the same Grafana dashboards declared in the Observability DSL. The load test summary links to the dashboard URL with the correct time range pre-populated.
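
Pre-populating the time range amounts to appending the test window to the dashboard URL. Grafana accepts from and to query parameters as epoch milliseconds. The JavaScript sketch below assumes a hypothetical dashboard URL and a hypothetical helper name dashboardLink, and pads the window by one minute on each side as an illustrative choice.

```javascript
// Hypothetical sketch: builds a dashboard link covering the test window.
// Grafana reads `from` and `to` as epoch milliseconds.
function dashboardLink(dashboardUrl, startedAt, endedAt, paddingSeconds = 60) {
  const url = new URL(dashboardUrl);
  url.searchParams.set('from', String(startedAt.getTime() - paddingSeconds * 1000));
  url.searchParams.set('to', String(endedAt.getTime() + paddingSeconds * 1000));
  return url.toString();
}

// Placeholder dashboard URL; the window here is a 300-second test run.
const started = new Date(1_700_000_000_000);
const ended = new Date(1_700_000_300_000);
console.log(dashboardLink('https://grafana.example.com/d/abc123/order-service', started, ended));
```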

LoadTesting to Cost

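Cloud-tier load tests spin up infrastructure. The Cost DSL validates that the estimated cost of 10 VMs for 30 minutes does not exceed the test budget. A Cloud-tier test with no [CostBudget] covering its infrastructure triggers LDT003, so the gap is reported at build time, before any cloud resources are provisioned. Because LDT003 is a warning, it fails the build only when warnings are treated as errors.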
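
The cost estimate itself is simple arithmetic over runner count, size, and duration. The JavaScript sketch below uses the runner sizing from the generated Terraform (10 runners, 2 vCPU and 4 GB each, 1800 seconds). The rates are placeholders, not real cloud prices, and the function names estimateCost and withinBudget are hypothetical.

```javascript
// Hypothetical cost estimate: bills vCPU-seconds and GB-seconds separately.
function estimateCost({ runners, vcpuPerRunner, memoryGbPerRunner, durationSeconds }, rates) {
  const vcpuSeconds = runners * vcpuPerRunner * durationSeconds;
  const gbSeconds = runners * memoryGbPerRunner * durationSeconds;
  return vcpuSeconds * rates.perVcpuSecond + gbSeconds * rates.perGbSecond;
}

// The budget check the build would apply before provisioning anything.
function withinBudget(estimate, budget) {
  return estimate <= budget;
}

// PLACEHOLDER rates for illustration only; substitute the provider's pricing.
const placeholderRates = { perVcpuSecond: 0.00001, perGbSecond: 0.000001 };

const estimate = estimateCost(
  { runners: 10, vcpuPerRunner: 2, memoryGbPerRunner: 4, durationSeconds: 1800 },
  placeholderRates,
);
console.log(withinBudget(estimate, 1.0) ? 'within budget' : 'over budget');
```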
