Validation: Quality Gates vs Roslyn Analyzers

Validation is the moment of truth. Everything before — requirements, specifications, implementations, tests — leads to this question: is the system correct? Both approaches answer this question, but they answer it at different times, in different ways, with different consequences.


When You Find Out

The single most important difference in validation is timing:

Spec-Driven Timeline:
                                                              
  Developer     AI Agent       Code       Tests      CI/CD    Quality
  writes PRD → reads PRD → generates → run tests → pipeline → gate
                                                              ↑
                                                              HERE
                                                              you find out

Typed Specification Timeline:

  Developer      Compiler     Developer    Compiler    Tests    CI/CD
  adds AC → fires REQ101 → adds spec → fires CS0535 → pass → pipeline
            ↑                            ↑
            HERE                         HERE
            you find out                 you find out

With spec-driven, validation happens at the end of the pipeline — after code is generated, after tests are written, after CI runs. The feedback loop is: write → generate → test → deploy → check → fix → repeat.

With typed specifications, validation happens continuously — every time the compiler runs. The feedback loop is: change type → compiler fires → fix → compiler happy. There is no "check at the end." The compiler checks at every step.

Why Timing Matters

A bug found at compile time costs seconds to fix. A bug found at the CI quality gate costs minutes. A bug found in production costs hours or days. The same defect, found at different times, has radically different costs.

The spec-driven quality gate catches defects after code generation and testing. This is late — the AI has already generated code, the developer has already reviewed it, the tests have already been written. If the quality gate fails, everything downstream must be redone.

The typed specification analyzer catches defects the moment the type changes. The developer hasn't written any code yet. The AI agent hasn't generated anything. The compiler simply says "this AC has no spec" — and the developer creates the spec before writing a single line of implementation.

This is the difference between prevention and detection. Typed specifications prevent defects by making them impossible to introduce. Spec-driven quality gates detect defects after they've been introduced.


The Spec-Driven Quality Gate System

The spec-driven framework defines quality gates at multiple pipeline stages:

Pre-Commit Gates

Quality Gate: Pre-Commit
  Checks:
    - Fast unit tests pass
    - Linting rules pass
    - No secrets in code
  Failure action: Block commit

Commit Gates

Quality Gate: Commit
  Checks:
    - Comprehensive unit tests pass
    - Integration tests pass
    - Code coverage > 80% (line), > 75% (branch), > 90% (function)
    - Coding practices validated (SOLID, DRY)
  Failure action: Block merge

Pre-Deployment Gates

Quality Gate: Pre-Deployment
  Checks:
    - E2E tests pass
    - Security scan clean
    - Performance tests within budget (p95 < 2s)
    - Mutation score > 80%
  Failure action: Block deployment

Post-Deployment Gates

Quality Gate: Post-Deployment
  Checks:
    - Smoke tests pass
    - Error rate < 1%
    - Response time within SLA
  Failure action: Rollback
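
These gate definitions map directly onto standard CI tooling. As a hedged illustration, the commit gate's coverage check might look like the following GitHub Actions job using coverlet.msbuild — the tool choice and step names are mine, not prescribed by the framework:

```yaml
# Illustrative commit-gate job (tooling choices are an assumption).
commit-gate:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Unit and integration tests with coverage
      # coverlet.msbuild fails the build when line coverage drops below
      # the threshold — which blocks the merge, the gate's failure action.
      run: >
        dotnet test
        /p:CollectCoverage=true
        /p:Threshold=80
        /p:ThresholdType=line
```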

Strengths

  1. Progressive validation. Each stage catches different defect types. Fast feedback for simple issues (lint), deeper feedback for complex issues (mutation testing), production validation for operational issues (error rate).

  2. Multi-dimensional. Quality is measured across multiple dimensions: correctness (tests), completeness (coverage), robustness (mutation), security (vulnerability scan), performance (load test), reliability (error rate).

  3. Language-agnostic. The quality gate definitions work for any language. The tools change (pytest vs jest vs cargo test), but the gate structure is universal.

  4. Industry-standard. This is how most CI/CD pipelines work. Teams adopting the spec-driven approach can use existing tooling (GitHub Actions, GitLab CI, Jenkins) without custom infrastructure.

Weaknesses

  1. Post-hoc. Every gate fires after the defect is introduced. The pre-commit gate fires after the developer writes code. The commit gate fires after the developer pushes. The pre-deployment gate fires after the change has already been merged and packaged for release. At every stage, the fix requires going back.

  2. Coarse granularity. "Code coverage > 80%" doesn't tell you which acceptance criteria are untested. "Tests pass" doesn't tell you which requirements are verified. The gates measure proxies, not the actual question ("is every requirement implemented and tested?").

  3. Gameable. A developer can reach 80% line coverage by testing easy methods thoroughly while leaving hard methods untested. The gate passes, but the important code is unverified.

  4. No structural link. The gate checks that tests exist and pass, but doesn't know which tests cover which requirements. A passing quality gate is a necessary condition for correctness, not a sufficient one.


The Roslyn Analyzer System

The typed approach defines four analyzer families that fire during compilation (for a concrete example of how these analyzer families extend to operational concerns, see Auto-Documentation from a Typed System, Part VII):

REQ1xx: Requirement Coverage

Scans the Requirements project for feature types with abstract AC methods. Scans the Specifications project for [ForRequirement] attributes. Reports features and ACs that have no specification.

error REQ100: UserRolesFeature has 3 acceptance criteria but no ISpec interface
              references it via [ForRequirement(typeof(UserRolesFeature))]

error REQ101: UserRolesFeature.RoleChangeTakesEffectImmediately has no matching
              spec method with [ForRequirement(typeof(UserRolesFeature),
              nameof(UserRolesFeature.RoleChangeTakesEffectImmediately))]

warning REQ102: AssignRoleStory has no specification (acceptable for small stories)

info REQ103: UserRolesFeature — all ACs fully specified ✓
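
To make the REQ1xx contract concrete, here is a minimal sketch of the types involved. The attribute's exact signature is an assumption inferred from the diagnostic messages above, not a published API:

```csharp
using System;

// Assumed shape of the linkage attribute.
[AttributeUsage(AttributeTargets.Interface | AttributeTargets.Class | AttributeTargets.Method)]
public sealed class ForRequirementAttribute : Attribute
{
    public ForRequirementAttribute(Type requirement, string? acceptanceCriterion = null) { }
}

// The requirement: each abstract method is one acceptance criterion.
public abstract record UserRolesFeature
{
    public abstract void AdminCanAssignRoles();
    public abstract void RoleChangeTakesEffectImmediately();
    public abstract void ViewerHasReadOnlyAccess();
}

// Clears REQ100: a spec interface references the feature.
[ForRequirement(typeof(UserRolesFeature))]
public interface IUserRolesSpec
{
    // Clears REQ101: the spec method references the specific AC by name.
    [ForRequirement(typeof(UserRolesFeature),
        nameof(UserRolesFeature.RoleChangeTakesEffectImmediately))]
    void VerifyImmediateRoleEffect();
}
```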

REQ2xx: Specification Implementation

Scans the Specifications project for interfaces with [ForRequirement]. Scans the Domain project for implementing classes. Reports unimplemented specifications.

error REQ200: IUserRolesSpec is not implemented by any class in MyApp.Domain

warning REQ201: AuthorizationService implements IUserRolesSpec but is missing
                [ForRequirement(typeof(UserRolesFeature))] on the class

warning REQ202: AuthorizationService.VerifyImmediateRoleEffect implements
                IUserRolesSpec but is missing method-level [ForRequirement]

info REQ203: IUserRolesSpec — fully implemented ✓

REQ3xx: Test Coverage

Scans the Tests project for [TestsFor] and [Verifies] attributes. Cross-references with feature types. Reports untested features and ACs.

error REQ300: JwtRefreshStory has 2 ACs but no test class with [TestsFor]

warning REQ301: UserRolesFeature.ViewerHasReadOnlyAccess has no test with [Verifies]

warning REQ302: OldTests.StaleTest references nameof(UserRolesFeature.DeletedAC)
                which no longer exists

info REQ303: UserRolesFeature — all ACs fully tested ✓
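
A sketch of the test side, with the same caveat — the [TestsFor] and [Verifies] signatures are inferred from the messages above:

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public sealed class TestsForAttribute : Attribute
{
    public TestsForAttribute(Type feature) { }
}

[AttributeUsage(AttributeTargets.Method)]
public sealed class VerifiesAttribute : Attribute
{
    public VerifiesAttribute(Type feature, string acceptanceCriterion) { }
}

public abstract record UserRolesFeature
{
    public abstract void ViewerHasReadOnlyAccess();
}

// Clears REQ300 for the feature; each [Verifies] method clears REQ301
// for its AC. Because the AC name goes through nameof, deleting the AC
// turns this test into a compile error rather than a silently green run.
[TestsFor(typeof(UserRolesFeature))]
public class UserRolesTests
{
    [Verifies(typeof(UserRolesFeature), nameof(UserRolesFeature.ViewerHasReadOnlyAccess))]
    public void Viewer_cannot_modify_resources()
    {
        // arrange / act / assert against the real implementation
    }
}
```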

REQ4xx: Quality Gates (Post-Test)

Integrates with MSBuild to validate test execution results:

error REQ400: UserRolesFeature test pass rate is 87% (minimum: 100%)

warning REQ401: PasswordResetFeature AC coverage is 75% (threshold: 80%)

warning REQ402: OrderProcessingTests.LargeOrderTest took 12.3s (budget: 5s)

info REQ403: UserRolesFeature — all quality gates passed ✓

Strengths

  1. Pre-hoc. REQ1xx, REQ2xx, and REQ3xx fire during compilation — before any code runs. The developer sees the diagnostic in the IDE, in real time, as they type. The fix is immediate: add the missing spec, implement the method, write the test.

  2. Specific. Each diagnostic names the exact feature, the exact AC, and the exact action needed. Not "coverage is low" but "this specific method on this specific feature has no test."

  3. Ungameable. You can't satisfy REQ301 by testing other ACs. The diagnostic is per-AC. Either ViewerHasReadOnlyAccess has a [Verifies] test, or it doesn't. No amount of testing other methods makes this diagnostic go away.

  4. Configurable severity. Each diagnostic can be configured via .editorconfig:

# Strict mode: all diagnostics are errors
dotnet_diagnostic.REQ100.severity = error
dotnet_diagnostic.REQ101.severity = error
dotnet_diagnostic.REQ200.severity = error
dotnet_diagnostic.REQ300.severity = error
dotnet_diagnostic.REQ301.severity = error

# Relaxed mode: some diagnostics are warnings
dotnet_diagnostic.REQ102.severity = suggestion
dotnet_diagnostic.REQ202.severity = suggestion

  5. IDE integration. Diagnostics appear as squiggly underlines in the IDE. Hover shows the message. Ctrl+. offers code fixes. The experience is identical to built-in C# diagnostics — no separate tool to run.

Weaknesses

  1. C# only. Roslyn analyzers are a .NET technology. If your codebase includes Python microservices, Go workers, or React frontends, those components can't benefit from REQ1xx-REQ4xx.

  2. Requirement coverage only. The analyzers track requirement-to-test linkage, but they don't measure line coverage, branch coverage, mutation scores, or performance. You still need traditional coverage tools for those dimensions.

  3. Setup cost. Writing Roslyn analyzers is non-trivial. Each analyzer family requires understanding the Roslyn API, syntax trees, semantic models, and diagnostic reporting. This is a significant investment.

  4. False confidence — solvable. A [Verifies] test that passes but tests the wrong thing satisfies the basic REQ3xx analyzer. But this weakness has a three-layer solution: (1) make ACs executable static methods that tests must call directly, (2) add a REQ305 analyzer that checks the test body invokes the referenced AC method, (3) use mutation testing via [MutationTarget] to verify the test actually kills mutants in the AC. See Part VIII: The AI Agent Experience for the full treatment. The key insight: the AC is not just a name — it's an executable method. The test must call it. The analyzer verifies the call. Mutation testing verifies the assertion.
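
A minimal sketch of layer (1), the executable-AC pattern. The service interface, class names, and test body here are illustrative assumptions, not the framework's actual types:

```csharp
using System;

public interface IAuthorizationService
{
    bool CanWrite(string role);
}

public static class UserRolesAcceptance
{
    // The AC is an executable method carrying its own assertion.
    // A [Verifies] test must call it (checked by the hypothetical REQ305
    // analyzer), so a test cannot pass while asserting the wrong thing.
    public static void ViewerHasReadOnlyAccess(IAuthorizationService auth)
    {
        if (auth.CanWrite("viewer"))
            throw new InvalidOperationException("Viewer must be read-only.");
    }
}

public class UserRolesTests
{
    private sealed class FakeAuth : IAuthorizationService
    {
        public bool CanWrite(string role) => role != "viewer"; // illustrative stub
    }

    public void Viewer_has_read_only_access()
    {
        // The test invokes the AC directly; the analyzer verifies this call.
        UserRolesAcceptance.ViewerHasReadOnlyAccess(new FakeAuth());
    }
}
```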


The Validation Timeline Compared

Let's trace a defect through both systems:

Defect: Missing implementation for an acceptance criterion

Spec-driven:

Day 1: PRD updated with new AC
Day 2: AI agent reads PRD, generates code — misses the new AC
Day 3: Developer reviews generated code — doesn't notice the gap
Day 4: Tests written for existing ACs — new AC not tested
Day 5: CI pipeline runs — all tests pass, coverage at 82%
Day 5: Quality gate passes — nothing flags the missing AC
Day 15: QA manually tests the feature — discovers the gap
Day 16: Developer implements the missing AC
Day 17: New tests written and merged

Total time from defect introduction to fix: 15 days.

Typed specifications:

Minute 0: Developer adds new AC to feature record
Minute 0: Compiler fires REQ101 — "no spec method for this AC"
Minute 5: Developer adds spec method to interface
Minute 5: Compiler fires CS0535 — "class doesn't implement interface method"
Minute 15: Developer implements the method
Minute 15: Compiler fires REQ301 — "no test for this AC"
Minute 30: Developer writes test
Minute 30: Build succeeds — all diagnostics clear

Total time from defect introduction to fix: 30 minutes.

The difference is not incremental. It's categorical. The defect never existed in the typed approach — the compiler prevented it from being introduced.


Can They Coexist?

Yes. And arguably they should.

The spec-driven quality gates cover dimensions that Roslyn analyzers don't: line coverage, mutation testing, load testing, security scanning, performance budgets. The Roslyn analyzers cover the dimension that quality gates don't: per-AC requirement coverage.

A combined validation pipeline:

Stage 1: Compile-Time (Roslyn Analyzers)
  ├── REQ1xx: Every requirement has a specification
  ├── REQ2xx: Every specification has an implementation
  └── REQ3xx: Every AC has a test

Stage 2: Test Execution
  ├── All tests pass
  └── REQ4xx: Pass rate and duration budgets

Stage 3: Post-Test Quality Gates (Spec-Driven)
  ├── Line coverage > 80%
  ├── Branch coverage > 75%
  ├── Mutation score > 80%
  ├── Security scan clean
  └── Performance within SLA

Stage 4: Pre-Deployment
  ├── E2E tests pass
  └── Load test within budget

This gives you the best of both: compile-time requirement enforcement AND post-test multi-dimensional quality measurement. The analyzers prevent structural defects (missing implementations, missing tests). The quality gates catch qualitative defects (weak tests, poor coverage, slow performance).


The Severity Configuration Question

Both approaches allow configuring validation strictness. The spec-driven approach does it through threshold values:

coverage_threshold: 80      # Adjustable per project
mutation_score_minimum: 80  # Adjustable per team
flakiness_rate_maximum: 0.01

The typed approach does it through .editorconfig severity levels:

# Per-project severity configuration
[*.cs]
dotnet_diagnostic.REQ100.severity = error      # Missing spec: fail build
dotnet_diagnostic.REQ301.severity = warning    # Missing test: warn only
dotnet_diagnostic.REQ102.severity = suggestion # Small story no spec: hint

The key difference: spec-driven configures thresholds (continuous values), typed specifications configure severities (discrete levels). A threshold says "80% is enough." A severity says "this diagnostic is an error / warning / suggestion / none." There's no "80% of ACs need specs" in the typed approach — either an AC has a spec (diagnostic clear) or it doesn't (diagnostic fires).

This reflects a deeper philosophical difference. Spec-driven says: "some imperfection is acceptable — 80% coverage is fine." Typed specifications say: "every individual gap is reported — you decide which ones to fix by configuring severity."


The Auto-Fix Difference

The spec-driven Testing-as-Code specification defines "Auto Fix Options" for each practice:

DEFINE_PRACTICE(test_organization)
  Violation Patterns:
    - unclear_test_names
    - mixed_responsibilities
    - poor_grouping
  Auto Fix Options:
    - RenameTests
    - GroupByFunction
    - ClarifyPurpose

These are aspirational: the spec describes what an auto-fix would do, but the framework doesn't implement them. They're suggestions for tooling that could be built.

The typed approach provides Roslyn code fixes — actual IDE integration that modifies code:

// Analyzer detects: REQ301 on UserRolesFeature.AdminCanRevokeRoles
// Code fix offers: "Generate test stub for AdminCanRevokeRoles"

// Clicking the code fix inserts:
[Verifies(typeof(UserRolesFeature), nameof(UserRolesFeature.AdminCanRevokeRoles))]
public void Admin_can_revoke_roles()
{
    // TODO: implement test
    throw new NotImplementedException();
}

The code fix is real, executable, and integrated into the IDE. It's not a suggestion in a document — it's a button the developer clicks. This reduces the friction between "diagnostic fires" and "fix applied" to a single keystroke.
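
For readers curious what backs that button: a skeleton of a Roslyn code-fix provider for REQ301, using the standard Microsoft.CodeAnalysis.CodeFixes API. The stub-generation body is elided; only the registration plumbing is shown:

```csharp
using System.Collections.Immutable;
using System.Composition;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeFixes;

[ExportCodeFixProvider(LanguageNames.CSharp, Name = nameof(GenerateTestStubFix)), Shared]
public sealed class GenerateTestStubFix : CodeFixProvider
{
    public override ImmutableArray<string> FixableDiagnosticIds =>
        ImmutableArray.Create("REQ301");

    public override FixAllProvider GetFixAllProvider() =>
        WellKnownFixAllProviders.BatchFixer;

    public override Task RegisterCodeFixesAsync(CodeFixContext context)
    {
        foreach (var diagnostic in context.Diagnostics)
        {
            // This is the entry the developer sees under Ctrl+. in the IDE.
            context.RegisterCodeFix(
                CodeAction.Create(
                    title: "Generate test stub for this acceptance criterion",
                    createChangedDocument: ct =>
                        AddTestStubAsync(context.Document, diagnostic, ct),
                    equivalenceKey: "GenerateTestStub"),
                diagnostic);
        }
        return Task.CompletedTask;
    }

    private static Task<Document> AddTestStubAsync(
        Document document, Diagnostic diagnostic, CancellationToken ct)
    {
        // Elided: use the syntax API to synthesize the [Verifies] method
        // and insert it into the matching [TestsFor] class.
        return Task.FromResult(document);
    }
}
```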


Summary

Dimension     | Spec-Driven Quality Gates                      | Roslyn Analyzers
--------------|------------------------------------------------|-----------------------------------------------
Timing        | Post-hoc (after code generation, after tests)  | Pre-hoc (during compilation)
Granularity   | Threshold-based (80% coverage)                 | Per-diagnostic (specific AC, specific feature)
Gameability   | Gameable (test easy methods to hit threshold)  | Not gameable (per-AC enforcement)
Dimensions    | Multi (line, branch, mutation, security, perf) | Single (requirement coverage)
Language      | Any (language-agnostic gates)                  | C# only (Roslyn)
Tooling       | Standard CI/CD                                 | Custom analyzers + code fixes
Auto-fix      | Aspirational (described, not implemented)      | Real (IDE code fix actions)
Configuration | Thresholds (continuous)                        | Severities (discrete)
Best for      | Multi-dimensional quality measurement          | Requirement-to-test enforcement
Cost          | Low (standard tooling)                         | High (custom analyzer development)

Part VII examines how each approach handles documentation — another domain where the philosophical gap produces visible consequences.


Compilation as Continuous Integration

Here's a provocative framing: in a typed specification system, every keystroke is a CI run.

The traditional CI model works like this: a developer writes code locally, pushes to a remote branch, CI picks up the change, runs builds, runs tests, runs quality gates, and reports back in 5-30 minutes. The developer context-switches to another task. When CI fails, they must reload the mental context of the original change.

The Roslyn analyzer model works differently. The compiler runs continuously in the IDE. Every time you type a character, the analyzer re-evaluates. Diagnostics appear in real time — red squiggles, warning icons, info messages. There is no push, no wait, no context switch.

The Timeline Comparison

Consider a developer adding a new acceptance criterion to OrderProcessingFeature:

Spec-Driven Timeline (with CI quality gates):

 0:00  Developer adds AC text to PRD document
 0:02  Developer saves file
 0:03  Developer starts writing implementation code
 0:25  Developer writes tests
 0:35  Developer pushes to branch
 0:36  CI pipeline starts
 0:37  ├── Checkout + restore packages (30s)
 0:38  ├── Build (45s)
 0:39  ├── Unit tests (60s)
 0:40  ├── Integration tests (120s)
 0:42  ├── Coverage analysis (30s)
 0:43  ├── Quality gate evaluation (15s)
 0:43  CI reports: "Coverage dropped to 74% (threshold: 80%)"
       └── But WHICH AC is untested? Unknown. The gate says "coverage is low."
 0:43  Developer must figure out what's missing
 0:50  Developer adds missing test
 0:55  Developer pushes again
 1:00  CI passes
       ───────────────────────────────────────────
       Total feedback cycle: ~60 minutes
       Number of context switches: 2 (push → wait → return)
       Specificity of feedback: Low ("coverage is 74%")


Typed Specification Timeline (with Roslyn analyzers):

 0:00  Developer adds abstract AC method to feature record
 0:00  IDE shows: REQ101 — "no spec method for OrderCanBeCancelledByCustomer"
       ↑ instant feedback, in the same editor, at the cursor position
 0:03  Developer adds spec method to interface
 0:03  IDE shows: CS0535 — "OrderService does not implement ICancelSpec.CancelOrder"
 0:08  Developer implements the method
 0:08  IDE shows: REQ301 — "no test with [Verifies] for OrderCanBeCancelledByCustomer"
 0:15  Developer writes test
 0:15  IDE shows: all diagnostics clear ✓
 0:16  Developer pushes (CI runs for additional validation: mutation, load, security)
       ───────────────────────────────────────────
       Total feedback cycle: ~16 minutes
       Number of context switches: 0 (never left the IDE)
       Specificity of feedback: Maximum (exact AC, exact action needed)

The time difference (60 minutes vs 16 minutes) is significant, but the real difference is cognitive load. The spec-driven developer must maintain a mental model of what they've done, push, wait, interpret a coarse report, map the report back to specific changes, and fix. The typed specification developer never leaves the editor — the compiler guides them through the chain step by step.

Validation Frequency

Validation events per hour:

Spec-Driven (CI quality gates):
  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░█
  0                                                    1
  (one validation per push, ~1-2 pushes per hour)

Typed Specifications (Roslyn analyzers):
  █████████████████████████████████████████████████████
  0         100         200         300        ~360+
  (one validation pass per pause in typing — hundreds per hour)

This isn't an incremental improvement in validation frequency. It's a categorical difference. The spec-driven approach validates 1-2 times per hour. The typed approach validates hundreds of times per hour. Every single one of those validations catches the same classes of defects — missing specs, missing implementations, missing tests, stale references.

The "Shift Left" Taken to Its Logical Conclusion

The software industry talks about "shift left" — catch defects earlier. CI was a shift left from manual QA. Pre-commit hooks were a shift left from CI. The Roslyn analyzer approach shifts all the way left: to the moment the developer types the code.

There is no earlier point at which validation can happen. The code doesn't exist before the developer types it. The instant it exists — even as an incomplete line in the editor — the analyzer evaluates it. This is the theoretical limit of shift-left.

Shift-Left Progression:

Manual QA ──→ CI Pipeline ──→ Pre-Commit Hook ──→ IDE Analyzer
(days)        (minutes)       (seconds)           (milliseconds)
                                                       ↑
                                                  You are here
                                                  (typed specs)

When CI Still Matters

This doesn't make CI obsolete. Roslyn analyzers check structural correctness — the requirement chain. CI quality gates check dimensions that analyzers cannot:

  • Mutation testing — does the test actually verify the behavior, or does it pass trivially?
  • Load testing — does the endpoint handle 1,000 concurrent requests?
  • Security scanning — are there known vulnerabilities in dependencies?
  • Integration testing — do the services actually communicate correctly?
  • Performance budgets — does the p95 response time stay under 200ms?

The typed approach handles Stage 1 (structural correctness) in the IDE. CI handles Stages 2-4 (behavioral, performance, security). Both are necessary. But the structural defects — the ones that account for the majority of "why didn't we catch this sooner?" moments — are eliminated before the developer even saves the file.


Validation Depth: What Each Approach Can and Cannot Catch

Not all defects are created equal. Some are structural (wrong types, missing methods). Some are semantic (wrong logic, incorrect behavior). Some are operational (performance degradation, resource leaks). Each approach catches different defect types at different stages.

The Comprehensive Defect Matrix

Specific Defect                          | Spec-Driven (When?)           | Typed Specs (When?)

Structural
  Missing implementation for AC          | CI quality gate (post-test)   | Compile time (REQ101)
  Missing test for AC                    | CI coverage check (post-test) | Compile time (REQ301)
  Stale test referencing deleted AC      | Never (test still passes)     | Compile time (nameof fails)
  Wrong method signature on spec         | Never (no contract)           | Compile time (CS0535)
  Feature with no parent epic            | Never (no hierarchy check)    | Compile time (generic constraint)
  Orphan implementation (no requirement) | Never (no link)               | Traceability report (build time)
  Inconsistent cross-service types       | Never (string-based IDs)      | Compile time (shared value types)

Semantic
  Implementation logic is incorrect      | Test failure (CI)             | Test failure (CI)
  AC implemented but behavior is wrong   | Test failure or QA            | Test failure or QA
  Edge case not covered by tests         | Mutation testing (CI)         | Mutation testing (CI)
  Race condition in concurrent code      | Load test or production       | Load test or production
  Business rule interpreted incorrectly  | QA or production              | QA or production

Performance
  Endpoint exceeds latency budget        | Load test (CI)                | Load test (CI)
  Memory leak under sustained load       | Load test (CI)                | Load test (CI)
  N+1 query pattern                      | Code review or APM            | Code review or APM
  Missing database index                 | Load test or production       | Load test or production
  Connection pool exhaustion             | Load test or production       | Load test or production

Security
  SQL injection vulnerability            | SAST scan (CI)                | SAST scan (CI)
  Missing authorization check            | Security test (CI)            | Security test (CI) + analyzer*
  Exposed sensitive data in logs         | Code review or SAST           | Code review or SAST
  Dependency with known CVE              | Dependency scan (CI)          | Dependency scan (CI)
  Missing input validation               | Test or penetration test      | Test or penetration test

Operational
  Missing health check endpoint          | Deployment test               | Deployment test
  Incorrect connection string            | Integration test (CI)         | Integration test (CI)
  Missing retry policy on HTTP calls     | Code review                   | Analyzer* (if custom rule exists)
  Log level too verbose for production   | Code review                   | Analyzer* (if custom rule exists)
  Missing circuit breaker                | Code review                   | Code review

*Asterisked items indicate that custom Roslyn analyzers CAN catch these defects, but they require additional investment beyond the Requirements DSL. The spec-driven approach describes these checks in documents; the typed approach can implement them as analyzers — but the implementation is additional work.

Reading the Matrix

Three patterns emerge:

Pattern 1: Structural defects are the typed approach's stronghold. Every structural defect in the table is caught at compile time by the typed approach and caught late (or never) by the spec-driven approach. This is the core value proposition of typed specifications: the defect category that causes the most rework is eliminated at the earliest possible moment.

Pattern 2: Semantic, performance, and security defects are caught identically. Both approaches rely on the same tools (tests, load tests, SAST scanners, code review) for non-structural defects. The typed approach doesn't help you find N+1 queries or SQL injection — those are runtime concerns that no type system catches.

Pattern 3: The spec-driven approach describes all defect categories; the typed approach catches one category. The Testing-as-Code specification has sections on mutation testing, chaos engineering, fuzz testing, and security scanning. The typed approach enforces requirement coverage. For a team that needs guidance on WHAT to test, the spec-driven approach is more helpful. For a team that needs enforcement on WHETHER they tested, the typed approach is more helpful.

The "Never" Column

The most striking entries in the matrix are the ones where the spec-driven approach says "Never": stale tests, orphan implementations, inconsistent cross-service types. These are defects that the spec-driven approach cannot catch because the connection between documents and code is not structural. No CI quality gate can detect that a test references a deleted AC — because the quality gate doesn't know which tests correspond to which ACs. It knows coverage percentages, not coverage targets.

The typed approach catches these at compile time — not because of some clever trick, but because nameof(DeletedAC) is a compile error when DeletedAC no longer exists. The defect literally cannot be introduced. It's not caught — it's prevented.
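
The mechanism is ordinary C# name resolution, nothing framework-specific. A minimal sketch ([Verifies] is the hypothetical linkage attribute from the earlier sections):

```csharp
using System;

[AttributeUsage(AttributeTargets.Method)]
public sealed class VerifiesAttribute : Attribute
{
    public VerifiesAttribute(Type feature, string acceptanceCriterion) { }
}

public abstract record UserRolesFeature
{
    public abstract void ViewerHasReadOnlyAccess();
    // public abstract void DeletedAC();   // removed in a refactor
}

public class StaleTests
{
    // error CS0117: 'UserRolesFeature' does not contain a definition for 'DeletedAC'
    // The stale test cannot compile, so it cannot keep passing silently.
    [Verifies(typeof(UserRolesFeature), nameof(UserRolesFeature.DeletedAC))]
    public void Old_test() { }
}
```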

The Pragmatic View

A team adopting the typed approach should not abandon CI quality gates. The correct strategy is layered:

Layer 1: IDE (Roslyn Analyzers)     → Structural defects          → Milliseconds
Layer 2: Pre-commit (fast tests)    → Basic semantic defects      → Seconds
Layer 3: CI (full test suite)       → Semantic + edge cases       → Minutes
Layer 4: CI (mutation testing)      → Test quality                → Minutes
Layer 5: CI (security scanning)     → Security defects            → Minutes
Layer 6: CI (load testing)          → Performance defects         → Minutes
Layer 7: Staging (E2E + smoke)      → Integration defects         → Hours
Layer 8: Production (monitoring)    → Operational defects         → Ongoing

The typed approach replaces Layer 1 with something radically better than what existed before (nothing, or linting). It doesn't replace Layers 2-8. The spec-driven approach describes all eight layers in documents — valuable guidance. But describing a layer and implementing a layer are different things. The typed approach implements Layer 1; the spec-driven approach describes all layers but implements none.

But here's the trajectory: every layer can become a typed DSL. Layer 4 (mutation testing) can be a [MutationTarget] attribute. Layer 5 (security scanning) can be a [SecurityPolicy] attribute. Layer 6 (load testing) can be a [LoadTest] attribute. Layer 7 (E2E) can be a [UserJourney] attribute. Layer 8 (monitoring) can be an [Alert] attribute.

When fully realized, the eight layers look like this:

Layer 1: IDE (Roslyn Analyzers)     → [ForRequirement] + REQ1xx-REQ4xx   → Typed ✓
Layer 2: Pre-commit (fast tests)    → [Verifies] + [TestsFor]            → Typed ✓
Layer 3: CI (full test suite)       → [Verifies] + test runner            → Typed ✓
Layer 4: CI (mutation testing)      → [MutationTarget(typeof(Feature))]   → Typed ✓
Layer 5: CI (security scanning)     → [SecurityPolicy(typeof(Endpoint))]  → Typed ✓
Layer 6: CI (load testing)          → [LoadTest] + [PerformanceBudget]    → Typed ✓
Layer 7: Staging (E2E + smoke)      → [UserJourney(typeof(Feature))]      → Typed ✓
Layer 8: Production (monitoring)    → [Alert] + [HealthCheck]             → Typed ✓

The spec-driven approach describes all eight layers in text. The typed approach can implement all eight layers in the compiler. Not today — not all these DSLs exist yet. But the architecture supports it. Each DSL is a [MetaConcept] that self-registers in the M3 metamodel. The framework is extensible by design.
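
Concretely, registering a new concern might look like this. Every name below is hypothetical — none of these DSLs are claimed to exist; the point is the shape, not an API:

```csharp
using System;

// Hypothetical M3 registration attribute.
[AttributeUsage(AttributeTargets.Class)]
public sealed class MetaConceptAttribute : Attribute
{
    public string Layer { get; set; } = "";
}

// A new operational DSL: one attribute set plus one analyzer, reusing
// the same framework that enforces the requirement chain.
[MetaConcept(Layer = "Operations")]
[AttributeUsage(AttributeTargets.Method)]
public sealed class LoadTestAttribute : Attribute
{
    public LoadTestAttribute(Type feature) { Feature = feature; }
    public Type Feature { get; }
    public int ConcurrentUsers { get; set; } = 100;
    public int P95BudgetMs { get; set; } = 200;
}
```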

This is the deeper argument: the typed approach is not limited to the requirement chain. The requirement chain is where it started, but the M3 meta-metamodel is a general-purpose framework for turning any domain concern into a compiler-enforced DSL. Requirements, operations, testing, security, performance — they're all domains. They're all candidates for DSLs. They all benefit from the same properties: compile-time validation, source generation, IDE integration, drift prevention.

The spec-driven approach says "here are eight important concerns, described in documents." The typed approach says "here are eight important concerns, each expressible as a compiler-enforced DSL." One is a description. The other is a design principle.


The Enforcement Asymmetry

There's a fundamental asymmetry between the two approaches that becomes clear when you look at enforcement holistically:

Spec-driven enforcement is additive. Every new concern requires adding a new enforcement mechanism: a new CI check, a new quality gate, a new coverage tool, a new linting rule. The enforcement surface area grows linearly with the number of concerns. Each mechanism is independent — the coverage tool doesn't know about the security scanner, which doesn't know about the load test, which doesn't know about the mutation test.

Spec-driven enforcement:

Concern 1: Feature coverage    → CI tool A (coverage.py/coverlet)
Concern 2: Code quality        → CI tool B (sonarqube/codeclimate)
Concern 3: Security            → CI tool C (snyk/semgrep)
Concern 4: Performance         → CI tool D (k6/locust)
Concern 5: Mutation testing    → CI tool E (stryker/mutmut)
Concern 6: Documentation       → CI tool F (custom script)
Concern 7: Architecture        → CI tool G (netarchtest/archunit)
Concern 8: Dependency health   → CI tool H (dependabot/renovate)

Total: 8 independent tools, 8 configurations, 8 maintenance burdens
Cross-concern validation: None (tool A doesn't know about tool C)

Typed specification enforcement is structural. Every concern is a DSL — an attribute set processed by a source generator and validated by an analyzer. All DSLs share the M3 metamodel, the same five-stage pipeline, and the same IDE integration. Adding a new concern means adding one attribute library and one generator — not a new CI tool with its own configuration format, its own output format, and its own failure modes.

Typed specification enforcement:

Concern 1: Feature coverage    → [ForRequirement] + REQ1xx analyzer
Concern 2: Code quality        → [Invariant] + [Validated] + analyzers
Concern 3: Security            → [SecurityPolicy] + SEC1xx analyzer
Concern 4: Performance         → [PerformanceBudget] + PERF1xx analyzer
Concern 5: Mutation testing    → [MutationTarget] + MUT1xx analyzer
Concern 6: Documentation       → Type system IS documentation
Concern 7: Architecture        → [Layer] + ARCH1xx analyzer
Concern 8: Dependency health   → [Dependency] + DEP1xx analyzer

Total: 1 framework (M3 + Roslyn), N attribute sets, N analyzers
Cross-concern validation: Built-in (all concerns share the type system)
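
To make "one attribute library" concrete, here is a minimal sketch of what the attribute surface for the architecture concern could look like. All names here ([Layer], AppLayer, OrderService) are illustrative, not from a real library, and the ARCH1xx analyzer and generator that would consume them are not shown:

```csharp
using System;

// Illustrative only: a minimal attribute library for the architecture concern.
// A companion ARCH1xx analyzer would read these annotations during compilation.
public enum AppLayer { Domain, Application, Infrastructure, Api }

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Assembly)]
public sealed class LayerAttribute : Attribute
{
    public AppLayer Layer { get; }
    public LayerAttribute(AppLayer layer) => Layer = layer;
}

// Annotating a class places it in the model; ARCH100 could then flag any
// Domain-layer type that references an Infrastructure-layer type.
[Layer(AppLayer.Domain)]
public class OrderService { }
```

The point is the size: the "tool" for a new concern is a handful of declarations plus one analyzer family, all riding on the existing M3 infrastructure.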

The cross-concern validation is the key. In the typed approach, a [PerformanceBudget] can reference a Feature via typeof(). A [SecurityPolicy] can reference an endpoint that's annotated with [ForRequirement]. A [LoadTest] can reference the same Feature that the [Verifies] tests reference. All concerns are connected through the type system.
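
A hedged sketch of what such a cross-concern link looks like in code (the attribute names mirror the article's examples, but the exact shapes are illustrative, not from a real library):

```csharp
using System;

// Illustrative only: two concerns connected through the type system.
public class OrderProcessingFeature { }   // the Feature type both DSLs reference

[AttributeUsage(AttributeTargets.Class)]
public sealed class PerformanceBudgetAttribute : Attribute
{
    public Type Feature { get; }
    public int P95Ms { get; }

    public PerformanceBudgetAttribute(Type feature, int p95Ms)
    {
        Feature = feature;
        P95Ms = p95Ms;
    }
}

// Because the link is typeof(OrderProcessingFeature), renaming or deleting the
// feature turns this line into a compile error. The reference cannot go stale.
[PerformanceBudget(typeof(OrderProcessingFeature), 200)]   // P95 budget: 200ms
public class CheckoutEndpoint { }
```

A text document can only say "the checkout endpoint relates to order processing"; the typeof() reference makes the compiler check that relation on every build.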

In the spec-driven approach, the performance-targets document and the testing document and the security document are separate text files. The connection between "this endpoint needs < 200ms response time" and "this endpoint implements Feature X" and "this endpoint must pass security scan Y" exists only in the reader's head. No tool validates the cross-references.

The Cost Implication

This asymmetry has a cost implication at scale:

Enforcement cost as concerns grow:

Cost │          Spec-Driven (linear)
     │         /
     │        /
     │       / ← Each new concern = new CI tool + config + maintenance
     │      /
     │     /
     │    /        Typed Specifications (sublinear)
     │   /             ____________________________
     │  /       ______/ ← Each new concern = attribute set + generator
     │ /    ___/          (reuses existing M3 framework)
     │/____/
     ├──────────────────────────────────
     2    4    6    8    10   12  concerns

The spec-driven approach pays a roughly linear cost per concern: each new CI tool has its own learning curve, configuration format, and maintenance burden. The typed approach pays a sublinear cost: the first DSL is expensive (build the M3 framework), but each subsequent DSL reuses the same infrastructure. By the 5th or 6th DSL, the marginal cost of adding a new typed concern is a fraction of adding a new CI tool.

This is why the typed approach's initial investment — which looks expensive on Day 1 — pays off over time. It's not just about the requirement chain. It's about building a platform for compiler-enforced concerns that scales sublinearly.


The Validation Unification

The spec-driven approach validates eight concerns with eight separate tools. Each tool has its own configuration format, its own output format, its own failure modes, and its own integration requirements. Let's count what a typical CI pipeline looks like:

Spec-Driven Validation Pipeline (8 tools, 8 configs, 8 outputs):

Tool 1: coverlet          → coverage.cobertura.xml   → config: coverlet.runsettings
Tool 2: sonarqube         → sonar-report.json        → config: sonar-project.properties
Tool 3: snyk              → snyk-report.json         → config: .snyk
Tool 4: k6                → k6-results.json          → config: load-test.js
Tool 5: stryker           → mutation-report.html      → config: stryker-config.json
Tool 6: custom script     → doc-coverage.txt         → config: doc-check.yaml
Tool 7: netarchtest       → arch-test-results.xml    → config: ArchTests.cs
Tool 8: dependabot        → dependabot-alerts.json   → config: .github/dependabot.yml

Total configuration files: 8
Total output formats: 4 (XML, JSON, HTML, TXT)
Total dashboards needed: 8 (or 1 aggregator like SonarQube that partially unifies)
Total failure modes: 8 independent (tool 3 can fail while tools 1-2, 4-8 succeed)
Cross-tool validation: None

Each tool is an island. The coverage tool doesn't know which features are security-critical. The security scanner doesn't know which endpoints have performance budgets. The architecture enforcement tool doesn't know which layers contain features with compliance requirements. The mutation testing tool doesn't know which features have acceptance criteria.

The typed approach unifies all validation concerns into a single analyzer framework. Every concern is a Roslyn analyzer family with a consistent diagnostic ID scheme. (The Auto-Documentation from a Typed System series demonstrates this unification across five operational sub-DSLs, all sharing the same analyzer infrastructure.)

The Unified Analyzer Taxonomy

REQ1xx: Requirement Coverage
  REQ100  Feature has no specification interface          Error
  REQ101  AC has no matching spec method                  Error
  REQ102  Story has no specification (small story)        Warning
  REQ103  Feature fully specified                         Info

REQ2xx: Specification Implementation
  REQ200  Spec interface has no implementing class        Error
  REQ201  Implementation missing [ForRequirement]         Warning
  REQ202  Implementation missing method-level attribute   Warning
  REQ203  Spec fully implemented                          Info

REQ3xx: Test Coverage
  REQ300  Feature has no [TestsFor] test class            Error
  REQ301  AC has no [Verifies] test                       Warning
  REQ302  [Verifies] references nonexistent AC (stale)    Warning
  REQ303  Feature fully tested                            Info

REQ4xx: Quality Gates
  REQ400  Feature test pass rate below minimum            Error
  REQ401  Feature AC coverage below threshold             Warning
  REQ402  Test duration exceeds budget                    Warning
  REQ403  Feature passes all quality gates                Info

PERF1xx: Performance Budgets
  PERF100  Feature has [PerformanceTest] — configured     Info
  PERF101  Feature has no [PerformanceTest]               Warning
  PERF102  Endpoint has no matching performance budget    Warning
  PERF103  P95/P99 threshold parse error                  Error
  PERF104  Performance budget references deleted feature  Error
  PERF105  Performance test without functional tests      Warning

SEC1xx: Security Policies
  SEC100  Endpoint missing [Authorize] attribute          Error
  SEC101  Controller missing CORS policy                  Warning
  SEC102  Action accepts user input without validation    Warning
  SEC103  Sensitive data type used in API response        Error
  SEC104  Security policy references deleted endpoint     Error
  SEC105  All security policies satisfied                 Info

ARCH1xx: Architecture Constraints
  ARCH100  Domain layer references infrastructure         Error
  ARCH101  API layer references domain directly           Warning
  ARCH102  Circular dependency between assemblies         Error
  ARCH103  Feature implementation in wrong layer          Warning
  ARCH104  Generated code manually modified               Error
  ARCH105  Architecture constraints satisfied             Info

OPS1xx: Operational Readiness
  OPS100  Service missing health check endpoint           Error
  OPS101  Service missing readiness probe                 Warning
  OPS102  Alert definition references unknown metric      Error
  OPS103  Deployment strategy missing rollback condition  Warning
  OPS104  Chaos experiment targets deleted service        Error
  OPS105  All operational readiness checks passed         Info
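
The taxonomy above is implemented as Roslyn analyzers, but the core of a check like REQ302 (a [Verifies] attribute pointing at a deleted AC) is just a cross-reference walk over the type system. Here is a simplified, reflection-based model of that walk; real analyzers inspect syntax trees and symbols rather than runtime types, and every name in this sketch is illustrative:

```csharp
using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Method)]
public sealed class VerifiesAttribute : Attribute
{
    public Type Feature { get; }
    public string AcceptanceCriterion { get; }
    public VerifiesAttribute(Type feature, string ac)
    { Feature = feature; AcceptanceCriterion = ac; }
}

// Toy model: a feature's ACs are its public method names.
public class PasswordResetFeature
{
    public void ResetLinkExpiresAfterOneHour() { }
    public void ResetLinkCanOnlyBeUsedOnce() { }
}

public class PasswordResetTests
{
    [Verifies(typeof(PasswordResetFeature), nameof(PasswordResetFeature.ResetLinkExpiresAfterOneHour))]
    public void Expiry_test() { }

    [Verifies(typeof(PasswordResetFeature), "ResetLinkNeverExpires")]  // stale: AC no longer exists
    public void Stale_test() { }
}

public static class Req302Check
{
    // Returns the names of test methods whose [Verifies] points at a
    // nonexistent AC -- the REQ302 "stale test" diagnostic.
    public static string[] StaleTests(Type testClass) =>
        testClass.GetMethods()
            .Select(m => (m, v: m.GetCustomAttribute<VerifiesAttribute>()))
            .Where(t => t.v != null &&
                        t.v.Feature.GetMethod(t.v.AcceptanceCriterion) == null)
            .Select(t => t.m.Name)
            .ToArray();
}
```

Every family in the taxonomy is a variation of this pattern: follow typed references, report the ones that dangle.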

The Unified Configuration

All eight concern families are configured in a single .editorconfig file. No separate configuration files. No separate tools. No separate output formats:

# .editorconfig — ALL validation concerns, one file

[*.cs]

# ── Requirement Coverage ──────────────────────────────────
dotnet_diagnostic.REQ100.severity = error
dotnet_diagnostic.REQ101.severity = error
dotnet_diagnostic.REQ102.severity = suggestion
dotnet_diagnostic.REQ103.severity = none          # suppress info for clean output

# ── Specification Implementation ──────────────────────────
dotnet_diagnostic.REQ200.severity = error
dotnet_diagnostic.REQ201.severity = warning
dotnet_diagnostic.REQ202.severity = suggestion
dotnet_diagnostic.REQ203.severity = none

# ── Test Coverage ─────────────────────────────────────────
dotnet_diagnostic.REQ300.severity = error
dotnet_diagnostic.REQ301.severity = warning        # allow missing tests during dev
dotnet_diagnostic.REQ302.severity = error          # stale tests are always errors

# ── Quality Gates ─────────────────────────────────────────
dotnet_diagnostic.REQ400.severity = error
dotnet_diagnostic.REQ401.severity = warning
dotnet_diagnostic.REQ402.severity = warning

# ── Performance Budgets ───────────────────────────────────
dotnet_diagnostic.PERF100.severity = none
dotnet_diagnostic.PERF101.severity = suggestion    # not all features need perf tests
dotnet_diagnostic.PERF102.severity = warning
dotnet_diagnostic.PERF103.severity = error
dotnet_diagnostic.PERF104.severity = error
dotnet_diagnostic.PERF105.severity = warning       # perf without functional = risky

# ── Security Policies ─────────────────────────────────────
dotnet_diagnostic.SEC100.severity = error          # missing auth is always an error
dotnet_diagnostic.SEC101.severity = warning
dotnet_diagnostic.SEC102.severity = warning
dotnet_diagnostic.SEC103.severity = error          # leaking PII is always an error
dotnet_diagnostic.SEC104.severity = error

# ── Architecture Constraints ──────────────────────────────
dotnet_diagnostic.ARCH100.severity = error         # layer violations break architecture
dotnet_diagnostic.ARCH101.severity = warning
dotnet_diagnostic.ARCH102.severity = error         # circular deps are always errors
dotnet_diagnostic.ARCH103.severity = warning
dotnet_diagnostic.ARCH104.severity = error         # never hand-edit generated code

# ── Operational Readiness ─────────────────────────────────
dotnet_diagnostic.OPS100.severity = error          # no health check = no deploy
dotnet_diagnostic.OPS101.severity = warning
dotnet_diagnostic.OPS102.severity = error
dotnet_diagnostic.OPS103.severity = warning
dotnet_diagnostic.OPS104.severity = error

# ── Per-project overrides ─────────────────────────────────
# In test projects, relax architecture rules
[*Tests/**/*.cs]
dotnet_diagnostic.ARCH100.severity = suggestion
dotnet_diagnostic.ARCH101.severity = none

# In prototype projects, relax everything to warnings
[*Prototype/**/*.cs]
dotnet_diagnostic.REQ100.severity = warning
dotnet_diagnostic.REQ300.severity = warning
dotnet_diagnostic.SEC100.severity = warning

A Single Build Output

With the unified analyzer framework, a single dotnet build reports ALL validation results:

$ dotnet build MyApp.sln

Build started...

# ── Requirement Coverage ──────────────────────────────────
error REQ100: JwtRefreshStory has 2 ACs but no specification interface
warning REQ102: AssignRoleStory has no specification (small story, acceptable)
info REQ103: OrderProcessingFeature — all ACs fully specified ✓
info REQ103: PasswordResetFeature — all ACs fully specified ✓

# ── Specification Implementation ──────────────────────────
error REQ200: IJwtRefreshSpec is not implemented by any class
info REQ203: IOrderProcessingSpec — fully implemented ✓
info REQ203: IPasswordResetSpec — fully implemented ✓

# ── Test Coverage ─────────────────────────────────────────
warning REQ301: UserRolesFeature.AdminCanRevokeRoles has no [Verifies] test
info REQ303: OrderProcessingFeature — all ACs fully tested ✓
info REQ303: PasswordResetFeature — all ACs fully tested ✓

# ── Performance Budgets ───────────────────────────────────
info PERF100: OrderProcessingFeature — P95=200ms, P99=500ms configured ✓
warning PERF105: PasswordResetFeature has [PerformanceTest] but AC
                 'ResetLinkCanOnlyBeUsedOnce' has no [Verifies] test

# ── Security Policies ─────────────────────────────────────
error SEC100: OrdersController.CancelOrder missing [Authorize] attribute
warning SEC102: OrdersController.CreateOrder accepts OrderDto without
                [ValidateInput] attribute
info SEC105: PaymentController — all security policies satisfied ✓

# ── Architecture Constraints ──────────────────────────────
error ARCH100: MyApp.Domain.OrderService references
               MyApp.Infrastructure.SqlOrderRepository directly
info ARCH105: MyApp.Api — architecture constraints satisfied ✓

# ── Operational Readiness ─────────────────────────────────
error OPS100: PaymentService has no /health endpoint
warning OPS103: CommerceDeployment has [DeploymentStrategy] but no
                [RollbackCondition] defined

Build failed.
  5 error(s): REQ100, REQ200, SEC100, ARCH100, OPS100
  5 warning(s): REQ102, REQ301, PERF105, SEC102, OPS103
  9 info(s): REQ103(x2), REQ203(x2), REQ303(x2), PERF100, SEC105, ARCH105

The Spec-Driven Equivalent

To get the same validation coverage with the spec-driven approach, you need:

# .github/workflows/quality.yml — 8 separate tools
name: Quality Gates
on: [push]

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - run: dotnet test --collect:"XPlat Code Coverage"
      - run: reportgenerator -reports:**/coverage.cobertura.xml
      # Output: coverage.cobertura.xml → parse for threshold check
      # Does NOT know which ACs are covered

  security:
    runs-on: ubuntu-latest
    steps:
      - run: snyk test --severity-threshold=high
      # Output: snyk-report.json → separate format from coverage
      # Does NOT know which features are security-critical

  architecture:
    runs-on: ubuntu-latest
    steps:
      - run: dotnet test --filter "Category=Architecture"
      # Output: test-results.xml → yet another format
      # Does NOT know which features violate constraints

  performance:
    runs-on: ubuntu-latest
    steps:
      - run: k6 run load-test.js
      # Output: k6-results.json → yet another format
      # Does NOT know which features have performance budgets

  mutation:
    runs-on: ubuntu-latest
    steps:
      - run: dotnet stryker
      # Output: mutation-report.html → yet another format
      # Does NOT know which features need mutation testing

  # ... 3 more jobs for docs, dependencies, operational readiness

  aggregate:
    needs: [coverage, security, architecture, performance, mutation]
    steps:
      - run: python check_all_gates.py  # Custom aggregation script
      # Must parse 5+ different output formats
      # Must correlate results across tools (which tool found which problem?)
      # Must decide: does the combination of results pass or fail?
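
For contrast, a sketch of the typed-specification equivalent of this entire workflow (the workflow shape and solution path are placeholders): one job, because the compiler carries all eight concern families and the severities live in the single .editorconfig:

```yaml
# .github/workflows/quality.yml -- typed-specification equivalent (sketch)
name: Quality Gates
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: dotnet build MyApp.sln
      # One command, one output format (MSBuild diagnostics).
      # REQ/PERF/SEC/ARCH/OPS analyzers all run inside the compiler;
      # no aggregation script, because there is nothing to aggregate.
```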

The comparison:

Dimension               │ 8 CI Tools (Spec-Driven)          │ Unified Analyzers (Typed)
────────────────────────┼───────────────────────────────────┼─────────────────────────────────────
Configuration files     │ 8 (each tool's format)            │ 1 (.editorconfig)
Output formats          │ 4 (XML, JSON, HTML, TXT)          │ 1 (MSBuild diagnostic format)
Cross-concern linking   │ None (tools are independent)      │ Built-in (all share type system)
Feature-level reporting │ None (tool-level reporting)       │ Per-feature, per-AC
Aggregation             │ Custom script needed              │ Built into dotnet build
IDE integration         │ External (run CI, read report)    │ Native (squiggly underlines)
Timing                  │ After push (minutes)              │ During typing (milliseconds)
New concern cost        │ New tool + config + parser        │ New analyzer family (same framework)
Dashboard               │ 8 separate or 1 aggregator        │ Build output + IDE
Failure correlation     │ Manual ("which tool found this?") │ Automatic (diagnostic ID)

The typed approach doesn't just unify validation output. It unifies the validation model. Every concern references the same type system. A PERF105 warning ("performance test without functional tests") can exist because the performance analyzer can see the [Verifies] attributes from the test analyzer. A SEC100 error ("missing [Authorize]") can reference the feature it affects because the security analyzer can see the [ForRequirement] attribute on the controller. Cross-concern validation isn't a feature — it's a consequence of sharing a type system.

Eight separate CI tools cannot cross-reference because they don't share a data model. They share a pipeline, but not a language.


The False Dichotomy: Breadth vs Depth

The standard framing of the comparison goes like this: "Spec-driven has breadth (covers everything). Typed specifications have depth (enforces the requirement chain). Choose based on whether you need breadth or depth."

This framing is wrong. It's a false dichotomy. And accepting it sells the typed approach short.

The Dichotomy as Presented

The "Standard" Comparison (FALSE):

                    Breadth
                      ↑
                      │
   Spec-Driven ●      │
   (15 concerns,      │
    0 enforced)       │
                      │
                      │
                      │              ● Typed Specs
                      │              (1 concern,
                      │               1 enforced)
                      │
                      └──────────────────→ Depth

This diagram suggests a tradeoff: you can have broad coverage with no enforcement, or deep enforcement with narrow coverage. The implication is that both are valid choices depending on your needs.

But this is not a real tradeoff. It's a snapshot of a transitional state.

The Real Picture

The typed approach is not limited to one concern. It started with one concern (the requirement chain) because that's the most valuable concern to enforce first. But the architecture — attributes, source generators, Roslyn analyzers, the M3 metamodel — supports unlimited concerns. Every domain that the spec-driven approach covers with text, the typed approach can cover with a DSL.

The REAL Comparison:

                    Breadth
                      ↑
                      │
   Spec-Driven ●      │           ● Typed Specs (future)
   (15 concerns,      │           (15 concerns,
    0 enforced)       │            15 enforced)
                      │          /
                      │         /  ← Each new DSL adds
                      │        /     a concern WITH enforcement
                      │       /
                      │      /
                      │     ● Typed Specs (today)
                      │     (3-5 concerns,
                      │      3-5 enforced)
                      │    /
                      │   /
                      │  ● Typed Specs (Day 1)
                      │  (1 concern,
                      │   1 enforced)
                      │
                      └──────────────────→ Depth

   Key insight: The typed approach MOVES. The spec-driven approach STAYS.

The spec-driven approach starts broad and stays broad. Adding a new concern means writing another text document. That document has zero enforcement on Day 1 and zero enforcement on Day 1,000. The breadth never converts to depth.

The typed approach starts narrow and grows. Adding a new concern means building a DSL. That DSL has full enforcement from the moment it's built. Over time, the typed approach accumulates breadth AND depth. It converges toward the spec-driven approach's breadth while maintaining the enforcement the spec-driven approach cannot provide.

The Real Dichotomy

The actual choice is not breadth vs depth. It's enforced vs described:

The REAL Dichotomy:

                    Concerns
                      ↑
                      │
                       │  ┌───────────────────────────────┐
                       │  │   DESCRIBED (spec-driven)     │
                       │  │                               │
                       │  │  - Performance targets        │
                       │  │  - Security policies          │
                       │  │  - Architecture constraints   │
                       │  │  - Testing strategies         │
                       │  │  - Monitoring alerts          │
                       │  │  - Deployment strategies      │
                       │  │  - Chaos experiments          │
                       │  │  - Compliance requirements    │
                       │  │                               │
                       │  │  Status: Text. No compiler.   │
                       │  │  Drift: Inevitable.           │
                       │  │  Verification: Manual.        │
                       │  └───────────────────────────────┘
                       │
                       │  ┌───────────────────────────────┐
                       │  │   ENFORCED (typed DSLs)       │
                       │  │                               │
                       │  │  - [ForRequirement] chain     │
                       │  │  - [PerformanceBudget]        │
                       │  │  - [SecurityPolicy]           │
                       │  │  - [Layer] constraints        │
                       │  │  - [MutationTarget]           │
                       │  │  - [Alert] definitions        │
                       │  │  - [DeploymentStrategy]       │
                       │  │  - [ChaosExperiment]          │
                       │  │                               │
                       │  │  Status: Compiled. Type-safe. │
                       │  │  Drift: Impossible.           │
                       │  │  Verification: Automatic.     │
                       │  └───────────────────────────────┘
                      │
                      └──────────────────────────────→ Enforcement

   Every concern in the DESCRIBED box can move to the ENFORCED box.
   No concern in the ENFORCED box ever moves back.
   The direction of travel is one-way: described → enforced.

This is the fundamental insight: describing a concern is the first step toward enforcing it. The spec-driven approach describes concerns and stops. The typed approach takes the additional step of building the DSL that enforces them.

"But building DSLs takes effort," the objection goes. Yes. But maintaining documents also takes effort — it's just invisible effort. The document maintenance doesn't produce error messages when it fails. It produces stale documents. Nobody knows they're stale until the wrong decision gets made based on outdated information.

The DSL maintenance is visible: a generator bug produces incorrect generated code, which produces a test failure, which gets fixed. The document maintenance is invisible: a stale performance target sits in a markdown file for six months until someone relies on it and deploys an endpoint that can't handle the load.

Visible maintenance is better than invisible maintenance. The same effort, spent on DSLs instead of documents, produces artifacts that cannot drift. That's not a tradeoff. That's a strictly better outcome.

What "Breadth" Really Means

When people say the spec-driven approach has "breadth," they mean it covers many concerns. But coverage without enforcement is inventory, not capability. A warehouse full of unplugged machines has "breadth" — it covers many manufacturing processes. But it doesn't manufacture anything.

The spec-driven Testing-as-Code specification describes 15+ testing strategies. How many does it enforce? Zero. The description is valuable as education — but education is a one-time transfer. Once the team knows about mutation testing, the document's ongoing value is zero. The ongoing value comes from enforcement: does the team ACTUALLY mutation-test? The document can't answer that. A [MutationTarget] analyzer can.

This is not an argument against documents. Documents are excellent for teaching. The spec-driven Testing-as-Code specification is a genuine contribution to the testing knowledge base. The argument is against stopping at documents. Teaching is the first step. Enforcement is the second step. The spec-driven approach takes the first step and declares victory. The typed approach takes both steps.

The Transitional Illusion

The spec-driven approach's "breadth advantage" is real on Day 1. The typed approach has one DSL; the spec-driven approach has 15 documented concerns. The gap is visible, and it feels like a fundamental difference.

But it's a transitional state, not a permanent one. Every month, the typed approach can add another DSL. Every month, a "described" concern becomes an "enforced" concern. The spec-driven approach's 15 concerns stay at zero enforcement month after month. The gap closes from one direction only: the typed approach gains breadth while keeping depth.

Breadth over time:

Month  │ Spec-Driven    │ Typed (enforced) │ Typed (total concerns)
───────┼────────────────┼──────────────────┼───────────────────────
  1    │ 15 described   │ 1 enforced       │ 1 of 15
  3    │ 15 described   │ 3 enforced       │ 3 of 15
  6    │ 15 described   │ 5 enforced       │ 5 of 15
 12    │ 15 described*  │ 8 enforced       │ 8 of 15
 18    │ 14 described** │ 11 enforced      │ 11 of 15
 24    │ 12 described** │ 15 enforced      │ 15 of 15

 *  Some spec-driven documents are now stale (never updated)
 ** Some spec-driven documents have been abandoned (nobody reads them)

The spec-driven "breadth" is also not stable. Documents require maintenance. Without it, they drift, become stale, and eventually get abandoned. The 15 concerns that are described on Day 1 might be 12 actively-maintained concerns by Month 24. Meanwhile, the typed approach has grown from 1 to 15 enforced concerns — each one guaranteed to be current because it's compiled, not maintained.

The question isn't "which approach has more breadth?" It's "which approach's breadth is still accurate in two years?" Documents decay. Types don't. The transitional advantage of document breadth is exactly that — transitional. The permanent advantage of type enforcement is exactly that — permanent.
