Chapter 19 — Lessons and Anti-Patterns
A DSL earns its keep not by what it lets you write, but by what it refuses to let you write.
The previous eighteen chapters have been constructive. They described what @frenchexdev/requirements models — Requirements, Features, decorators, styles, refinement graphs, compliance reports — and how each piece dog-foods the package's own ontology. This chapter is the negative print. It catalogues seven specific failure modes that other traceability approaches exhibit routinely, and shows how the DSL's type signatures rule each of them out by construction.
None of these seven anti-patterns is hypothetical. Each one rose from real pain — my own, mostly — in the predecessor typed-specs/ series, or in the string-based describe/it trackers that preceded it, or in the ticket-based Jira/Rally/Azure-DevOps layers above those. Each one cost hours. Each one is closed here by a type.
The chapter closes with the symmetric question: when is this DSL the wrong answer? That section is short, and deliberately honest. Not every project needs a Requirement stratum. A throwaway shell script does not. A two-file CLI does not. A team that will not dog-food its own tools does not. Knowing where the DSL stops being worth its cost is part of what makes the positive case credible.
A note on how the chapter is structured
Because the seven anti-patterns interact with each other in non-obvious ways, I have written each section to stand alone and also to refer, where useful, to sibling sections. Reading straight through from AP1 to AP7 gives the full taxonomy. Reading a single section in isolation — say, AP4 when triaging a specific stale-reference incident — gives the mechanism and the fix without requiring the full context. The cross-references are minimal but load-bearing where present.
The taxonomy diagram in the next section is the single best map of the chapter. If you read nothing else, reading the diagram and noting which mechanism a given symptom belongs to will orient you to the relevant section. The detailed arguments in each section are meant to support the diagram, not to replace it.
One more structural note: each anti-pattern section has a main argument and one or more short sub-sections that qualify, extend, or complicate the argument. The sub-sections exist because every one of these anti-patterns has edge cases that the main argument does not cover. I have learned, from earlier iterations of this writing, that presenting only the clean main argument invites the objection "but what about X?", and the sub-sections are there to address exactly those objections in advance. The chapter is longer than it would be if I aimed only at the clean cases, and that length is deliberate.
Seven anti-patterns, one picture
Before walking through them one by one, here is the taxonomy.
Five mechanisms. Seven anti-patterns. Seven fixes. The rest of this chapter takes each one in turn, shows the shape of the mistake, and quotes the specific type signature or scanner phase that rules it out.
The diagram groups the seven anti-patterns under five mechanisms on the left, lists the anti-patterns themselves in the middle, and names the type-level fix on the right. The groupings are not arbitrary. Each mechanism represents a distinct kind of failure: drift is the gradual decay of a correspondence, string-magic is the absence of a compiler-checked name, orphan is the missing binding between artefact and specification, stale is a dangling reference to a deleted entity, and collapse is the structural conflation of two distinct types into one.
Recognising the mechanism matters because the fix is mechanism-specific. Drift is closed by replacing strings with class references, wherever the correspondence would otherwise decay. String-magic is closed by keyof T generics, wherever a string argument should be constrained to a known set. Orphan is closed by mandatory bindings surfaced at build-time. Stale is closed by compile-time import resolution. Collapse is closed by introducing the missing type — in this DSL's case, the Requirement stratum that typed-specs never declared.
A note on the diagram's implicit claim: the five mechanisms form a small taxonomy that I believe generalises beyond this package. Any traceability tooling will face the same five failure modes in some form, because they are not artefacts of TypeScript but of the structural mismatch between free-form annotation and checked modelling. The DSL's specific fixes are TypeScript-flavoured; the mechanisms they fix are language-agnostic. I do not labour this point in the body of the chapter, but a reader transposing these ideas to a Rust, OCaml, or C# environment should expect the same five categories to apply, with language-specific variations on the fix column.
Anti-pattern 1 — describe/it drift
The default Jest/Vitest pattern is a free-form nested-string tree:
describe('user login', () => {
  describe('when credentials are valid', () => {
    it('redirects to dashboard', () => { /* ... */ });
    it('sets the session cookie', () => { /* ... */ });
  });
  describe('when password is wrong', () => {
    it('shows error', () => { /* ... */ });
  });
});
The shape is familiar. The problem is equally familiar: none of those strings is type-checked against anything. Two weeks later, someone renames the feature from user login to user authentication. The describe block keeps the old name, because nobody grepped for it. The PM's acceptance criterion was "show an error when password is wrong", but the test says "shows error": a paraphrase, not the spec. The test passes. The spec is elsewhere, in a ticket, written in different words. Drift begins on day one and accelerates.
I have three concrete observations from the years before typed-specs:
- The describe strings drift from the Feature name. Nobody notices, because nobody runs a report that correlates them.
- The it strings drift from the acceptance criterion. Nobody notices, because the acceptance criterion lives in a tracker, not in the repo.
- The nesting drifts from the architecture. A test nested three describe levels deep under controller > login > valid is indistinguishable at the compiler level from one nested under service > auth > success. Same type. Different meanings. No way to tell.
The DSL closes this by moving the binding from a string to a decorator with a class parameter:
@FeatureTest(FeatureTraceExplorerTuiFeature)
export class FeatureTraceExplorerTuiTest
  extends FeatureTraceExplorerTuiFeature {
  @Verifies<FeatureTraceExplorerTuiFeature>('traceExplorerBuildsGraph')
  traceExplorerBuildsGraph(): ACResult { /* ... */ }
}
FeatureTraceExplorerTuiFeature is a class symbol. It cannot drift. If the Feature is renamed, the import breaks, the type breaks, the build fails. The drift window collapses from "weeks until someone notices" to "the keystroke that broke the import".
This is the single largest ergonomic change from a describe/it world to a decorator-with-class-binding world. Everything else in this chapter is, in some sense, a variation on the same move — replace strings that the compiler cannot see with type-level handles that it can.
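The string-to-class move can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: Feature, LoginFeature, FeatureClass, bindings, and the FeatureTest signature below are all hypothetical stand-ins, and the decorator factory is applied as a plain function call so the sketch does not depend on any particular decorator compiler setting.

```typescript
// Hypothetical stand-ins; none of these declarations are copied from the package.
abstract class Feature {
  readonly kind = 'feature';
}

abstract class LoginFeature extends Feature {
  abstract redirectsToDashboard(): boolean;
}

// Constructor type that also accepts abstract classes (TypeScript 4.2+).
type FeatureClass = abstract new (...args: any[]) => Feature;

// The registry is keyed by the test class object and stores the Feature class object.
const bindings = new Map<object, FeatureClass>();

// Assumed decorator-factory shape: the argument must be a class extending Feature.
function FeatureTest(feature: FeatureClass) {
  return function <C extends object>(testClass: C): C {
    bindings.set(testClass, feature);
    return testClass;
  };
}

class LoginTest extends LoginFeature {
  redirectsToDashboard(): boolean {
    return true;
  }
}

// Applied as a plain call so the sketch does not depend on decorator compiler settings;
// with decorators enabled this would read `@FeatureTest(LoginFeature)` above the class.
FeatureTest(LoginFeature)(LoginTest);

// @ts-expect-error a free-form string is not a Feature class, so the old habit does not compile
const rejected = () => FeatureTest('user login');
```

Because the registry stores the class object rather than a name, a rename of LoginFeature either propagates through the editor's rename refactoring or breaks every reference at once; there is no string left to fall out of sync.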
How the drift actually manifests
It is worth being concrete about the shape drift takes in practice, because the abstract form ("strings get out of sync") does not convey how routine the breakdown is. In the three years I spent on the project that preceded typed-specs, I counted at least five distinct drift patterns that surfaced on post-mortems:
- The rename trail. A Feature renamed from Login to Authentication leaves seventeen describe('Login', ...) blocks untouched. The grep that would catch them is easy to write in retrospect; nobody wrote it before the rename landed.
- The refactor trail. A method validateCredentials() refactored into validateEmail() + validatePassword() leaves the old it('validates credentials', ...) block covering the new split behaviour, with a single assertion that passes because one half happens to still work.
- The copy-paste trail. A describe block cloned for a sibling Feature, with the outer title updated but the inner it strings forgotten, so the new Feature's tests are labelled with the parent's acceptance criteria.
- The translation trail. A team with non-English-speaking contributors where the describe titles drift between English (on the tickets) and the local language (in the tests), and the compliance report, which reads the tickets, cannot find the matching strings.
- The seniority trail. A junior developer who writes it('works', () => ...) because they do not know the domain term, and no reviewer corrects it because the test passes and the PR has six other things to review.
Each of these five is a specific failure that decorator-with-class-binding either eliminates mechanically or turns into a visible discrepancy. The rename trail breaks the import. The refactor trail breaks the keyof T check. The copy-paste trail is pinned down by the @FeatureTest(FeatureClass) decorator, because the class cannot be two Features at once; a copy that keeps the wrong class argument still compiles, but it shows up in the compliance report. The translation trail never arises because the Feature is a class, not a sentence. The seniority trail fails because it('works', () => ...) is not a legal shape: there is no @Verifies<FeatureClass>('works') unless the Feature declares a method by that name, which no reviewer would approve.
One subtlety on the copy-paste trail worth unpacking: the @FeatureTest decorator's class argument is structurally distinct per Feature. Two Features cannot, by construction, share the same class. If a developer copies a test file and forgets to update the @FeatureTest(X) argument, the copied tests still compile and bind to the source Feature. The tests are then bound to the wrong Feature, but the DSL's position here is that this is now a bug the compliance report catches: the source Feature appears to have more tests than it should (the target Feature's copied tests), and the target Feature appears to have fewer (its tests are all pointed at the source). Both discrepancies surface in the report. The copy-paste mistake is not invisible; it has a specific signature that the report makes visible.
This is weaker than the hard-type fix of the rename and refactor cases — the compiler does not catch it directly — but it is still stronger than the string-based world, where copied test files with stale titles are invisible to any automated check. The DSL's seven fixes are not uniformly strong. Some are compile-time enforced (AP1 rename, AP2 AC name, AP4 stale class reference). Some are build-time enforced via the compliance gate (AP3 orphan, AP5 requirement-less Feature, AP6 feature-less Requirement). Some are surfacing-rather-than-blocking (the copy-paste sub-case above). The tiered strictness is deliberate: the DSL aims to block the categorical mistakes and surface the intermediate ones, not to make every mistake impossible.
The point is not that the DSL is smarter than the reviewer. The point is that the DSL reduces the number of things the reviewer has to notice. Reviewers miss things. Types do not.
An aside on string-tree conventions
It is worth acknowledging that the JavaScript community has, for a decade, attempted to make describe/it work through sheer discipline. Style guides prescribe titles of the form describe('WhenX', () => { it('ShouldY', ...) }). Linters flag missing describe blocks. Some teams generate compliance reports from test titles via regex.
Each of these interventions is an incremental improvement on the raw describe/it surface, and none of them closes the mechanism at its root. The style-guide discipline drifts under deadline pressure. The linter can only check what the author wrote, not what was meant. The regex-based compliance report is vulnerable to the same drift that motivates it.
The DSL's position is that these partial fixes are well-intentioned but miscalibrated. The mechanism is not "tests lack structure"; it is "tests lack a type-level binding to what they claim to cover". Addressing the binding requires moving from string to class reference, which is exactly what @FeatureTest(FeatureClass) does. The incremental fixes all stop short of this move because they accept, as a premise, that the free-form string surface is load-bearing. Giving up the premise is the move that closes the category.
This is, incidentally, why the DSL uses @FeatureTest as a class-level decorator rather than a file-level convention. A file-level convention ("every test file under test/features/ must bind to a Feature") would be enforceable by a linter but only weakly. A class-level decorator is checked at every occurrence, by the compiler, without any separate enforcement step. The class-level grain is the one at which the binding is mechanical rather than procedural.
A brief history of the string-tree alternatives
For completeness, the chronology of attempted fixes within the describe/it paradigm is worth recording, because it clarifies why the DSL's clean-break approach is the right one:
- Phase 1: raw describe/it (2010s). Free-form strings. Drift is total.
- Phase 2: title conventions (2015-onward). describe('#methodName'), it('should X when Y'). Discipline improves but requires constant vigilance.
- Phase 3: decorator annotations with string arguments (2018-onward). @Covers('REQ-123') sprayed on test methods. The annotation is explicit but unchecked.
- Phase 4: decorator annotations with keyof T (2022-onward, typed-specs). The AC name is type-checked against the Feature's methods. Drift on AC names closed. Drift on Requirement names still open.
- Phase 5: decorator annotations with class arguments (2026, present DSL). Every reference is a class symbol. Every drift vector closed at the type level.
Each phase improved on the previous one. Each phase was, at the time, the state of the art. The present DSL is not a break with this progression; it is its continuation. The pattern, across the five phases, is the same pattern: replace one more string with a type. The DSL is what the progression looks like when the last string has been replaced.
I do not claim the DSL is the end of the progression. There are further steps — generic constraints on @Satisfies to require a specific Requirement style, type-level enforcement of the refinement DAG's acyclicity, compile-time verification of AC-test assertion adequacy — that are beyond what TypeScript's type system can express today but may be expressible in a future version, or in a more dependently-typed language. The progression continues. The DSL is a snapshot at Phase 5, and the next snapshot will close failures the current one still leaves open.
Anti-pattern 2 — string-magic AC references
A partial fix of AP1, common in the intermediate regimes I went through before landing on the current shape, is to keep describe/it but add a decorator on the test method that records the acceptance criterion:
class UserLoginTest {
  @Verifies('redirectsToDashboard') // string argument
  testRedirect() { /* ... */ }
}
Better than nothing. Still broken. The 'redirectsToDashboard' string is a magic string. If the Feature renames the AC method to redirectsToHome, the test's @Verifies still says redirectsToDashboard, the compiler says nothing, and the compliance report says whatever it was taught to say, which is usually "this AC is covered" because the scanner does not cross-check.
The DSL closes this with a generic:
@Verifies<FeatureTraceExplorerTuiFeature>('traceExplorerBuildsGraph')
The <FeatureTraceExplorerTuiFeature> type parameter constrains the decorator's string argument to keyof FeatureTraceExplorerTuiFeature. The string 'traceExplorerBuildsGraph' is no longer free-form; it must be one member of the union of the Feature's abstract method names. Rename the method, and the decorator argument becomes a type error. No runtime, no scanner, no CI: the error fires in the editor, on the keystroke.
This is the direct continuity with typed-specs's original contribution. Chapter 03 of typed-specs showed the same keyof T pattern for the AC decorator. The present DSL keeps it unchanged and adds the equivalent class-handle pattern for Feature and Requirement references (@FeatureTest(FeatureClass), @Satisfies(Req1, Req2)). Same mechanism, applied to three different name-binding problems.
The cost of the fix is small: one generic parameter on the decorator. The benefit is large: the entire category of "rename a method, forget to rename the string somewhere else" disappears.
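A minimal sketch of how such a generic can be declared, assuming a hypothetical LoginFeature and a simplified Verifies signature (the package's real decorator may differ in its parameters and in what it records). The factory is applied manually to keep the example independent of decorator compiler settings.

```typescript
// Hypothetical Feature; the AC names are illustrative.
abstract class LoginFeature {
  abstract redirectsToDashboard(): boolean;
  abstract setsSessionCookie(): boolean;
}

// Records which AC names were claimed, standing in for the scanner's registry.
const claimed: string[] = [];

// Assumed signature: T is given explicitly and the argument must be a key of T.
function Verifies<T>(ac: keyof T & string) {
  return (_target: object, _methodName: string): void => {
    claimed.push(ac);
  };
}

class LoginTest extends LoginFeature {
  redirectsToDashboard(): boolean {
    return true;
  }
  setsSessionCookie(): boolean {
    return true;
  }
}

// Manual application, standing in for `@Verifies<LoginFeature>('redirectsToDashboard')`.
Verifies<LoginFeature>('redirectsToDashboard')(LoginTest.prototype, 'redirectsToDashboard');

// @ts-expect-error 'redirectsToHome' is not a key of LoginFeature
const stale = () => Verifies<LoginFeature>('redirectsToHome');
```

One caveat about this sketch: plain keyof T admits every public member name, including non-method properties such as an id field. A stricter variant could filter the keys down to function-typed members with a mapped type; whether the package does so is not stated here.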
Why keyof T and not an enum
A question I have been asked more than once: why not declare the acceptance criteria as an enum, and have @Verifies take an enum value? The answer reveals the constraint that shaped the DSL.
Enums would work for a single snapshot. They would not work for refinement. A Feature is an abstract class; its acceptance criteria are its abstract methods. When a subclass overrides one, the override is a verification — the test. When a sibling Feature refines a shared base, it inherits the ACs and the keyof T union inherits with it. An enum would have to be duplicated at every refinement level, and the duplication would drift.
The keyof T pattern inherits for free. If FeatureTraceExplorerTuiFeature extends InteractiveTuiFeature and adds three ACs, keyof FeatureTraceExplorerTuiFeature is the parent's AC union plus the three new ones. The @Verifies decorator on the test class picks up the extended union automatically. This is the compositional property that makes the DSL scale past a flat Feature list, and it is the reason the decorator takes keyof T rather than a separate enum.
The property is small in isolation, large in aggregate. At 22 Requirements, 25 Features, and 54 tests — the current repo's size — the number of @Verifies sites is in the low hundreds. If each one had to reference an enum maintained in parallel, the parallel maintenance cost would be the dominant engineering activity. Because keyof T composes with the class hierarchy, the cost is zero.
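The inheritance property is easy to check in isolation. In the sketch below the class names echo the example above, but the member names are invented for illustration, and acName is a hypothetical helper that carries the same constraint a keyof-based decorator argument would.

```typescript
// Member names below are invented for illustration.
abstract class InteractiveTuiFeature {
  abstract rendersFrame(): boolean;
  abstract handlesKeypress(): boolean;
}

abstract class TraceExplorerFeature extends InteractiveTuiFeature {
  abstract buildsGraph(): boolean;
}

// Same constraint a keyof-based decorator argument would carry, reduced to a plain function.
function acName<T>(name: keyof T & string): string {
  return name;
}

// Parent ACs are accepted on the child without being redeclared anywhere.
const inheritedAc = acName<TraceExplorerFeature>('rendersFrame');
const ownAc = acName<TraceExplorerFeature>('buildsGraph');

// @ts-expect-error the parent Feature does not gain the child's AC
const notOnParent = () => acName<InteractiveTuiFeature>('buildsGraph');
```

The union widens in one direction only: the child sees the parent's names, the parent never sees the child's. An enum-based design would need a second declaration to express the same relationship.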
A further note on editor feedback
A subtler benefit of keyof T over an enum: the editor's autocomplete narrows correctly. When you type @Verifies<FeatureTraceExplorerTuiFeature>(' — the cursor inside the open quote — the TypeScript language service offers exactly the abstract method names of that Feature, as autocomplete suggestions, in the editor pane. An enum would offer the entire enum, regardless of which Feature the decorator annotates. The narrowing is precise, per-decorator, driven by the class parameter.
This is the sort of affordance that does not show up in a pitch but compounds over a thousand writes. The developer adding a new test does not need to open the Feature file, scan its abstract methods, and type the method name from memory. The name is offered in the autocomplete list, scoped to exactly the right Feature, and a Tab key commits it. The risk of typos is zero. The risk of picking a method from the wrong Feature is zero — the wrong Feature's methods are not in the suggestion list at all.
I do not know how to quantify this affordance's value precisely. What I can say is that contributors to the package report discovering new Features by exactly this mechanism: they open a test file, start typing @Verifies<, let the autocomplete complete the Feature class name, then see the full AC list in the inner autocomplete. The DSL becomes, in effect, its own documentation — not through a separate docs site, but through the language service's understanding of the types. That is the ergonomic signal that the type-level approach has paid off.
The parallel case on @Satisfies
The same autocomplete story applies to @Satisfies, one level up. When a developer declares a new Feature and starts typing @Satisfies(, the language service offers every class that extends Requirement — no more, no less. The Requirement list is correctly scoped to the union of declared Requirements in the repo. No string matching, no partial spelling, no guessing at naming conventions. The list is complete and accurate by construction.
This scoping has a second benefit worth naming: the autocomplete list is self-maintaining. As the project's Requirement corpus grows, the list of suggestions grows with it. No maintenance step is required to keep the list up to date; the type system does it automatically. Contrast this with a string-enum approach, where the enum would have to be updated manually as new Requirements are added, and out-of-date enums are a known source of drift between the tracker and the code.
At 22 Requirements (the current repo's count), the autocomplete list is long but navigable. At 200 (a plausible size for a full enterprise product), it would be paginated and searchable by prefix. At 2,000 it would need additional tooling — grouping by status, filtering by style, narrowing by refinement parent — but the foundational affordance of "correctly-scoped class-symbol autocomplete" would still hold. The type-level approach scales with the corpus in a way that string-based enumeration does not.
Anti-pattern 3 — orphan tests
A test file with no @FeatureTest at the top:
// no @FeatureTest here
export class FeatureTraceExplorerTuiTest {
  @Verifies(/* ... */)
  traceExplorerBuildsGraph(): ACResult { /* ... */ }
}
This compiles. The tests run. The test runner reports them as passing. The compliance report, if the scanner is naive, reports them as… what? There is no Feature to correlate them with. They are orphans.
In a string-based world, orphans are silent. The test exists, passes, contributes to the green badge on the PR. But no Feature claims it. No acceptance criterion is covered. The PM reads the green badge as "everything works". Nothing works, because nothing is asked of the tests.
The DSL's compliance scanner, covered in Chapter 13 — Quality Gates and Compliance, has a dedicated phase that detects this. Phase 2 (registry build) records, for every class with a @Verifies decorator, the Feature it claims to belong to — and the claim is only valid if that class also has a @FeatureTest(FeatureClass) decorator. If it does not, the class lands in an "orphan tests" bucket. Phase 3 (gap detection) reports that bucket. Phase 4 renders it in the gate output.
The key point: the gate fails. It does not warn. Under compliance --strict mode, orphan tests are a non-zero exit code. The CI (which is the local build, since this site has no cloud CI) refuses the commit. You cannot ship an orphan test.
I note, in passing, that this is the kind of rule that is easy to agree with in principle and hard to enforce in a string-based world. "Every test should name the Feature it covers" is a policy. Policies are violated weekly. @FeatureTest(FeatureClass) is a type — violating it means the decorator is missing, which is a mechanical, grep-able condition the scanner catches in milliseconds.
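The orphan rule itself is small enough to sketch. The following is a simplified model, not the scanner's code: ScannedTestClass, findOrphans, and gateExitCode are hypothetical names, the input is a hand-written array rather than the result of an AST walk, and the non-strict exit code of 0 is an assumption.

```typescript
// Simplified stand-in for what a scan might extract per test class.
interface ScannedTestClass {
  className: string;
  featureTests: string[]; // Feature class names found in @FeatureTest; empty if absent
  verifies: string[]; // AC names found in @Verifies on the class's methods
}

// A class is an orphan when it claims ACs but names no Feature.
function findOrphans(classes: ScannedTestClass[]): string[] {
  return classes
    .filter((c) => c.verifies.length > 0 && c.featureTests.length === 0)
    .map((c) => c.className);
}

// Strict mode turns any orphan into a failing exit code (non-strict behaviour is assumed).
function gateExitCode(orphans: string[], strict: boolean): number {
  return strict && orphans.length > 0 ? 1 : 0;
}

const scanned: ScannedTestClass[] = [
  { className: 'BoundTest', featureTests: ['SomeFeature'], verifies: ['someAc'] },
  { className: 'OrphanTest', featureTests: [], verifies: ['someAc'] },
  { className: 'PlainHelper', featureTests: [], verifies: [] },
];

const orphans = findOrphans(scanned);
```

Note that PlainHelper is not reported: a class that claims no acceptance criteria has nothing to be orphaned from, so only OrphanTest lands in the bucket.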
The orphan-test false positive and why we do not worry about it
One concern raised by readers of Chapter 13: does the orphan-test check produce false positives on test fixtures, setup helpers, or parameterised test harnesses? The short answer is no, because the scanner's extraction is class-decorator-based, not file-based.
Phase 1 of the compliance scan walks the AST of every file under test/ and collects classes that carry the @Verifies decorator on one or more methods. A file containing only helpers — a testUtils.ts with plain functions, a fixtures/ directory with data objects — is invisible to the scan. Only the orphan case proper is reported: a class with @Verifies methods but no outer @FeatureTest. That shape is exactly the mistake the DSL means to catch, and nothing else.
Parameterised harnesses — a class that runs the same test body across multiple Features — are handled by allowing a single test class to declare multiple @FeatureTest decorators, one per Feature it parameterises. The registry indexes each as an independent binding; no orphan is reported. This is a seldom-used shape in practice, but its existence matters: it means the DSL does not force a one-test-class-per-Feature convention, only a binding-is-mandatory convention. Convention and flexibility, in the right balance.
The adjacency to test coverage tooling
A reader familiar with code-coverage tools — c8, istanbul, the built-in V8 coverage — may ask how the orphan-test check relates. They serve different purposes and complement rather than replace each other.
Code coverage answers the question: which lines of production code were executed by tests? It is agnostic about why those lines were executed. A test that exercises code but claims no Feature coverage still contributes to line-coverage metrics. That is a good thing for code-coverage's purpose, but it means code-coverage alone cannot detect orphans.
The orphan check answers the complementary question: which tests claim to cover a Feature, and is that claim structurally valid? It is agnostic about which lines were executed. A test can be perfectly bound (no orphan) and cover zero production lines (a bug in the test). A test can have high coverage (it happens to exercise a side-effect path) and still be an orphan (no one asked for this).
Both checks are needed, and the DSL's position is that they belong in the same gate. compliance --strict fails on orphans. vitest run --coverage fails on coverage below 98%. Both run in the same npm test invocation. Passing both is the condition for a clean build. Either alone is insufficient; together they close the two complementary gaps that a single check would leave open.
A fourth check that I considered and dropped
For completeness: I spent some time exploring a fourth check that would have sat between the orphan check and the coverage check — AC-assertion adequacy. The idea was that the scanner would detect @Verifies methods whose body contains only trivial assertions (expect(true).toBe(true)) or empty ACResult returns, and flag them as inadequate.
I ended up dropping this check for two reasons. First, the detection is heuristic, not structural — there is no type-level distinction between a trivial assertion and a load-bearing one, and the heuristic would produce false positives on legitimately-simple ACs that happen to require only a truthy check. Second, the mechanism of inadequacy is a semantic-drift mechanism (the one I noted earlier as unclosed by the DSL), and it belongs on the human-review side of the gate rather than the automated side.
I record this here because I think the design decision is a useful one. It is tempting to keep adding checks until every conceivable failure mode is caught. The cost of each additional check, though, is non-zero: more code to maintain, more edge cases to handle, more false positives to triage. The discipline of stopping at the seven mechanical checks — and accepting that semantic-drift-shaped failures remain a human responsibility — is part of what keeps the DSL's own compliance gate tractable. A gate that fires on too many false positives gets disabled. A gate that fires only on real structural failures gets respected.
Anti-pattern 4 — stale @Satisfies
The mirror image of AP2, at the Feature level. A Feature declares:
@Satisfies(
  ReqDiscoverableTraceabilityRequirement,
  ReqDogFoodRequirement,
  ReqObsoleteOneRequirement, // deleted from the repo last week
)
In a string-based world, 'REQ-OBSOLETE-ONE' could survive in the decorator list indefinitely. The scanner would either (a) report a non-existent Requirement as satisfied, or (b) silently drop it. Both are wrong. The first corrupts the compliance report with phantom coverage. The second hides an architectural decision: the Requirement was deleted, so why is the Feature still claiming to satisfy it?
The DSL closes this by making @Satisfies accept class references, not strings:
import { ReqObsoleteOneRequirement } from '../requirements/req-obsolete-one';
// ^ this import fails to resolve if the file was deleted
TypeScript catches stale @Satisfies at compile time. If a Requirement class is deleted, every Feature that imported it breaks. The developer who deletes the Requirement is forced to choose: re-home the Feature under a different Requirement, delete the Feature too, or restore the Requirement. No silent drift. No phantom coverage.
This anti-pattern is, in my experience, one of the most common sources of wrong compliance reports in ticket-based systems. Tickets are deleted, links remain, the "features satisfying this epic" report still lists them because the link is by ID, not by presence. The DSL shifts the integrity check from report-time cross-validation (which nobody trusts) to build-time import resolution (which nobody can override).
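A minimal sketch of a class-reference @Satisfies, under the assumption that Requirements share a common base class. Requirement, ReqDiscoverable, ReqDogFood, RequirementClass, and the Satisfies signature are hypothetical here, and the factory is applied as a plain call rather than with decorator syntax.

```typescript
// Hypothetical stand-ins for the package's base class and two Requirements.
abstract class Requirement {
  readonly kind = 'requirement';
}
class ReqDiscoverable extends Requirement {}
class ReqDogFood extends Requirement {}

type RequirementClass = abstract new (...args: any[]) => Requirement;

// Feature class object -> Requirement class objects it claims to satisfy.
const satisfiedBy = new Map<object, RequirementClass[]>();

// Assumed shape: a variadic list of Requirement classes, never strings.
function Satisfies(...requirements: RequirementClass[]) {
  return function <C extends object>(feature: C): C {
    satisfiedBy.set(feature, requirements);
    return feature;
  };
}

abstract class TraceExplorerFeature {}

// Plain call standing in for `@Satisfies(ReqDiscoverable, ReqDogFood)`.
Satisfies(ReqDiscoverable, ReqDogFood)(TraceExplorerFeature);

// @ts-expect-error a ticket-style ID string is not a Requirement class
const byString = () => Satisfies('REQ-OBSOLETE-ONE');

// @ts-expect-error ReqObsoleteOne is not declared anywhere, as if its file had been deleted
const byDeletedClass = () => Satisfies(ReqObsoleteOne);
```

The last two lines show the two ways a stale reference can be written, and both are rejected before anything runs: the string fails the parameter type, and the deleted class fails name resolution.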
The archival case
A legitimate objection to the class-reference approach: what if a Requirement is archived rather than deleted? An approved Requirement that was delivered, then deprecated, but whose historical Feature satisfiers should remain traceable — how does the DSL express that?
The answer is the @RequirementMeta({ status: 'archived' }) state, introduced in Chapter 05 and expanded in Chapter 13c. An archived Requirement remains in the repo as a class. Its file is kept, its imports resolve, and its historical Features still name it in @Satisfies. The compliance scanner treats status: 'archived' specially: it is not required to have any active Feature satisfying it (AP6 does not apply), and Features satisfying it are not required to have an active deliverable status.
This matters because deletion is destructive in a way that archival is not. Deleting a Requirement erases the historical relation; archival preserves it. The DSL's affordance here is that both are expressible, and the compliance gate differentiates them. A Requirement that is no longer relevant should be archived, not deleted. A Requirement that was never relevant (declared by mistake, discarded before approval) can be safely deleted.
The distinction rewards the discipline of treating the Requirement corpus as an append-mostly log, and punishes the shortcut of treating it as freely mutable. That shape is intentional.
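How a gap check might treat the archived state can be sketched as a single filter. This is an illustration under assumptions: RequirementEntry and featureLessRequirements are hypothetical names, only the 'archived' status value is taken from the text, and the registry is a hand-written array rather than scanner output.

```typescript
// Only 'archived' is taken from the text; the other status values are assumptions.
type RequirementStatus = 'draft' | 'approved' | 'archived';

interface RequirementEntry {
  id: string;
  status: RequirementStatus;
  satisfiedBy: string[]; // IDs of Features whose @Satisfies lists this Requirement
}

// Feature-less Requirements are gaps, except when the Requirement is archived.
function featureLessRequirements(entries: RequirementEntry[]): string[] {
  return entries
    .filter((r) => r.status !== 'archived' && r.satisfiedBy.length === 0)
    .map((r) => r.id);
}

const requirementRegistry: RequirementEntry[] = [
  { id: 'REQ-ACTIVE-COVERED', status: 'approved', satisfiedBy: ['FEATURE-A'] },
  { id: 'REQ-ACTIVE-UNCOVERED', status: 'approved', satisfiedBy: [] },
  { id: 'REQ-RETIRED', status: 'archived', satisfiedBy: [] },
];

const gaps = featureLessRequirements(requirementRegistry);
```

With this input only REQ-ACTIVE-UNCOVERED is reported. REQ-RETIRED stays in the registry, so its history remains traceable, but it no longer counts against the gate.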
What the git history looks like on this constraint
A practical consequence of the class-reference binding and the archival-not-deletion policy: the git history of the requirements/ directory becomes a readable record of what the product promised, in what order. Each commit that adds a Requirement file is a commit that records a business decision. Each commit that archives one is a commit that records a decision to deprecate. Each commit that renames one is, in the DSL, a compile-error-forcing event that touches every dependent Feature.
This is, in my experience, the single clearest operational artefact of the DSL. git log --follow requirements/req-discoverable-traceability.ts tells you, in one command, when the Requirement was declared, by whom, under what commit message, and what subsequent refinements or status changes it went through. No separate audit trail is needed because the source-controlled file is the audit trail.
A ticket-based tracker has the same information in principle — Jira has a change history per ticket — but the information is siloed in the tracker's database and requires an API call or a manual export to surface. The DSL's information is in the repo, in the tool every engineer already uses, correlated by commit with the code changes that enacted the decision. Correlation is automatic. The tracker's correlation is a separate manual step, or a bespoke integration, or absent entirely.
I mention this because it is the kind of benefit that is invisible from outside and routine from inside. Nothing in the DSL's pitch emphasises "audit trail is free". It is free because the tool's state lives in source control, and source control has audit trails as a built-in property. The DSL inherits that property by living where it lives. That inheritance is one of the quieter reasons the architecture works.
Anti-pattern 5 — requirement-less features
A Feature with no @Satisfies list:
export abstract class FeatureTraceExplorerTuiFeature extends Feature {
// no @Satisfies — which Requirement is this serving?
readonly id = 'FEATURE-TRACE-EXPLORER-TUI';
/* ... */
}
This compiles. It may even have tests. It may even pass those tests. What it does not have is a reason. Nobody asked for this Feature. There is no Requirement in the registry that cites this Feature as its satisfier. It is a capability the team built because someone thought it was a good idea, without a recorded business ask.
I want to be careful here. I am not saying requirement-less Features are always wrong. Sometimes the team has intuition that outruns the tracker, and the Feature is built first and justified later. Chapter 01b (01b-historical-path-feature-first-requirement-later.md) is entirely about that sequence. The point is that a requirement-less Feature should be a known state that the system surfaces, not a silent default.
compliance --strict surfaces it. Phase 3 of the scanner lists every Feature whose @Satisfies decorator is empty or absent. The gate fails. The report names the Feature by ID. The developer is forced to either:
- add the Requirement that the Feature satisfies (the honest path, when the business ask can be articulated after the fact)
- mark the Feature as experimental via @Exclude() with a reason (the honest path, when the Feature is a spike or prototype)
- delete the Feature (the honest path, when the intuition turned out to be wrong)
All three are legitimate. None of them is silent. This is the distinction between capturing a state and hiding a state, and the DSL's job is the first, not the second.
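The three outcomes can be sketched as a small classification pass. This is a minimal illustration under stated assumptions, not the package's scanner: `FeatureDecl`, `StrictVerdict`, and `checkRequirementLess` are hypothetical names, the data is assumed to have been extracted from the decorators already, and `FEATURE-SPIKE` and `FEATURE-BOUND` are invented IDs.

```typescript
// Hypothetical, simplified model: a real scanner would read these
// values from the @Satisfies and @Exclude decorators in the AST.
interface FeatureDecl {
  id: string;
  satisfies: string[];      // Requirement IDs listed in @Satisfies
  excludedReason?: string;  // present when the Feature carries @Exclude(reason)
}

interface StrictVerdict {
  requirementLess: string[];                   // these fail the gate
  excluded: { id: string; reason: string }[];  // these pass, but are reported
  passed: boolean;
}

function checkRequirementLess(features: FeatureDecl[]): StrictVerdict {
  const requirementLess: string[] = [];
  const excluded: { id: string; reason: string }[] = [];
  for (const f of features) {
    if (f.satisfies.length > 0) continue; // bound to a Requirement: nothing to flag
    if (f.excludedReason !== undefined) {
      excluded.push({ id: f.id, reason: f.excludedReason });
    } else {
      requirementLess.push(f.id);
    }
  }
  return { requirementLess, excluded, passed: requirementLess.length === 0 };
}

const verdict = checkRequirementLess([
  { id: 'FEATURE-TRACE-EXPLORER-TUI', satisfies: [] },
  { id: 'FEATURE-SPIKE', satisfies: [], excludedReason: 'spike pending a decision' },
  { id: 'FEATURE-BOUND', satisfies: ['REQ-DOG-FOOD'] },
]);
```

Here `verdict.passed` is false because one Feature has neither a Requirement nor an exclusion. The excluded Feature does not fail the gate, but it still appears in the result, which is the "known state, not silent default" property in miniature.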
Experimental features and the @Exclude() escape valve
The middle option above — marking a Feature as experimental via @Exclude() — deserves a closer look, because it is the part of the DSL most often misread as "an escape hatch to skip the gate".
It is not. @Exclude() takes a mandatory reason string, which the compliance report surfaces verbatim. The excluded Feature still appears in the report, in a dedicated Excluded section, with its reason and its exclusion date. The gate passes, but the report is not silent about the exclusion. Anyone reading the report sees exactly what was excluded and why.
This is the shape the DSL uses for spike branches, PoCs, and feature flags that ship code behind a runtime switch without yet having a Requirement story. The cost is not zero — the reason string has to be written, the exclusion is visible in the report, the next stakeholder meeting may ask about it — but the cost is paid once, at the moment of exclusion, not continuously as the Feature sits in an ambiguous half-tested state.
The anti-pattern is using @Exclude() with an empty reason, or with a placeholder reason like "TODO" or "WIP". The scanner flags empty or placeholder reasons as their own small anti-pattern (AP5b, if you like), failing the gate until the reason is real. This is a small detail, but it is the kind of small detail that determines whether the exclusion mechanism is an accountability tool or a silence tool. The DSL means it as the first.
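A reason check of this kind can be very small. The sketch below is an assumption about its shape, not the scanner's actual rule set: `isRealExclusionReason` is a hypothetical helper, and the placeholder list beyond "TODO" and "WIP" is my own guess at what such a list might contain.

```typescript
// Assumed placeholder vocabulary. Only "todo" and "wip" come from the text;
// "tbd" and "fixme" are illustrative additions.
const PLACEHOLDER_REASONS: string[] = ['todo', 'wip', 'tbd', 'fixme'];

function isRealExclusionReason(reason: string): boolean {
  // Normalise case, surrounding whitespace, and trailing punctuation
  // so that " TODO: " and "wip." are still recognised as placeholders.
  const normalised = reason.trim().toLowerCase().replace(/[.!:]+$/, '');
  if (normalised.length === 0) return false;
  return PLACEHOLDER_REASONS.indexOf(normalised) === -1;
}
```

Note that the check deliberately does not look at length: a terse reason such as "spike" is accepted, and only empty or placeholder text is rejected.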
An observation on the shape of @Exclude() reasons in practice
Reading the exclusion reasons across the project's history tells a story. The earliest exclusions were terse: "prototype", "spike", "investigation". As the project matured, the reasons lengthened: "held pending REQ-ACCESSIBILITY-AUDIT decision on keyboard navigation semantics", "deferred to Phase 7c pending schema migration in the refinement graph". The length of the reason correlates, roughly, with the depth of the consideration behind the exclusion.
This is not a metric I optimise for — there is no check on reason length — but it is an emergent property of the practice. A short reason often indicates a spike that should graduate or be deleted. A long reason indicates a Feature that has thought about its exclusion and is waiting on a specific event. Both are legitimate states; the length distinguishes them without requiring a separate status field.
I would not build policy around reason length. I do, however, use it as a self-diagnostic when auditing the project's exclusion list. A list with many short reasons is a list that needs pruning. A list with many long reasons is a list that is holding itself accountable. The DSL surfaces both; the judgement of what to do with either is the engineer's.
Anti-pattern 6 — feature-less requirements
The symmetric case. An approved Requirement with no Feature satisfying it:
@RequirementMeta({ status: 'approved', style: 'industrial' })
export class ReqDiscoverableTraceabilityRequirement extends Requirement {
/* no Feature in the repo lists this in its @Satisfies */
}
This is the purer failure mode, because it is the one where the business asked, signed off, approved, and nothing was built. In a ticket-based system, this state is routine. Approved stories sit in the backlog for quarters. Nobody tracks which approved ones have zero linked Features, because the link is one-way (ticket → code) and nobody reads it from the other direction.
The DSL makes the reverse lookup the primary one. Chapter 07 (07-feature-requirement-many-to-many.md) describes the bidirectional registry: the scanner indexes @Satisfies in both directions, so Requirement → Features is as cheap as Feature → Requirements. Phase 3 of the compliance scan walks every status: 'approved' Requirement and asks: which Features satisfy this? If the answer is "none", the Requirement is feature-less, and compliance --strict fails.
This is, for me, the most important single check in the package. The asymmetry it corrects — code that nobody asked for is easier to notice than asks that nobody answered — is the asymmetry that ticket-based traceability most often gets wrong. Inverting it into a first-class report, gated by the build, is a large part of what the DSL is for.
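The reverse lookup is easy to picture as an inverted index. The following is a minimal sketch under assumptions: `RequirementDecl`, `FeatureDecl`, `indexSatisfiers`, and `featureLessRequirements` are hypothetical names, only the `'approved'` status comes from the text (the other status values are illustrative), and `REQ-DRAFT-IDEA` is an invented ID.

```typescript
type ReqStatus = 'draft' | 'approved' | 'archived'; // only 'approved' is from the text

interface RequirementDecl { id: string; status: ReqStatus }
interface FeatureDecl { id: string; satisfies: string[] }

// Invert Feature -> Requirements into Requirement -> Features.
function indexSatisfiers(features: FeatureDecl[]): Record<string, string[]> {
  const byRequirement: Record<string, string[]> = Object.create(null);
  for (const f of features) {
    for (const reqId of f.satisfies) {
      (byRequirement[reqId] = byRequirement[reqId] || []).push(f.id);
    }
  }
  return byRequirement;
}

// Approved Requirements that no Feature lists in its @Satisfies.
function featureLessRequirements(
  requirements: RequirementDecl[],
  features: FeatureDecl[],
): string[] {
  const byRequirement = indexSatisfiers(features);
  return requirements
    .filter(r => r.status === 'approved')
    .filter(r => (byRequirement[r.id] || []).length === 0)
    .map(r => r.id);
}

const unanswered = featureLessRequirements(
  [
    { id: 'REQ-DISCOVERABLE-TRACEABILITY', status: 'approved' },
    { id: 'REQ-DOG-FOOD', status: 'approved' },
    { id: 'REQ-DRAFT-IDEA', status: 'draft' },
  ],
  [{ id: 'FEATURE-TRACE-EXPLORER-TUI', satisfies: ['REQ-DOG-FOOD'] }],
);
```

In this example `unanswered` contains only REQ-DISCOVERABLE-TRACEABILITY: the draft Requirement is ignored because nobody has approved it yet, and REQ-DOG-FOOD has a satisfier.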
The case from experience
A concrete case from the year before I started this package, on a project I will not name. The team had roughly eighty approved stories in the tracker. The tracker had a report called Stories Without Implementation. It was empty. Everyone was delighted.
The emptiness was a reporting artefact. The tracker's "has implementation" relation was computed by a regex match on commit messages: if any commit mentioned the story ID, the story was flagged as implemented. No check on whether the implementation satisfied the acceptance criteria. No check on whether the commit actually delivered the feature, or just mentioned the ID in a "WIP" context. Thirty-one of the eighty stories had no working implementation by the time the audit ran, and the tracker had been reporting zero gaps for six months.
The DSL's feature-less-Requirement check is the inverse of that regex. It does not ask "does anything mention this Requirement?"; it asks "is there a Feature class whose @Satisfies list includes this Requirement class?". The class has to exist, the import has to resolve, the decorator has to be present at a specific site, and the Feature has to have at least one passing AC. Every one of those conditions is a mechanical check. None of them is a regex. None of them can be gamed by writing the Requirement ID into a commit message.
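The contrast between the two questions can be made concrete. This is an illustrative sketch only: `trackerSaysImplemented` stands in for the tracker's commit-message heuristic as described above, `gateSaysSatisfied` is a hypothetical reduction of the structural check to already-extracted data, and `STORY-42` is an invented ID.

```typescript
interface CommitRecord { message: string }
interface FeatureBinding { id: string; satisfies: string[]; passingAcCount: number }

function escapeForRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// "Does anything mention this ID?" The tracker's heuristic.
function trackerSaysImplemented(storyId: string, commits: CommitRecord[]): boolean {
  const pattern = new RegExp('\\b' + escapeForRegExp(storyId) + '\\b');
  return commits.some(c => pattern.test(c.message));
}

// "Is there a Feature bound to this Requirement with at least one passing AC?"
function gateSaysSatisfied(requirementId: string, features: FeatureBinding[]): boolean {
  return features.some(
    f => f.passingAcCount > 0 && f.satisfies.some(id => id === requirementId),
  );
}

const commits: CommitRecord[] = [{ message: 'WIP: start on STORY-42, nothing works yet' }];
const boundFeatures: FeatureBinding[] = []; // nothing was actually built

const byMention = trackerSaysImplemented('STORY-42', commits);   // true: the ID appears in a message
const byStructure = gateSaysSatisfied('STORY-42', boundFeatures); // false: no Feature is bound
```

The same story is "implemented" under the first question and unsatisfied under the second. The sketch omits the conditions that the class exists and the import resolves, since the compiler enforces those before any scan runs.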
Thirty-one missing implementations reported as zero gaps for six months is, in my view, the shape of what happens when traceability is a report rather than a build condition. The DSL's position is that traceability should be the latter, always.
Why this check is stricter than typed-specs was
Worth noting: typed-specs's compliance scanner had a version of the orphan check, but not the feature-less-Requirement check. The reason is simple — typed-specs did not model Requirements as a distinct type, so it could not check for them. The 1:1 collapse (AP7) meant every Feature was a Requirement, and the question "is there a Feature for every approved Requirement?" reduced to "is there a Feature for every approved Feature?", which is trivially true.
The present DSL's separation of Requirement and Feature makes the check non-trivial. An approved Requirement genuinely can lack a Feature satisfier, and the check genuinely can fail. That asymmetry — a check that is possible only because the two types are distinct — is one of the concrete ways the structural split pays off. Every structural claim in this series has a corresponding concrete benefit, and the feature-less-Requirement check is the benefit of the Requirement-stratum claim.
Anti-pattern 7 — the 1:1 collapse (the typed-specs trap)
The deepest anti-pattern, and the reason this entire series exists. In typed-specs/, the word Requirement appeared in every frontmatter, every title, every folder. But only one type was modelled: Feature. A Feature was a Requirement. One-to-one. Interchangeable.
This works, and works well, up to about twenty Features on a single project. Past that, the shape breaks:
- A Requirement like "the system must be accessible" is not a Feature. It is a cross-cutting constraint that forty Features must jointly satisfy. A 1:1 collapse cannot express this. You end up either (a) duplicating the Requirement into forty Feature-sized copies, or (b) giving up on Requirements at that granularity altogether.
- A Feature like "the TUI traceability explorer" satisfies several Requirements at once, discoverable traceability and dog-food among them. A 1:1 collapse forces a choice. Either you fork the Feature into one copy per Requirement, or you under-report the Requirements it serves.
- The refinement graph disappears. Requirements should compose: a parent REQ-ACCESSIBLE refines into REQ-KEYBOARD-NAV, REQ-SCREEN-READER, REQ-CONTRAST. Features do not compose the same way; they are leaves, not graphs. Collapsing the two erases the composition layer.
The DSL closes this by making Requirement and Feature two distinct abstract classes with an explicit many-to-many relation between them. @Satisfies is a list, not a single value. Requirements have @Refines(ParentReq) that Features do not. The five axes of the level-up from typed-specs to @frenchexdev/requirements — covered in Chapter 00, summarised in Chapter 01 — all rest on this single structural split.
The cost is one extra abstract class. The benefit is the ability to express everything the 1:1 collapse cannot: cross-cutting constraints, many-to-many satisfaction, and the refinement graph. Every subsequent chapter of this series is, in some form, a consequence of that split.
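To make the split tangible, here is a sketch of one question that only exists once Requirements form a graph and satisfaction is many-to-many: which Features serve a parent Requirement through any of its refinements? Everything here is hypothetical. `RequirementNode`, `FeatureNode`, `descendantsOf`, and `featuresServing` are invented names, the Feature IDs are illustrative, and whether the package rolls satisfaction up the refinement graph this way is not stated in this chapter.

```typescript
interface RequirementNode { id: string; refines?: string } // refines = parent Requirement ID
interface FeatureNode { id: string; satisfies: string[] }  // a list, not a single value

// Collect a Requirement and everything that refines it, directly or transitively.
function descendantsOf(rootId: string, reqs: RequirementNode[]): string[] {
  const found: string[] = [rootId];
  // Breadth-first walk; the indexOf guard also protects against cycles.
  for (let i = 0; i < found.length; i++) {
    for (const r of reqs) {
      if (r.refines === found[i] && found.indexOf(r.id) === -1) found.push(r.id);
    }
  }
  return found;
}

function featuresServing(
  rootId: string,
  reqs: RequirementNode[],
  features: FeatureNode[],
): string[] {
  const family = descendantsOf(rootId, reqs);
  return features
    .filter(f => f.satisfies.some(id => family.indexOf(id) !== -1))
    .map(f => f.id);
}

const reqGraph: RequirementNode[] = [
  { id: 'REQ-ACCESSIBLE' },
  { id: 'REQ-KEYBOARD-NAV', refines: 'REQ-ACCESSIBLE' },
  { id: 'REQ-SCREEN-READER', refines: 'REQ-ACCESSIBLE' },
  { id: 'REQ-CONTRAST', refines: 'REQ-ACCESSIBLE' },
  { id: 'REQ-DOG-FOOD' },
];

const featureSet: FeatureNode[] = [
  { id: 'FEATURE-FORM-SCREEN-READER', satisfies: ['REQ-SCREEN-READER'] },
  { id: 'FEATURE-KEYBOARD-SHORTCUTS', satisfies: ['REQ-KEYBOARD-NAV', 'REQ-DOG-FOOD'] },
  { id: 'FEATURE-EXPORT', satisfies: ['REQ-DOG-FOOD'] },
];

const accessibilityFeatures = featuresServing('REQ-ACCESSIBLE', reqGraph, featureSet);
```

Under a 1:1 model neither `refines` nor a multi-element `satisfies` has anywhere to live, so this query cannot even be posed.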
The collapse in its most seductive form
There is a version of the 1:1 collapse that is particularly hard to resist, because it presents itself as elegance. The argument runs: why have two types when one would do? A Feature IS a Requirement in the sense that every Feature delivers a Requirement. Collapse them, simplify the model.
This is the argument typed-specs made, implicitly, by only modelling Feature. It is the argument most lightweight traceability tools make, and it is wrong for a specific reason: it conflates the grain at which the business asks with the grain at which the engineering delivers. Those grains differ.
The business asks at a cross-cutting level: the system must be accessible, data must not leak across tenants, the UI must respond within 200ms. Those are Requirements. The engineering delivers at a capability level: screen-reader support for the form component, row-level tenant isolation in the ORM, a render budget on the dashboard. Those are Features. A single Requirement typically requires multiple Features. A single Feature typically contributes to multiple Requirements. The grains differ because the stakeholders differ. Conflating them forces one of the two to distort into the other's shape, and the distortion is never clean.
The 1:1 collapse resolves the tension by pretending it does not exist. The resolution works locally — a single Feature really does satisfy a single Requirement at the smallest scale — and fails globally as soon as either side has any structure the other lacks. Cross-cutting Requirements have structure Features lack. Refinement-composed Requirements have structure Features lack. Many-to-many satisfaction has structure the 1:1 relation lacks. Each of these, missing, forces a distortion.
The DSL's decision to separate the two is not a rejection of minimalism for its own sake. It is the recognition that the two grains are genuinely different, and that a model that honours the difference will carry more information than one that collapses it. Every time I have been tempted to re-collapse (and I have been tempted more than once, because the simpler model looks cleaner on a whiteboard), the first real acceptance criterion that crosses Feature boundaries has rescued the decision.
The concrete case against the collapse, from this repo
A specific example from the present repo: REQ-DOG-FOOD is a cross-cutting Requirement. It says the package must be tested with itself, with zero describe / it. This single Requirement is satisfied jointly by — at last count — every test class in test/features/, every test class in test/integration/, and the entire AST scanner under src/scan/. No single Feature satisfies it. Every Feature contributes to it, partially.
Under a 1:1 collapse, REQ-DOG-FOOD would have to be either (a) duplicated into one copy per Feature, which would destroy its cross-cutting character by splitting it, or (b) dropped from the Requirement corpus, which would mean the package's most important structural invariant has no formal representation.
Neither option is acceptable. The first produces twenty-five feature-scoped copies of the same invariant, each of which drifts independently. The second means the dog-food invariant is invisible to the compliance report, which is especially ironic given that the invariant is that the report exists and is dog-fooded.
The separate-type model handles this cleanly: REQ-DOG-FOOD is one Requirement, declared once. It is satisfied jointly by @Satisfies(ReqDogFoodRequirement) on every Feature that contributes to it. The compliance report's Requirement → Features view shows a single Requirement with twenty-five-plus Feature satisfiers. The cross-cutting shape is preserved. The information is whole.
This is the kind of example that is easy to miss when the Feature set is small. On a five-Feature project, every Requirement happens to be 1:1, and the collapse feels harmless. The moment the project acquires a genuine cross-cutting Requirement — accessibility, performance, security, dog-food — the collapse breaks. Every project I have been on eventually acquires such a Requirement. Every one. The 1:1 collapse is therefore not a valid simplification for any project that will outlive its MVP.
The symmetry with typed-specs, read as a before-picture
It is worth rereading typed-specs one more time, at this stage of the series, with the 1:1 collapse in mind. The things typed-specs could not express are visible in hindsight as the shape of the absent type:
- Cross-cutting Requirements like "the system must handle 10,000 concurrent users" have no home in typed-specs. They live in Feature-shaped fragments or not at all.
- Refinement relations like "keyboard navigation refines accessibility" have no home. Features are leaves; they do not refine each other in the Requirement sense.
- Many-to-many satisfaction has no home. A Feature typed-specs-style satisfies exactly one thing: itself.
- Style plurality — the five rhetorical registers for a single Requirement — has no home. Features have one description, not five.
Every one of these absences is the same absence: the Requirement type. The present DSL introduces it, and every chapter from 01 onward is, in some form, exploring what that one type makes possible. This chapter's contribution is to name the absences explicitly, as the AP7 trap, and to give them the same structural status as the other six anti-patterns. The seven together form the negative image of what the DSL positively models.
When NOT to use this DSL
The seven anti-patterns above are the case for the DSL. The case against is just as important. I have seen developers adopt traceability tooling out of guilt — because they read a blog post, felt the absence, and installed the tool — and then resent it six weeks later because the tool's ceremony outran the project's complexity. That resentment is a real signal, and the honest response is a list of contexts in which the DSL is the wrong answer.
Small CLI tools
A two-hundred-line CLI that wraps a single curl call does not need a Requirement stratum. The entire Feature surface is one command; the acceptance criterion is "does the command exit zero when the API returns 200". Writing a Requirement class, a Feature class, a @Satisfies decorator, and a compliance gate for this is ceremony without benefit. A one-line shell script with an assertion is the right tool.
The rough heuristic I use: if the Feature list is shorter than the scaffolding required to declare it — fewer than roughly five Features, fewer than roughly fifteen total acceptance criteria — the DSL is over-engineered. The inflection point is not sharp, but below it the ratio is wrong.
A second signal: if the team can keep the entire Feature list in one person's head without writing it down, the list is short enough not to need the DSL. The DSL's value kicks in when the list outgrows human memory — when a developer joining the team in month six genuinely cannot recover the shape of the spec without reading the Requirement classes. Below that threshold, the Requirement classes are ceremony the team must maintain with no corresponding reader. Ceremony without reader is pure cost.
A third signal, related but distinct: if there is no reviewer, no stakeholder, no future-self who will ever read the compliance report, the report has no value. The DSL produces reports for an audience. If the audience is empty — a solo hobbyist with no external accountability, a one-week hackathon with no post-mortem — the report is performative. Performative compliance is the exact thing the DSL is meant to resist, and declining to adopt it in contexts where it would be performative is part of being honest about its purpose.
A concrete example of a project where I correctly chose not to adopt the DSL: a single-purpose static-site generator for a friend's portfolio. Three routes, four components, twelve total lines of business logic. The acceptance criteria were effectively "the pages render, the links work, the build deploys". I wrote three Playwright tests, committed, deployed, and moved on. No Requirement classes, no Feature declarations, no compliance gate. The project is still live three years later, has never regressed, and required approximately zero maintenance. The DSL would have added weeks of scaffolding to a project that lived on its initial simplicity.
I mention this to counter a possible misreading: the author of this DSL uses it on everything he writes. I do not. I use it on projects where the complexity and longevity justify it. The portfolio site did not. The @frenchexdev monorepo, the CMF requirements DSL, the metacratie compiler project all do. The line between "justifies the DSL" and "does not" is a judgement call, but it is a genuine one, and I make it more often on the "does not" side than this chapter's framing might imply.
Throwaway scripts
A data-migration script that runs once, then gets deleted, should not be gated by a compliance report. The script's acceptance criterion is "the migration completed without errors on the production data". That is a single runtime check, not a type-level structure. Adding a Feature class around it adds weeks of maintenance to something that will live for days.
The same applies to exploratory notebooks, spike branches, and PoCs whose only deliverable is a decision ("does this library work for our case?"). The cost of declaring a Feature class is justified when the Feature will be referenced repeatedly — in tests, in compliance reports, in later refinement. A spike that answers a yes/no question and is discarded will never be referenced again. The DSL has no purchase on it.
The subtlety: a spike that survives should acquire a Feature class the moment it graduates from spike to kept code. The transition is a real engineering event — usually a PR titled "promote spike to feature" — and the DSL's adoption at that moment is correctly timed. The cost is paid once, at the moment the code becomes worth declaring. Paying it before is premature; paying it after is delayed, but the DSL does not prevent either.
The shape I have settled on, across a few projects: spike branches use plain tests or even no tests at all, and Feature declaration is a deliberate act at the point of merge. This respects both grains — exploratory code moves fast and has no scaffolding, kept code moves carefully and has full declaration — without forcing one grain to distort into the other's shape.
UI-only apps with no invariants
An application whose value is purely aesthetic — a marketing site, a static landing page, a visual explainer — has no invariants for the DSL to track. There is no acceptance criterion that compiles; there is a design comp, a stakeholder's approval, a Lighthouse score. Visual regression testing, accessibility audits, and performance budgets are the right tools. @FeatureTest on a React component that renders a hero section is decoration, not traceability.
The sharper distinction: if the Feature has no verifiable behavioural acceptance criterion, only a visual one, the DSL is the wrong shape. The ACResult type is a boolean-ish structured result. Visual acceptance is a human judgement, possibly automated via pixel-diff, but not expressible as a typed AC method.
A mixed case is common: a product that has behavioural invariants and visual ones. A data dashboard, for example, has behavioural ACs ("filters correctly cross-reduce", "drill-downs preserve state") and visual ACs ("the chart looks right at 1280px width"). The DSL's position on this is pragmatic: model the behavioural side, leave the visual side to visual-regression tools, and do not try to force the visual side into the type system. The Feature declaration covers what it can; a separate Playwright visual-snapshot harness covers what it cannot. Two tools, clear division of labour, no forced uniformity.
This matters because the temptation to force uniformity — "every acceptance criterion must be in the DSL" — produces one of two failures. Either the DSL acquires ad-hoc extensions for visual acceptance that never quite work (pixel-diff in a typed AC is awkward at best), or visual ACs are declared as typed ACs and their implementations are no-op passes that report success regardless of actual visual drift. Both failures are worse than accepting the division of labour explicitly.
Pre-existing heavy Jira (or Rally, or Azure DevOps) investment
A team that has ten years of tickets in Jira, with epics, stories, sub-tasks, custom fields, automations, and executive dashboards all wired into that tracker, cannot cleanly migrate to an in-repo Requirement DSL. The DSL's gate is that every Requirement be declared as a TypeScript class in the repo. If your Requirements are in Jira, and will remain in Jira because the CFO reads the Jira dashboard, the DSL becomes a second source of truth, and second sources of truth rot.
There are integration paths — a Jira importer that materialises approved issues as Requirement classes, a webhook that fails the Jira state transition unless the corresponding class exists in the repo — but those are multi-month engineering investments, not a pnpm install. Until that integration is built, the DSL and Jira are rivals, and Jira wins on institutional inertia.
The honest advice for this context: either commit to the migration (scope it as a six-month programme, plan the Jira cutover), or do not adopt the DSL at all. A half-migration is worse than either pole.
I say this from experience. On one project, an enthusiastic developer installed a traceability tool and began declaring Requirements in code. Jira continued to be the official tracker. For three months, the two systems ran in parallel. The developer's Requirements drifted from Jira's as priorities shifted in the tracker but not in the code. The compliance report stopped being useful because it reported against a snapshot of Jira that was months stale. The developer left the team; the Requirements in code were quietly deleted at the first refactor, because nobody else was invested in them. The cost was real, the benefit evaporated, and the institutional memory recorded "traceability tools do not work here" — which was exactly the wrong conclusion, but a conclusion reached honestly from the evidence.
The DSL's position on this context is not to compete with Jira. It is to be silent about Jira. If Jira is the source of truth, stay with Jira, or migrate fully. Do not install the DSL as a second competing source. The institutional lesson from a failed half-migration is worse than the cost of living without the DSL.
Teams unwilling to dog-food
The DSL's case for itself rests on meta-circularity. REQ-DOG-FOOD says: the package must be tested with itself; zero describe, zero it. A team that reads this as "cute gimmick" rather than "structural discipline" will not get the benefit. They will use the decorators as annotations, ignore the compliance gate, and describe the result as "bureaucracy" when the first false positive arrives.
The DSL's value compounds with use. Every Requirement declared, every Feature bound, every compliance report read strengthens the traceability graph. A team that declines to engage with the graph, treating the decorators as documentation rather than types, will accumulate the cost without accumulating the benefit.
This is the hardest of the five to diagnose before adoption. My best proxy: if the team cannot articulate, in their own words, why dog-fooding matters — not the abstract principle, but the specific decisions it forces — they are probably not ready. A six-week trial on one feature area, with no commitment to expansion, is a safer starting position than a project-wide rollout.
The trial's success criteria should be explicit from the outset: at the end of six weeks, is the team consulting the compliance report before PRs land? Have they disagreed with the report in a way that led to a Requirement being refined rather than the gate being disabled? Has someone — anyone — cited a Requirement class by name in a design discussion? If the answers are yes, the trial has surfaced real adoption; if not, the DSL has not taken root, and forcing it further will produce the resentment described above. The trial is a measurement, not a proof of concept. Treat it as one.
The inverse question — how do you know the DSL has taken root? — has a sharp answer: when team members start proposing new Requirements before writing Features, rather than declaring Requirements retroactively to justify already-built Features. That reversal of causality is the marker of adoption. It is also, not coincidentally, the marker of the DSL being used for its intended purpose rather than as after-the-fact documentation.
Summary — the five contexts
| Context | Why the DSL is wrong | Right tool |
|---|---|---|
| Small CLI (< 5 Features) | Ceremony outruns content | Plain tests, assertion library |
| Throwaway scripts | Lives shorter than the scaffolding | Runtime check, no declaration |
| UI-only, no invariants | ACs are visual, not behavioural | Visual regression, a11y audit |
| Heavy Jira investment | Second source of truth | Jira + bespoke integration, or commit to migration |
| Team unwilling to dog-food | Cost accrues, benefit does not | Lighter annotation system, or none |
None of these is a failure of the DSL. They are the shape of its boundary. A tool that claims to solve every context is a tool whose boundary is hidden; knowing where it stops is part of knowing what it is.
What makes a context a good fit, positively stated
To close the boundary question symmetrically, the positive characterisation is worth naming. The DSL fits when:
- The Feature list is large enough that memory is not sufficient. Roughly ten Features upward, more than thirty total acceptance criteria, enough cross-cutting concerns that at least one Requirement covers multiple Features.
- The code is kept, not thrown away. The project has a roadmap measured in months or years, not days. The Requirement classes are worth their maintenance cost because the corpus will be referenced repeatedly.
- The acceptance criteria are behavioural. Typed method signatures, returning ACResult, are the right shape for what the product promises. Visual, aesthetic, or judgement-based criteria are handled by complementary tools, but the DSL's grain is behavioural.
- The team will dog-food. The compliance report is read. Gates that fail are investigated, not disabled. New Requirements precede new Features more often than not.
- There is no competing tracker with higher institutional weight. Either there is no Jira, or Jira is understood to be operational (sprint boards, standups) rather than architectural (the source of truth for what the product promises).
When at least four of these five hold, the DSL's cost-benefit ratio is favourable. When three hold, the ratio is marginal and a trial is prudent. When two or fewer hold, the DSL is probably the wrong tool, and one of the five contexts above is the right frame for the conversation.
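The counting rule is simple enough to write down. This is a toy encoding of the heuristic, not a tool the package ships: `FitSignals`, `FitVerdict`, and `assessFit` are names invented for the sketch, and each boolean is a judgement call the team makes, not something a script can measure.

```typescript
// One flag per fit condition from the list above (names are illustrative).
interface FitSignals {
  featureListOutgrowsMemory: boolean;
  codeIsKept: boolean;
  criteriaAreBehavioural: boolean;
  teamWillDogFood: boolean;
  noCompetingTracker: boolean;
}

type FitVerdict = 'favourable' | 'marginal' | 'probably-wrong-tool';

function assessFit(s: FitSignals): FitVerdict {
  const held = [
    s.featureListOutgrowsMemory,
    s.codeIsKept,
    s.criteriaAreBehavioural,
    s.teamWillDogFood,
    s.noCompetingTracker,
  ].filter(flag => flag).length;

  if (held >= 4) return 'favourable'; // four or five conditions hold
  if (held === 3) return 'marginal';  // a scoped trial is prudent
  return 'probably-wrong-tool';       // two or fewer
}
```

The value of writing it out is less the function than the interface: it forces each of the five conditions to be answered explicitly rather than averaged into a general impression.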
The case for adopting incrementally
If the decision is to adopt, the question becomes how aggressively. My recommendation, based on two prior adoptions: start narrow and deepen gradually.
A sensible starting scope is one feature area — a single subsystem, roughly five to ten Features — with full declaration: Requirements, Features, @Satisfies, @FeatureTest, @Verifies, compliance gate. The rest of the codebase remains on the existing testing regime. This is a self-contained beachhead: the DSL's value is visible within the chosen area, and the wider codebase is not disrupted.
After six to eight weeks, the beachhead will have produced signal: either the team is reaching for the DSL's affordances (looking at the compliance report, declaring Requirements before Features, refining the refinement graph), or they are treating it as imposed scaffolding (writing the decorators because the gate requires them, but not consulting the report). The first signal justifies expansion; the second signals that the DSL is not taking root, and the beachhead should be wound down.
Assuming expansion, the next scope is a related feature area — one that shares Requirements with the beachhead, so the many-to-many relation comes into play. This is where the DSL's distinctive value over a 1:1 tracker becomes visible: a Requirement that was satisfied by one Feature in the beachhead is now satisfied by two, in different areas, and the bidirectional registry shows both. That affordance is what the DSL has that string-based trackers lack. Seeing it in practice, on real work, is often the moment adopters understand why the many-to-many split matters.
From there, expansion continues at the team's pace. Some teams reach full coverage within two quarters. Some plateau at 60-70% coverage of the codebase, with carved-out exceptions for lower-value areas. Both outcomes are fine. The DSL does not require totality to be useful; partial adoption, correctly scoped, delivers most of the value with a fraction of the effort.
Running-example recap
The running example, as ever, is FEATURE-TRACE-EXPLORER-TUI. Under a describe/it regime, it would have exhibited at least three of the seven anti-patterns at once:
- AP1 (drift). The describe('trace explorer', () => ...) block would have drifted from the Feature's actual name as the TUI design evolved. It did evolve (the Feature was called trace browser for a week before renaming), and a string-based title would have survived unchanged.
- AP3 (orphan). Three of the integration ACs (traceExplorerUsesFileSystemPortForDiscovery, traceExplorerUsesPromptPortForInteraction, endToEndNavigatesReqToFeatToAcToTest) landed in separate test files during early development. Without @FeatureTest(FeatureTraceExplorerTuiFeature) at the top of each, they would have been orphans: tests that ran, passed, and claimed no Feature coverage.
- AP7 (1:1 collapse). Most critically, the Feature satisfies three Requirements: REQ-DISCOVERABLE-TRACEABILITY, REQ-DOG-FOOD, and REQ-PARALLEL-DELIVERABLE. In a typed-specs 1:1 regime, two of those three would have been lost. The Feature would have been named for one Requirement (probably discoverability, since that is the user-visible one), and the dog-food and parallel-deliverable Requirements would have been either undeclared or duplicated into Feature-shaped copies.
The present DSL rules out all three at the type level. @FeatureTest(FeatureTraceExplorerTuiFeature) is a class reference; no drift. Every test method is in a class that declares @FeatureTest; no orphans. @Satisfies is a variadic list of Requirement classes; no collapse. The Feature can claim three Requirements, and the compliance report indexes all three bidirectionally.
What the example would have looked like under each anti-pattern
To make the contrast sharper, here is the same Feature as it would have been written in each of the failure regimes:
Under AP1 (drift) — a describe/it shape:
describe('trace explorer', () => {
  describe('navigation', () => {
    it('builds the graph', () => { /* ... */ });
    it('handles arrow keys', () => { /* ... */ });
    it('drills down', () => { /* ... */ });
    it('opens help', () => { /* ... */ });
    it('jumps back', () => { /* ... */ });
  });
  // ...
});
No reference to any Feature. The word "trace explorer" is a string. Rename the Feature to "trace browser" and this file keeps the old name forever.
Under AP2 (string-magic) — a decorator but with string args:
@FeatureTest('FEATURE-TRACE-EXPLORER-TUI')
class TraceExplorerTest {
  @Verifies('traceExplorerBuildsGraph')
  traceExplorerBuildsGraph() { /* ... */ }
  // ...
}
The Feature ID is a string; rename the Feature and nothing catches it. The AC name is a string; rename the method and nothing catches it. Two drift vectors, both silent.
Under AP7 (1:1 collapse) — typed-specs shape:
@FeatureTest(FeatureTraceExplorerTuiFeature)
class FeatureTraceExplorerTuiTest extends FeatureTraceExplorerTuiFeature {
  @Verifies<FeatureTraceExplorerTuiFeature>('traceExplorerBuildsGraph')
  traceExplorerBuildsGraph() { /* ... */ }
  // ...
}
The test side is clean; typed-specs got this right. But the Feature side has no @Satisfies decorator, because Requirements do not exist as a separate type. The three Requirements FEATURE-TRACE-EXPLORER-TUI satisfies (discoverability, dog-food, parallel-deliverable) have no formal home. They live in prose, in commit messages, in frontmatter tags. They do not live in the type system.
Under the present DSL:
@Satisfies(
  ReqDiscoverableTraceabilityRequirement,
  ReqDogFoodRequirement,
  ReqParallelDeliverableRequirement,
)
export abstract class FeatureTraceExplorerTuiFeature extends Feature { /* ... */ }

@FeatureTest(FeatureTraceExplorerTuiFeature)
export class FeatureTraceExplorerTuiTest extends FeatureTraceExplorerTuiFeature {
  @Verifies<FeatureTraceExplorerTuiFeature>('traceExplorerBuildsGraph')
  traceExplorerBuildsGraph() { /* ... */ }
  // ...
}
Every binding is a class reference or a keyof T string. Every drift vector from the earlier regimes is closed by a type. The three Requirements are surfaced, indexed, and available to the compliance report. This is the same Feature, written in the shape the DSL permits, with no anti-pattern.
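To see why a keyof T string is not the same thing as a plain string, here is a minimal sketch of the constraint in isolation. The `verifies` function is a hypothetical stand-in for the @Verifies decorator, written as an ordinary function so it runs without decorator support; `TraceExplorerFeature` and its second AC are invented for the illustration.

```typescript
// Hypothetical Feature with two acceptance criteria declared as abstract methods.
abstract class TraceExplorerFeature {
  abstract traceExplorerBuildsGraph(): void;
  abstract traceExplorerHandlesArrowKeys(): void;
}

// Stand-in for @Verifies: the AC name must be a key of T, checked at compile time.
function verifies<T>(acName: keyof T & string): keyof T & string {
  return acName;
}

// Accepted: the string is a member of keyof TraceExplorerFeature.
const boundAc = verifies<TraceExplorerFeature>('traceExplorerBuildsGraph');

// Rejected by the type checker: the typo is not a key of the Feature.
// @ts-expect-error 'traceExplorerBuildGraph' is not assignable to keyof TraceExplorerFeature
const rejectedAc = verifies<TraceExplorerFeature>('traceExplorerBuildGraph');

console.log(boundAc, rejectedAc);
```

The `@ts-expect-error` comment is the evidence here: it compiles only because the line beneath it is a type error. Rename the method on the Feature and every stale literal becomes the same kind of error, which is the drift vector that the AP2 shape leaves open.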
The progression from the describe/it version to the present-DSL version is, roughly, the progression of the whole series. Each regime closes one or two anti-patterns while leaving others open. The present DSL is the first regime in this progression that closes all seven. That is the claim the chapter is meant to make concrete.
What remains open — and why the DSL does not close it
The seven anti-patterns covered here are the ones the DSL closes by type. There are adjacent failure modes the DSL does not close, and being explicit about those is part of the honest accounting.
The DSL does not close semantic drift — the case where the Feature's AC method is named correctly, the binding is correct, the test passes, but the test's assertions do not actually verify what the AC claims. If traceExplorerBuildsGraph() asserts only that the function returns without throwing, the binding is valid, the name matches, the type system is satisfied, and the AC is still not verified in any meaningful sense. This is a failure of the test's internal content, not of its structural binding. The DSL can ensure that a test exists for every AC; it cannot ensure that the test actually verifies the AC. That remains a human reviewer's responsibility.
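A small sketch makes the gap concrete. Everything in it is hypothetical (the `TraceGraph` shape, both builders, both checks); the point is only that a structurally valid test can be vacuous, and nothing in a binding distinguishes it from a meaningful one.

```typescript
// Hypothetical graph shape and builder signature; not part of the package.
type TraceGraph = { nodes: string[]; edges: Array<[string, string]> };
type GraphBuilder = (links: Array<[string, string]>) => TraceGraph;

// A correct builder: collects the distinct endpoints of every link.
const correctBuilder: GraphBuilder = (links) => {
  const nodes: string[] = [];
  for (const [from, to] of links) {
    for (const id of [from, to]) {
      if (nodes.indexOf(id) === -1) nodes.push(id);
    }
  }
  return { nodes, edges: links };
};

// A deliberately broken builder: returns an empty graph, but never throws.
const brokenBuilder: GraphBuilder = () => ({ nodes: [], edges: [] });

// Vacuous verification: passes as long as the call does not throw.
function vacuousCheck(build: GraphBuilder): boolean {
  try {
    build([['REQ-DOG-FOOD', 'FEATURE-TRACE-EXPLORER-TUI']]);
    return true;
  } catch {
    return false;
  }
}

// Meaningful verification: asserts on the content the AC actually claims.
function meaningfulCheck(build: GraphBuilder): boolean {
  const graph = build([['REQ-DOG-FOOD', 'FEATURE-TRACE-EXPLORER-TUI']]);
  return graph.nodes.length === 2 && graph.edges.length === 1;
}

console.log(vacuousCheck(brokenBuilder));    // true: the broken builder slips through
console.log(meaningfulCheck(brokenBuilder)); // false: the content assertion catches it
console.log(meaningfulCheck(correctBuilder)); // true
```

Both checks could sit behind the same AC binding and both would satisfy a structural gate. Only review of the assertions tells them apart.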
The DSL does not close Requirement-content drift — the case where the Requirement class exists, its refinements are correct, its Features satisfy it, but the Requirement's description does not match what the business actually asked for. The DSL can enforce that the Requirement is formally wired in; it cannot enforce that the Requirement's description is faithful to the stakeholder's intent. That remains a conversation between the engineer who wrote the description and the stakeholder who approved it.
The DSL does not close priority drift — the case where a Feature's priority field is set to Priority.Low but the business considers it critical, or vice versa. The field is a value; the DSL records it but does not validate it against any external source. Priority-accuracy is, like Requirement-content, a conversational artefact that lives outside the type system.
Naming these adjacencies matters because otherwise the DSL's claims read as larger than they are. The DSL closes seven specific anti-patterns by type. It does not claim to close every failure mode in traceability. The residual failure modes — semantic drift, Requirement-content drift, priority drift — remain human-shaped problems that the DSL's structural fixes do not reach. Honesty about this is part of what makes the claims about the seven credible.
This is the shape of the case I have been making across nineteen chapters. The DSL's constructive side — what it lets you model — is well-covered in Chapters 00 through 18. This chapter's destructive side — what it refuses to let you model — is, in some sense, the same claim read from the other direction. A tool is known by its boundary. The seven anti-patterns here, plus the five contexts in which the DSL itself is the wrong answer, are that boundary.
A closing note on the seven-and-five structure
The chapter's structure — seven anti-patterns, five mechanisms, five wrong-context caveats — is not an accident of presentation. The seven anti-patterns are the ones a reader is likely to have encountered in the field; naming them gives the DSL specific traction. The five mechanisms are the abstract categories the anti-patterns fall into; naming them gives the reader a generalisable lens that outlasts any specific tool. The five wrong-context caveats are the places where the DSL's benefits do not exceed its costs; naming them prevents the chapter from reading as a pitch.
Seven anti-patterns plus five caveats is twelve human-readable items; the five mechanisms are the lens that groups them rather than further items to retain. Twelve is, I think, roughly the upper limit of what a reader can retain from a single chapter without a notebook. The chapter's job is to seat all twelve firmly enough that the reader can recognise each one in the wild, which is why each item has a dedicated sub-section rather than being compressed into a table. Tables are for reference; prose is for recognition. This chapter aims at recognition.
One last observation, on what the boundary says about the DSL
The seven anti-patterns have a common shape. Each of them is a place where strings should have been types, or where implicit policies should have been explicit checks, or where silent defaults should have been surfaced states. The DSL's pattern of response is, in every case, the same: push the check earlier (from runtime to compile-time, from report-time to build-time, from policy to type), and make the check mechanical rather than judgemental.
This is not an accident of this particular package. It is the general shape of the argument for any DSL over ad-hoc annotation. The DSL earns its weight when the weight it adds to the declaration is less than the weight it subtracts from the checking. In describe/it land, every rename is a judgement call, every test file's binding is a policy, every compliance report is a cross-reference that someone has to trust. In @frenchexdev/requirements land, each of those is a mechanical check the scanner runs in milliseconds. The DSL is heavier on declaration and lighter on verification. The ratio of those two weights is what determines whether the DSL is worth adopting — and the ratio improves with scale.
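As an illustration of what "mechanical" means here, the following sketch shows a gate pass over already-collected declarations. It is an assumption-laden toy, not the package's scanner: `FeatureDecl`, `GateFindings`, `runGate`, and the ids REQ-UNCLAIMED and FEATURE-UNBOUND are hypothetical, and the string ids stand for what a scanner would emit after resolving class references.

```typescript
// Hypothetical record of one declared Feature, as a scanner might emit it.
interface FeatureDecl {
  id: string;
  satisfies: string[]; // ids of the Requirements the Feature claims
}

// Hypothetical findings shape: one list per policy the gate enforces.
interface GateFindings {
  requirementLessFeatures: string[]; // AP5
  featureLessRequirements: string[]; // AP6
}

// A single linear pass; no judgement calls, only set membership.
function runGate(requirementIds: string[], features: FeatureDecl[]): GateFindings {
  const satisfied = new Set<string>();
  const requirementLessFeatures: string[] = [];

  for (const feature of features) {
    if (feature.satisfies.length === 0) requirementLessFeatures.push(feature.id);
    for (const requirementId of feature.satisfies) satisfied.add(requirementId);
  }

  const featureLessRequirements = requirementIds.filter((id) => !satisfied.has(id));
  return { requirementLessFeatures, featureLessRequirements };
}

const findings = runGate(
  ['REQ-DISCOVERABLE-TRACEABILITY', 'REQ-DOG-FOOD', 'REQ-PARALLEL-DELIVERABLE', 'REQ-UNCLAIMED'],
  [
    {
      id: 'FEATURE-TRACE-EXPLORER-TUI',
      satisfies: ['REQ-DISCOVERABLE-TRACEABILITY', 'REQ-DOG-FOOD', 'REQ-PARALLEL-DELIVERABLE'],
    },
    { id: 'FEATURE-UNBOUND', satisfies: [] },
  ],
);

// A build gate would fail when either list is non-empty.
const gatePasses =
  findings.requirementLessFeatures.length === 0 &&
  findings.featureLessRequirements.length === 0;
console.log(findings, gatePasses);
```

The declaration cost is the `satisfies` list on each Feature; the checking cost is one pass over the declarations. That asymmetry is the ratio the paragraph above refers to.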
Above the threshold where the ratio favours the DSL, the seven anti-patterns disappear one by one as the type signatures are adopted. Below the threshold, the ratio is reversed, and the five contexts above apply. The boundary is not ideological; it is arithmetic. I find that a more honest framing than any abstract pitch for "better traceability", and it is the framing this chapter is meant to leave the reader with.
What the next chapter takes up
Chapter 20, the next in the series, turns from this chapter's retrospective stance to a prospective one: the roadmap for the DSL itself and the wider ecosystem it is intended to fit into. Where this chapter catalogues what the DSL closes today, the next catalogues what it aims to close tomorrow — the Phase 7c work around the trace-explorer TUI, the integration with the wider @frenchexdev monorepo, the eventual cross-language bindings that would let the DSL serve C# and OCaml teams as cleanly as it serves TypeScript teams today.
I mention this because the two chapters are intended as a pair. This chapter's boundary — the seven anti-patterns closed, the five contexts where the DSL does not fit — is not a permanent boundary. The roadmap in the next chapter describes how the boundary is intended to move. Some of the current limitations (the weaker enforcement on copy-paste drift, the absence of semantic-drift detection, the missing integrations with external trackers) are candidates for future closure. Others (the unsuitability for visual-only apps, the performative-compliance concern for hobbyist projects) are intended to remain boundaries permanently.
Knowing which limitations are temporary and which are structural is part of what this series is for. The DSL is not a finished object; it is a point on a trajectory. The next chapter names the trajectory; this one names the current position.
Related Reading
- typed-specs/05-decorators.md: the original keyof T decorator pattern that closes AP2 (string-magic AC refs). The direct ancestor of every type-level fix in this chapter.
- 00-named-but-not-modelled.md: the full diagnostic of AP7 (1:1 collapse), read against the earlier typed-specs corpus. The structural reason this series exists.
- 13-quality-gates-and-compliance.md: the scanner phase that reports AP3 (orphan tests), AP5 (requirement-less Features), and AP6 (feature-less Requirements). The gate that turns each policy into a build condition.
- 07-feature-requirement-many-to-many.md: the registry whose reverse-index (Requirement → Features) makes AP6 detection cheap, and whose forward-index makes AP4 (stale @Satisfies) impossible at compile time.
- 13c-when-compliance-fails-diagnostics-and-recovery.md: what to do when the gate fails on one of these anti-patterns. The operational companion to this chapter's taxonomy.