
Feature Traceability and the Quality Gate Chain

Phase 1 inferred transitions. Phase 2 extracted the full graph. Phase 3 rendered it as interactive SVG. But none of those phases answered the question that a project manager, a product owner, or a future maintainer cares about most: why does this machine exist?

A machine without a purpose is a machine that will rot. Someone will hesitate to delete it because they do not know what it does. Someone else will refactor around it because they cannot tell what depends on it. The machine will accumulate workarounds. Its states will drift from reality. Its tests will be marked skip. Eventually it becomes the code equivalent of a mystery wire behind the wall — nobody dares touch it, nobody knows if it carries current.

The answer is traceability. Every machine must declare what business feature it implements and which acceptance criterion it addresses. That declaration must be verifiable — not by human inspection, but by automated tooling that fails the build when a machine is orphaned.

This part covers the traceability system from declaration to enforcement: FsmFeatureLink in the decorator, the requirements directory with its Feature abstract classes, the audit-fsm-feature-links.ts orphan detector, and the full quality gate chain that runs before every deploy. It then shifts to the testing side: how to test typed events, pure state machines, coordinators, and property-based invariants — all without a browser, all without mocking, all under 98%+ coverage gates.

The @FiniteStateMachine decorator (Part III) carries an optional feature field:

@FiniteStateMachine({
  states: ['idle', 'loading', 'rendering', 'postProcessing', 'done', 'error'] as const,
  events: ['startLoad', 'markRendering', 'markPostProcessing', 'markDone', 'markError'] as const,
  transitions: [
    { from: 'idle',           to: 'loading',        on: 'startLoad' },
    { from: 'loading',        to: 'rendering',      on: 'markRendering' },
    { from: 'rendering',      to: 'postProcessing', on: 'markPostProcessing' },
    { from: 'postProcessing', to: 'done',           on: 'markDone' },
    { from: '*',              to: 'error',           on: 'markError' },
  ],
  emits: ['app-ready', 'toc-headings-rendered'] as const,
  feature: { id: 'PAGE-LOAD', ac: 'fullLifecycle' } as const,
  scope: 'scoped',
})
class PageLoadStateFsm {}

The feature field is an FsmFeatureLink:

export interface FsmFeatureLink {
  readonly id: string;
  readonly ac: string;
}

Two fields. Two strings. But these two strings close the loop between code and requirements:

  • id identifies the Feature. It matches a Feature abstract class in the requirements directory: Feature_PAGE_LOAD. The naming convention is deterministic — replace hyphens with underscores, prefix with Feature_.

  • ac identifies the acceptance criterion within that Feature. It matches an abstract method on the Feature class: abstract fullLifecycle(): void. The compiler ensures the method exists because ac is used as a keyof lookup.

The as const on the feature literal is not decorative. It preserves the string literals 'PAGE-LOAD' and 'fullLifecycle' as literal types rather than widening them to string. The build-time extractor (Part IX) reads these literals from the AST — if they were widened, the extractor would see string and could not resolve the feature reference.

The link says: "This machine exists because the PAGE-LOAD feature requires a fullLifecycle acceptance criterion to be satisfied, and this machine is the implementation of that criterion."

That is a testable claim. The compliance scanner can verify:

  1. That Feature_PAGE_LOAD exists as a class in the requirements directory.
  2. That fullLifecycle exists as an abstract method on that class.
  3. That there is at least one test with a @Verifies annotation pointing to the same feature and AC.
  4. That the test actually exercises the machine's source file.

If any of those four checks fail, the machine's traceability chain is broken. The scanner reports it. The build gate blocks.

As of this writing, 40 of 43 machines carry feature links. The three exceptions are infrastructure machines:

Machine               Why No Feature Link
EventBusFsm           Infrastructure — the bus itself is not a user feature
HotReloadActionsFsm   Developer tooling — only active in dev mode
DevWatcherFsm         Developer tooling — file watcher for hot-reload

These three are exempt by convention. The audit script knows their IDs and skips them. If a fourth machine were added without a feature link and without being listed in the exemption set, the audit would fail.

The Requirements Directory

Feature abstract classes live in requirements/features/. Each class follows a strict pattern:

// requirements/features/page-load.ts

export abstract class Feature_PAGE_LOAD {
  readonly id = 'PAGE-LOAD' as const;

  /** The full page load lifecycle: idle → loading → rendering → postProcessing → done. */
  abstract fullLifecycle(): void;

  /** Recovery from error state via retry. */
  abstract errorRecovery(): void;

  /** Stale generation detection — ignoring results from superseded navigations. */
  abstract staleGenerationGuard(): void;
}

Three things to notice:

1. The class is abstract. It can never be instantiated. It exists purely as a type-level contract — a list of acceptance criteria that must be implemented and verified.

2. Each abstract method is an acceptance criterion. The method name (fullLifecycle, errorRecovery, staleGenerationGuard) is the AC identifier. The JSDoc above it is the AC description. The method signature is always (): void — these methods carry no logic, only identity.

3. The id field matches the feature link. When a machine declares feature: { id: 'PAGE-LOAD', ac: 'fullLifecycle' }, the id resolves to this class and the ac resolves to the fullLifecycle abstract method. The resolution is deterministic: id → Feature_${id.replace(/-/g, '_')} → class lookup → ac → method lookup.
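
The convention can be sketched as a one-line helper — the function name here is hypothetical, but the convention itself (replace hyphens with underscores, prefix with Feature_) is the one described above:

```typescript
// Hypothetical helper illustrating the deterministic id → class-name
// resolution: replace hyphens with underscores, prefix with Feature_.
function featureClassName(id: string): string {
  return `Feature_${id.replace(/-/g, '_')}`;
}

console.log(featureClassName('PAGE-LOAD')); // 'Feature_PAGE_LOAD'
console.log(featureClassName('ACCENT'));    // 'Feature_ACCENT'
```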

The Compiler as First Validator

The type system does not directly validate the feature field in the decorator — the field uses string types (see Part III for why). But the Feature abstract class itself is validated by the compiler:

// This compiles — but notice there is no AC for staleGenerationGuard.
// The compiler does not complain, because this is an abstract class,
// not an implementation of one. The gap is caught by the compliance scanner.
export abstract class Feature_PAGE_LOAD {
  readonly id = 'PAGE-LOAD' as const;
  abstract fullLifecycle(): void;
  abstract errorRecovery(): void;
}

The compiler ensures that every AC name used in @Verifies<Feature_PAGE_LOAD>('fullLifecycle') is a real method on the class — TypeScript rejects @Verifies<Feature_PAGE_LOAD>('nonexistent') because 'nonexistent' is not keyof Feature_PAGE_LOAD. This is the first validation layer: the type system catches AC name typos at compile time.
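The project reads @Verifies from JSDoc, but the same keyof guarantee can be sketched with an ordinary generic function — a hypothetical stand-in to illustrate the compile-time check, not the project's API:

```typescript
// Mapped type that keeps only the (): void methods — the AC identifiers —
// and drops non-method fields like `id`.
type AcOf<F> = { [K in keyof F]: F[K] extends () => void ? K : never }[keyof F];

abstract class Feature_PAGE_LOAD {
  readonly id = 'PAGE-LOAD' as const;
  abstract fullLifecycle(): void;
  abstract errorRecovery(): void;
}

// Hypothetical function-form stand-in for the @Verifies annotation.
function verifies<F>(ac: AcOf<F>): AcOf<F> {
  return ac;
}

const ok = verifies<Feature_PAGE_LOAD>('fullLifecycle'); // compiles
// verifies<Feature_PAGE_LOAD>('nonexistent');           // type error
```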

Feature Directory Structure

requirements/
  features/
    page-load.ts          → Feature_PAGE_LOAD
    accent-palette.ts     → Feature_ACCENT_PALETTE
    terminal-dots.ts      → Feature_TERMINAL_DOTS
    tour.ts               → Feature_TOUR
    spa-navigation.ts     → Feature_SPA_NAVIGATION
    scroll-spy.ts         → Feature_SCROLL_SPY
    toc-breadcrumb.ts     → Feature_TOC_BREADCRUMB
    sidebar.ts            → Feature_SIDEBAR
    theme.ts              → Feature_THEME
    topbar-search.ts      → Feature_TOPBAR_SEARCH
    copy-feedback.ts      → Feature_COPY_FEEDBACK
    mermaid-render.ts     → Feature_MERMAID_RENDER
    ...                   → (23 feature classes total)
  index.ts                → re-exports all features

Each file contains exactly one Feature abstract class. The index.ts barrel re-exports them all. The audit script imports from this barrel and uses reflection to enumerate all Feature classes.
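A sketch of the class-enumeration half, with the barrel simulated inline. Note that TypeScript erases abstract methods at runtime, so the class names are enumerable this way but the AC method names must come from elsewhere (the AST or a generated manifest); the stand-in classes and their ACs here are assumptions:

```typescript
// Two stand-in Feature classes simulating the requirements barrel.
abstract class Feature_PAGE_LOAD {
  readonly id = 'PAGE-LOAD' as const;
  abstract fullLifecycle(): void;
}
abstract class Feature_THEME {
  readonly id = 'THEME' as const;
  abstract toggleDarkMode(): void; // hypothetical AC
}

const barrel: Record<string, unknown> = { Feature_PAGE_LOAD, Feature_THEME };

// Enumerate everything that follows the Feature_* naming convention.
const featureClassNames = Object.keys(barrel).filter(
  (name) => name.startsWith('Feature_') && typeof barrel[name] === 'function',
);

console.log(featureClassNames); // ['Feature_PAGE_LOAD', 'Feature_THEME']
```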

Concrete Chain: The ACCENT Feature

Abstract explanations are useful. Concrete examples are better. Here is the full traceability chain for the accent palette feature, from the Feature class to the test that closes the loop.

Step 1: Feature Class

// requirements/features/accent-palette.ts

export abstract class Feature_ACCENT_PALETTE {
  readonly id = 'ACCENT' as const;

  /** Right-clicking the theme toggle opens the accent palette. */
  abstract rightClickOpensPalette(): void;

  /** Selecting a swatch applies the accent color and closes the palette. */
  abstract swatchAppliesAccent(): void;

  /** Clicking outside the palette closes it. */
  abstract outsideClickCloses(): void;
}

Three acceptance criteria. Three abstract methods.

Step 2: Machine Decorator

// src/lib/accent-palette-state.ts

@FiniteStateMachine({
  states: ['closed', 'open', 'applying'] as const,
  events: ['openPalette', 'selectSwatch', 'applyDone', 'closePalette'] as const,
  transitions: [
    { from: 'closed',   to: 'open',     on: 'openPalette' },
    { from: 'open',     to: 'applying', on: 'selectSwatch' },
    { from: 'applying', to: 'closed',   on: 'applyDone' },
    { from: 'open',     to: 'closed',   on: 'closePalette' },
  ],
  feature: { id: 'ACCENT', ac: 'rightClickOpensPalette' } as const,
  scope: 'singleton',
})
class AccentPaletteStateFsm {}

The machine links to ACCENT / rightClickOpensPalette. This means: "The accent palette state machine exists because the rightClickOpensPalette acceptance criterion requires it."

Step 3: Test with @Verifies

// test/unit/accent-palette-state.test.ts

import { Feature_ACCENT_PALETTE } from '../../requirements/features/accent-palette';

describe('AccentPaletteState', () => {
  /** @Verifies<Feature_ACCENT_PALETTE>('rightClickOpensPalette') */
  it('opens the palette on right-click trigger', () => {
    const machine = createAccentPaletteState({ onOpen: vi.fn(), onClose: vi.fn() });
    machine.openPalette();
    expect(machine.state()).toBe('open');
  });

  /** @Verifies<Feature_ACCENT_PALETTE>('swatchAppliesAccent') */
  it('applies accent color on swatch selection', () => {
    const onApply = vi.fn();
    const machine = createAccentPaletteState({ onOpen: vi.fn(), onClose: vi.fn(), onApply });
    machine.openPalette();
    machine.selectSwatch();
    expect(machine.state()).toBe('applying');
    expect(onApply).toHaveBeenCalledOnce();
  });

  /** @Verifies<Feature_ACCENT_PALETTE>('outsideClickCloses') */
  it('closes the palette on outside click', () => {
    const onClose = vi.fn();
    const machine = createAccentPaletteState({ onOpen: vi.fn(), onClose });
    machine.openPalette();
    machine.closePalette();
    expect(machine.state()).toBe('closed');
    expect(onClose).toHaveBeenCalledOnce();
  });
});

Three tests. Three @Verifies annotations. Each annotation references the Feature class and one of its abstract methods. The @Verifies tag is a JSDoc annotation that the compliance scanner reads via AST walking — it resolves the import, finds the Feature class, and confirms that the AC name is a valid method.
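The real scanner resolves annotations through the AST; as a rough illustration of what it extracts, a regex pass over the JSDoc text could look like this — the regex and result shape are simplifications, not the project's implementation:

```typescript
// Match @Verifies<FeatureClass>('acName') inside JSDoc comments.
const VERIFIES = /@Verifies<(\w+)>\('(\w+)'\)/g;

function extractVerifies(source: string): Array<{ feature: string; ac: string }> {
  return [...source.matchAll(VERIFIES)].map((m) => ({ feature: m[1], ac: m[2] }));
}

const sample = `/** @Verifies<Feature_ACCENT_PALETTE>('rightClickOpensPalette') */`;
console.log(extractVerifies(sample));
// [{ feature: 'Feature_ACCENT_PALETTE', ac: 'rightClickOpensPalette' }]
```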

Step 4: The Scanner Closes the Loop

The compliance scanner (integrated into audit-fsm-feature-links.ts) builds the full traceability matrix:

Feature_ACCENT_PALETTE (ACCENT)
  AC: rightClickOpensPalette
    Machine:  AccentPaletteStateFsm  (src/lib/accent-palette-state.ts)   ✓
    Test:     accent-palette-state.test.ts:8                              ✓
  AC: swatchAppliesAccent
    Machine:  (no direct link — AC covered by same machine)               ~
    Test:     accent-palette-state.test.ts:15                             ✓
  AC: outsideClickCloses
    Machine:  (no direct link — AC covered by same machine)               ~
    Test:     accent-palette-state.test.ts:24                             ✓

The ~ symbol means: the machine's feature.ac field points to rightClickOpensPalette, not to this specific AC. But the test covers it via @Verifies, and the machine's source file is the same. The chain is not broken — it is just that a single machine satisfies multiple ACs.

Diagram
The closed traceability chain — from Feature abstract class to acceptance criterion to @FiniteStateMachine decorator to source code to test to @Verifies annotation. Every link is machine-verifiable.

The chain is circular — and that is the point. Every link can be verified from the adjacent link. The Feature class defines ACs. The decorator references an AC. The test verifies an AC. The scanner checks that all three agree. If any link is missing, the chain breaks and the build reports it.

audit-fsm-feature-links.ts — Orphan Detection

The audit script is the enforcement mechanism. It runs during the build and fails with a non-zero exit code if any machine violates the traceability requirements.

What It Reads

The script reads two inputs:

  1. data/state-machines.json — the extracted graph from Phase 2 (Part IX). Contains every machine with its feature field (or absence thereof).

  2. requirements/features/index.ts — the barrel export of all Feature abstract classes. The script imports this module and uses reflection to enumerate classes and their abstract methods.

The Audit Algorithm

// Simplified — the actual script has error handling and formatting

interface AuditResult {
  orphans:     string[];   // machines with no feature link
  mismatches:  MismatchEntry[];  // machines with invalid id or ac
  unlinked:    string[];   // ACs with no machine pointing to them
  coverage:    number;     // percentage of ACs with at least one machine
}

function auditFeatureLinks(
  graph: StateMachineGraph,
  features: Map<string, FeatureMetadata>,
  exemptions: Set<string>,
): AuditResult {
  const orphans: string[] = [];
  const mismatches: MismatchEntry[] = [];
  const acCoverage = new Map<string, boolean>();

  // Initialize AC coverage — all false
  for (const [id, meta] of features) {
    for (const ac of meta.acs) {
      acCoverage.set(`${id}:${ac}`, false);
    }
  }

  for (const machine of graph.machines) {
    // Skip exempted infrastructure machines
    if (exemptions.has(machine.id)) continue;

    if (!machine.feature) {
      orphans.push(machine.id);
      continue;
    }

    const { id, ac } = machine.feature;

    // Check if feature exists
    if (!features.has(id)) {
      mismatches.push({
        machine: machine.id,
        reason: `Feature '${id}' not found in requirements directory`,
      });
      continue;
    }

    // Check if AC exists
    const meta = features.get(id)!;
    if (!meta.acs.includes(ac)) {
      mismatches.push({
        machine: machine.id,
        reason: `AC '${ac}' not found on Feature '${id}' — available: ${meta.acs.join(', ')}`,
      });
      continue;
    }

    // Mark AC as covered
    acCoverage.set(`${id}:${ac}`, true);
  }

  const unlinked = [...acCoverage.entries()]
    .filter(([, covered]) => !covered)
    .map(([key]) => key);

  const total = acCoverage.size;
  const covered = total - unlinked.length;
  const coverage = total === 0 ? 100 : Math.round((covered / total) * 100);

  return { orphans, mismatches, unlinked, coverage };
}

The algorithm is straightforward: iterate machines, check feature links, accumulate violations. The interesting part is not the algorithm but the three categories of violation.

Violation Categories

Orphans — machines without any feature link. These are the most concerning: code that exists but cannot explain why. Exempted infrastructure machines are skipped; every other orphan is reported as an error.

ERROR  Orphan machine: sidebar-expand-state
       No feature link declared. Add 'feature: { id: ..., ac: ... }' to the decorator.

Mismatches — machines with a feature link that points to a nonexistent Feature class or a nonexistent AC method. These are bugs — someone renamed a feature or removed an AC but forgot to update the machine.

ERROR  Mismatch: copy-feedback-state
       AC 'animateSuccess' not found on Feature 'COPY-FEEDBACK'
       Available ACs: triggerFeedback, autoReset

Unlinked ACs — acceptance criteria in the Feature class that no machine points to. These are requirements gaps — either the AC needs a machine, or the AC should be removed because it is no longer relevant.

WARN   Unlinked AC: PAGE-LOAD:errorRecovery
       No machine declares feature: { id: 'PAGE-LOAD', ac: 'errorRecovery' }

The distinction between errors and warnings is important. Orphans and mismatches are errors — they block the build. Unlinked ACs are warnings — they indicate a gap but do not block deployment, because a Feature class may define aspirational ACs that are not yet implemented.

Exit Codes

function exitCode(result: AuditResult): number {
  if (result.orphans.length > 0) return 1;
  if (result.mismatches.length > 0) return 1;
  // Unlinked ACs are warnings, not errors
  return 0;
}

A non-zero exit code propagates through the build chain and stops the deploy. The developer must either add the missing feature link, fix the mismatch, or add the machine to the exemption set (with a comment explaining why).
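The exemption set itself can be as small as a hard-coded Set with the justification recorded inline — a sketch under the assumption that it lives in the audit script; the three IDs are the machines listed earlier:

```typescript
// Machines exempt from the feature-link requirement, each with its
// reason documented next to the ID.
const EXEMPT_MACHINES: Set<string> = new Set([
  'EventBusFsm',         // infrastructure — the bus itself is not a user feature
  'HotReloadActionsFsm', // developer tooling — only active in dev mode
  'DevWatcherFsm',       // developer tooling — file watcher for hot-reload
]);

console.log(EXEMPT_MACHINES.has('EventBusFsm')); // true
```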

Sample Output

A successful audit prints a coverage summary:

audit-fsm-feature-links

  Machines:    43
  With link:   40
  Exempted:     3 (EventBusFsm, HotReloadActionsFsm, DevWatcherFsm)
  Orphans:      0

  Features:    23
  ACs total:   67
  ACs covered: 58 (87%)
  Unlinked:     9 (warnings)

  Mismatches:   0

  Result: PASS

The 87% AC coverage means 9 acceptance criteria exist in Feature classes but no machine has declared them as its ac value. These are either planned features or ACs that are covered by a machine that points to a different AC on the same feature (like the accent palette example above, where one machine covers three ACs). The scanner does not demand one-to-one mapping — it demands that every machine that does declare a link points to something real.

The Full Quality Gate Chain

The feature audit is one gate in a sequence of seven. Every gate must pass before the static site is deployed to Vercel. The gates run in order because each depends on the output of the previous one.

Diagram
The quality gate chain — seven stages, each blocking the next. A failure at any stage stops the pipeline. No cloud CI/CD — all gates run locally.

Gate 1: TypeScript Compiler

npx tsc --noEmit

The compiler checks all .ts files for type errors. This catches:

  • Event name typos (via EventDef phantom types)
  • Payload shape mismatches (via DetailOf conditional type)
  • Unauthorized emissions (via EventBus<TEmits, TListens>)
  • Invalid @Verifies annotations (via keyof Feature_*)
  • General type errors in machine factories, coordinators, and adapters

Exit condition: zero type errors. Any error stops the chain.

Gate 2: vitest with Coverage Gates

npx vitest run --coverage

Every file in src/lib/ has a 98% coverage threshold. The vitest configuration enforces this per-file:

// vitest.config.ts (simplified)
export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      thresholds: {
        perFile: true,
        lines: 98,
        functions: 98,
        branches: 95,
        statements: 98,
      },
      include: ['src/lib/**/*.ts'],
    },
  },
});

The 98% threshold is not aspirational — it is enforced. If a new machine file drops below 98% line coverage, vitest exits with a non-zero code. The 95% branch threshold is slightly lower because some machines have wildcard transitions (from: '*') that generate branches the test does not need to exercise individually.

What the tests cover:

  • Every state transition in every machine
  • Every guard condition (true and false branches)
  • Every callback invocation
  • Edge cases: calling methods in wrong states, stale generation detection, timer injection

What the tests do not cover:

  • DOM interactions (those live in Playwright E2E tests)
  • The adapter layer (thin wiring code, tested by integration)
  • The CLI shells (no logic to test)

Exit condition: all tests pass, all coverage thresholds met.

Gate 3: Event Topology Scanner

npx tsx scripts/scan-event-topology.ts --strict

The topology scanner (Part VII) walks every source file, extracts dispatchEvent, addEventListener, bus.emit, and bus.on calls, and cross-references them against the decorator metadata. It enforces four invariants:

  1. Every emits declaration has a corresponding AST dispatch site
  2. Every listens declaration has a corresponding AST listener site
  3. No undeclared dispatches exist
  4. No phantom events (declared but never dispatched or listened to)

Exit condition: zero missing, zero undeclared. Delegated phantoms are allowed when the delegation chain is resolvable.
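As a rough sketch of the first invariant — a declared emit with no matching AST dispatch site — assuming simplified shapes for the extracted data (the real scanner's types are richer):

```typescript
interface MachineEvents {
  id: string;
  emits: string[]; // events declared in the decorator
}

// Report every declared emit with no dispatch site found in the AST scan.
function findMissingDispatchSites(
  machines: MachineEvents[],
  dispatchedEventNames: Set<string>,
): string[] {
  return machines.flatMap((m) =>
    m.emits
      .filter((event) => !dispatchedEventNames.has(event))
      .map((event) => `${m.id} declares '${event}' but no dispatch site was found`),
  );
}

const violations = findMissingDispatchSites(
  [{ id: 'PageLoadStateFsm', emits: ['app-ready', 'toc-headings-rendered'] }],
  new Set(['app-ready']),
);
console.log(violations.length); // 1
```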

Gate 4: Feature Audit

npx tsx scripts/audit-fsm-feature-links.ts

The audit script described above. Checks that every non-exempted machine has a valid feature link, that the linked Feature class and AC exist, and reports coverage statistics.

Exit condition: zero orphans, zero mismatches.

Gate 5: Mermaid Validation

npx tsx scripts/validate-mermaids.ts

Walks every .md file, extracts every mermaid code block, and validates the syntax. Catches:

  • Unterminated strings
  • Invalid node IDs
  • Missing arrow targets
  • Subgraph nesting errors
  • Malformed style directives

Exit condition: zero syntax errors across all mermaid blocks.

Gate 6: Markdown Link Validation

npx tsx scripts/validate-md-links.ts

Walks every .md file, extracts every markdown link and image reference, and resolves them against the filesystem. Catches:

  • Broken internal links (file does not exist)
  • Broken image references (image file does not exist)
  • Anchor links to nonexistent headings

Exit condition: zero broken links.

Gate 7: Static Build

npm run build:static

The SSG pipeline: markdown to HTML, frontmatter extraction, TOC generation, mermaid rendering, asset bundling. This is the final gate. If all previous gates passed, the build should succeed — but it catches integration issues that no individual gate covers, such as missing frontmatter fields, circular parent references, or duplicate slugs.

Exit condition: build completes without errors, all output files generated.
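One of those integration issues — duplicate slugs — is cheap to check. A minimal sketch, with the function name and input shape assumed:

```typescript
// Collect every slug that appears more than once across generated pages.
function findDuplicateSlugs(slugs: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const slug of slugs) {
    if (seen.has(slug)) duplicates.add(slug);
    else seen.add(slug);
  }
  return [...duplicates];
}

console.log(findDuplicateSlugs(['intro', 'fsm-part-1', 'intro'])); // ['intro']
```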

The Chain as a Script

The full chain runs as a single npm script:

npm run test:all

Which expands to:

npx tsc --noEmit \
  && npx vitest run --coverage \
  && npx tsx scripts/scan-event-topology.ts --strict \
  && npx tsx scripts/audit-fsm-feature-links.ts \
  && npx tsx scripts/validate-mermaids.ts \
  && npx tsx scripts/validate-md-links.ts \
  && npm run build:static

Each command is chained with &&. A failure at any step stops the chain. The entire sequence takes 8-12 seconds on a warm cache. The developer runs it before pushing to main — there is no cloud CI/CD. The push triggers Vercel, which serves the static files. The quality gates are local, fast, and blocking.

Why Sequential, Not Parallel

The gates run sequentially for two reasons:

  1. Dependencies. The topology scanner depends on type-checked code (Gate 1). The feature audit depends on state-machines.json, which is generated during the build pipeline. The mermaid validator depends on source .md files that the link validator also reads.

  2. Fast failure. If the TypeScript compiler finds 47 type errors, there is no point running vitest — the tests will fail too, and the error output will be noise on top of the type errors. Sequential execution means the developer sees the earliest, most fundamental error first.

The total cost of sequential execution is small. Gate 1 (tsc) takes 2-3 seconds. Gate 2 (vitest) takes 3-4 seconds. Gates 3-6 take under 1 second each. Gate 7 (build) takes 2-3 seconds. The entire chain is under 12 seconds. Running gates in parallel would save at most 3-4 seconds — not enough to justify the complexity of parallel orchestration and interleaved error output.

Testing Typed Events

The rest of this part shifts from traceability to testing. The typed event system (Part II) and the pure state machines (Part IV) are designed for testability. No DOM. No browser. No async timers. Pure functions in, deterministic results out.

Testing EventBus: emit, on, unsubscribe

The EventBus wraps an EventTarget. In production, that target is window. In tests, it is a plain EventTarget instance — no DOM, no global state:

import { createEventBus, defineEvent } from '../src/lib/event-bus';
import type { EventBus } from '../src/lib/event-bus';

const TestEvent = defineEvent<'test-event', void>('test-event');
const DetailEvent = defineEvent<'detail-event', { count: number }>('detail-event');

describe('EventBus', () => {
  let target: EventTarget;
  let bus: EventBus<typeof TestEvent | typeof DetailEvent, typeof TestEvent | typeof DetailEvent>;

  beforeEach(() => {
    target = new EventTarget();
    bus = createEventBus(target);
  });

  it('delivers void events to listeners', () => {
    const handler = vi.fn();
    bus.on(TestEvent, handler);
    bus.emit(TestEvent);
    expect(handler).toHaveBeenCalledOnce();
  });

  it('delivers detail events with typed payload', () => {
    const handler = vi.fn();
    bus.on(DetailEvent, handler);
    bus.emit(DetailEvent, { count: 42 });
    expect(handler).toHaveBeenCalledWith(
      expect.objectContaining({ detail: { count: 42 } }),
    );
  });

  it('does not deliver after unsubscribe', () => {
    const handler = vi.fn();
    const sub = bus.on(TestEvent, handler);
    sub.unsubscribe();
    bus.emit(TestEvent);
    expect(handler).not.toHaveBeenCalled();
  });

  it('delivers to multiple listeners in registration order', () => {
    const order: number[] = [];
    bus.on(TestEvent, () => order.push(1));
    bus.on(TestEvent, () => order.push(2));
    bus.on(TestEvent, () => order.push(3));
    bus.emit(TestEvent);
    expect(order).toEqual([1, 2, 3]);
  });

  it('allows void events with zero arguments', () => {
    // This is a compile-time check more than a runtime one.
    // If TestEvent were a detail event, calling emit(TestEvent)
    // without a second argument would be a type error.
    bus.emit(TestEvent);
    // No assertion needed — the test passes if it compiles and runs.
  });
});

Key testing patterns:

  1. Fake EventTarget. The EventTarget constructor is available in Node.js 15+ and in vitest's environment. No JSDOM needed. No window global.

  2. vi.fn() spies. Vitest's spy functions verify that handlers are called, with what arguments, and how many times.

  3. Registration order. The bus delegates to addEventListener, which guarantees registration-order delivery. The test verifies this invariant.

  4. Compile-time checks as tests. The last test exists primarily to document that bus.emit(TestEvent) — with no second argument — compiles for void events. If someone changes TestEvent to carry a detail, this test becomes a compile error, not a runtime failure.

Testing Void vs Detail Events

The emit method has a variadic signature:

emit<E extends TEmits>(event: E, ...args: DetailOf<E> extends void ? [] : [DetailOf<E>]): void;

For void events, the spread produces zero additional arguments. For detail events, it produces exactly one. This is the compile-time guarantee. But the runtime behavior also matters:

it('void emit dispatches Event, not CustomEvent', () => {
  let received: Event | undefined;
  target.addEventListener('test-event', (e) => { received = e; });
  bus.emit(TestEvent);
  expect(received).toBeInstanceOf(Event);
  expect(received).not.toHaveProperty('detail');
});

it('detail emit dispatches CustomEvent with detail', () => {
  let received: Event | undefined;
  target.addEventListener('detail-event', (e) => { received = e; });
  bus.emit(DetailEvent, { count: 7 });
  expect(received).toBeInstanceOf(CustomEvent);
  expect((received as CustomEvent).detail).toEqual({ count: 7 });
});

Void events dispatch a plain Event. Detail events dispatch a CustomEvent with the typed detail property. The distinction matters because CustomEvent without a detail has detail: null, which is different from undefined — code that checks if (event.detail) would behave differently. The bus implementation handles this correctly, and the tests verify it.

Testing Pure FSMs

Every state machine follows the factory + callback pattern (Part IV). The factory takes a callbacks object and returns a machine interface with methods and a state() getter. Testing is trivial because the machine is a pure function of its inputs.

The Pattern

// The factory
export function createAccentPaletteState(callbacks: {
  onOpen:  () => void;
  onClose: () => void;
  onApply?: () => void;
}): AccentPaletteStateMachine {
  let state: AccentPaletteState = 'closed';

  return {
    state: () => state,
    openPalette() {
      if (state !== 'closed') return;
      state = 'open';
      callbacks.onOpen();
    },
    selectSwatch() {
      if (state !== 'open') return;
      state = 'applying';
      callbacks.onApply?.();
    },
    closePalette() {
      if (state !== 'open' && state !== 'applying') return;
      state = 'closed';
      callbacks.onClose();
    },
    applyDone() {
      if (state !== 'applying') return;
      state = 'closed';
      callbacks.onClose();
    },
  };
}

The Tests

describe('AccentPaletteState', () => {
  it('starts in closed state', () => {
    const m = createAccentPaletteState({ onOpen: vi.fn(), onClose: vi.fn() });
    expect(m.state()).toBe('closed');
  });

  it('transitions closed → open on openPalette', () => {
    const onOpen = vi.fn();
    const m = createAccentPaletteState({ onOpen, onClose: vi.fn() });
    m.openPalette();
    expect(m.state()).toBe('open');
    expect(onOpen).toHaveBeenCalledOnce();
  });

  it('ignores openPalette when already open', () => {
    const onOpen = vi.fn();
    const m = createAccentPaletteState({ onOpen, onClose: vi.fn() });
    m.openPalette();
    m.openPalette(); // second call
    expect(m.state()).toBe('open');
    expect(onOpen).toHaveBeenCalledOnce(); // not twice
  });

  it('transitions open → applying → closed', () => {
    const onClose = vi.fn();
    const onApply = vi.fn();
    const m = createAccentPaletteState({ onOpen: vi.fn(), onClose, onApply });
    m.openPalette();
    m.selectSwatch();
    expect(m.state()).toBe('applying');
    expect(onApply).toHaveBeenCalledOnce();
    m.applyDone();
    expect(m.state()).toBe('closed');
    expect(onClose).toHaveBeenCalledOnce();
  });

  it('ignores selectSwatch when closed', () => {
    const m = createAccentPaletteState({ onOpen: vi.fn(), onClose: vi.fn() });
    m.selectSwatch();
    expect(m.state()).toBe('closed'); // no transition
  });
});

What Makes This Testable

Three properties make every FSM trivially testable:

1. No DOM. The machine does not touch the DOM. It does not read document.querySelector. It does not write element.classList. It does not listen to mouse events. All DOM interaction is in the adapter (Part V), which is a separate module. The machine is a pure state transition function.

2. No async. The machine does not use setTimeout, requestAnimationFrame, or Promise. Timer-dependent machines accept a timer injection:

export function createCopyFeedbackState(callbacks: {
  onShow:  () => void;
  onHide:  () => void;
  setTimeout: (fn: () => void, ms: number) => number;
  clearTimeout: (id: number) => void;
}): CopyFeedbackStateMachine { ... }

In tests, the injected timer is a synchronous spy. No vi.advanceTimersByTime() needed — the test calls the captured callback directly.
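A minimal sketch of that pattern with a stand-in machine — not the project's createCopyFeedbackState, whose body is elided above; the shape is assumed from the snippet:

```typescript
type InjectedTimer = {
  setTimeout: (fn: () => void, ms: number) => number;
  clearTimeout: (id: number) => void;
};

// Stand-in machine: shows feedback, auto-hides via the injected timer.
function createAutoHideState(timer: InjectedTimer, onHide: () => void) {
  let state: 'idle' | 'shown' = 'idle';
  let timerId = 0;
  return {
    state: () => state,
    show() {
      state = 'shown';
      timerId = timer.setTimeout(() => { state = 'idle'; onHide(); }, 1500);
    },
    cancel() { timer.clearTimeout(timerId); },
  };
}

// Test side: capture the scheduled callback and fire it synchronously.
let captured: (() => void) | undefined;
const machine = createAutoHideState(
  { setTimeout: (fn) => { captured = fn; return 1; }, clearTimeout: () => {} },
  () => {},
);
machine.show();
captured?.();                 // fire the "timeout" immediately, no fake-timer APIs
console.log(machine.state()); // 'idle'
```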

3. Callbacks, not side effects. The machine signals state changes through callbacks, not through side effects. When openPalette fires, it calls callbacks.onOpen() — it does not add a CSS class. The test verifies that the callback was called with the right arguments. The adapter (untested in unit tests) translates callbacks into DOM effects.

This design means every unit test follows the same pattern:

  1. Create machine with spy callbacks
  2. Call method
  3. Assert state() changed
  4. Assert callback was called (or not)

No mocking. No async. No setup/teardown beyond vi.fn(). Each test runs in under 1ms.

Coverage Gates per File

The vitest configuration enforces 98% coverage per file in src/lib/. This means every machine file must have 98% line coverage. The enforcement is not a suggestion — it is a build-breaking gate:

ERROR  Coverage for 'src/lib/accent-palette-state.ts':
       Lines:      94.2% (threshold: 98%)
       Functions:  100%
       Branches:   90.0% (threshold: 95%)
       Statements: 94.2% (threshold: 98%)

When a developer adds a new state or transition without a corresponding test, the coverage drops below 98%, and the build fails. The fix is always the same: write the missing test. There is no escape hatch, no /* istanbul ignore */, no threshold override per file.
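A per-file, path-scoped threshold configuration of this shape is supported by recent Vitest versions. The values below mirror the thresholds quoted in the error output above, but the file itself is a sketch, not the project's actual config:

```typescript
// vitest.config.ts — sketch of per-file, glob-scoped coverage thresholds
// (assumes Vitest 1.x+ with the v8 coverage provider).
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      thresholds: {
        perFile: true,            // fail on any single file below threshold
        'src/lib/**/*.ts': {      // glob keys scope thresholds to a path
          lines: 98,
          statements: 98,
          branches: 95,
          functions: 100,
        },
      },
    },
  },
});
```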

Testing Coordinators

Coordinators (Part V) orchestrate multiple machines. The TourCoordinator manages the tour machine, the tooltip machine, and the highlight machine. The ThemeCoordinator manages the theme machine, the accent palette machine, and the CSS custom property updates.

Testing a coordinator follows the same pattern as testing a single machine — but instead of creating one machine, you create multiple machines and pass them to the coordinator factory.

The Pattern

// Simplified TourCoordinator test

describe('TourCoordinator', () => {
  it('starts the tour by initializing all sub-machines', () => {
    const tourMachine = createTourState({
      onStepChange: vi.fn(),
      onComplete: vi.fn(),
    });
    const tooltipMachine = createTooltipState({
      onShow: vi.fn(),
      onHide: vi.fn(),
    });
    const highlightMachine = createHighlightState({
      onHighlight: vi.fn(),
      onClear: vi.fn(),
    });

    const coordinator = createTourCoordinator({
      tour: tourMachine,
      tooltip: tooltipMachine,
      highlight: highlightMachine,
    });

    coordinator.startTour();

    expect(tourMachine.state()).toBe('active');
    expect(tooltipMachine.state()).toBe('visible');
    expect(highlightMachine.state()).toBe('highlighting');
  });

  it('advances all sub-machines on nextStep', () => {
    const onStepChange = vi.fn();
    const tourMachine = createTourState({ onStepChange, onComplete: vi.fn() });
    const tooltipMachine = createTooltipState({ onShow: vi.fn(), onHide: vi.fn() });
    const highlightMachine = createHighlightState({ onHighlight: vi.fn(), onClear: vi.fn() });

    const coordinator = createTourCoordinator({
      tour: tourMachine,
      tooltip: tooltipMachine,
      highlight: highlightMachine,
    });

    coordinator.startTour();
    coordinator.nextStep();

    expect(onStepChange).toHaveBeenCalledWith(1); // step index
    // tooltip and highlight updated to new target
  });

  it('completes the tour and resets all machines', () => {
    const onComplete = vi.fn();
    const onClear = vi.fn();
    const tourMachine = createTourState({ onStepChange: vi.fn(), onComplete });
    const tooltipMachine = createTooltipState({ onShow: vi.fn(), onHide: vi.fn() });
    const highlightMachine = createHighlightState({ onHighlight: vi.fn(), onClear });

    const coordinator = createTourCoordinator({
      tour: tourMachine,
      tooltip: tooltipMachine,
      highlight: highlightMachine,
    });

    coordinator.startTour();
    coordinator.completeTour();

    expect(tourMachine.state()).toBe('completed');
    expect(tooltipMachine.state()).toBe('hidden');
    expect(highlightMachine.state()).toBe('cleared');
    expect(onComplete).toHaveBeenCalledOnce();
    expect(onClear).toHaveBeenCalledOnce();
  });
});

Why This Works Without Mocking

The coordinator does not create its sub-machines — it receives them. The test creates real machines (not mocks), passes them to the coordinator, and asserts their states after coordinator methods are called. No mocking framework. No stub generation. No jest.mock('../src/lib/tour-state').

This is the dependency injection principle applied to testing: the coordinator depends on interfaces (machine shapes), not on concrete modules. The test provides real implementations of those interfaces — the same factory functions used in production. The only fakes are the callbacks (vi.fn()), which are leaf-level spy functions with no behavior to mock.

The result: coordinator tests verify the actual coordination logic. They do not verify that mocks were called in the right order (a common anti-pattern with mock-heavy testing). If the tour machine changes its state names, the coordinator test fails with expected 'active' but received 'running' — a clear signal that the coordinator needs updating.
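The structural-typing idea reduces to a few lines. The interface, factory, and coordinator below are illustrative stand-ins, not the project's real API:

```typescript
// Sketch: the coordinator depends on a machine *shape*, not a module.
interface TooltipMachine {
  state(): 'hidden' | 'visible';
  show(): void;
  hide(): void;
}

// The same real factory is used in production and in tests — no mocks.
function createTooltipState(cb: { onShow: () => void; onHide: () => void }): TooltipMachine {
  let state: 'hidden' | 'visible' = 'hidden';
  return {
    state: () => state,
    show() { if (state === 'hidden') { state = 'visible'; cb.onShow(); } },
    hide() { if (state === 'visible') { state = 'hidden'; cb.onHide(); } },
  };
}

// The coordinator receives its machines; it never imports their modules.
function createMiniCoordinator(deps: { tooltip: TooltipMachine }) {
  return {
    start: () => deps.tooltip.show(),
    stop: () => deps.tooltip.hide(),
  };
}

const tooltip = createTooltipState({ onShow: () => {}, onHide: () => {} });
const coord = createMiniCoordinator({ tooltip });
coord.start();
// tooltip.state() === 'visible' — the coordinator drove a real machine
```

Because TooltipMachine is satisfied structurally, any object with that shape works — which is exactly why tests can hand the coordinator production factories instead of mocks.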

Property-Based Testing for State Machines

Unit tests verify specific scenarios: "from idle, call startLoad, expect loading." But what about scenarios the developer did not think of? What about random sequences of events — selectSwatch, closePalette, openPalette, applyDone, selectSwatch, closePalette — applied in an order that no human would write as a test case?

Property-based testing generates random inputs and asserts that invariants hold after every input. For state machines, the inputs are random sequences of method calls, and the invariants are structural properties that must be true regardless of the sequence.

The Approach

import fc from 'fast-check';

describe('AccentPaletteState — property-based', () => {
  const methods = ['openPalette', 'selectSwatch', 'closePalette', 'applyDone'] as const;
  type Method = typeof methods[number];

  it('state is always one of the declared states', () => {
    fc.assert(
      fc.property(
        fc.array(fc.constantFrom(...methods), { minLength: 0, maxLength: 50 }),
        (sequence: Method[]) => {
          const m = createAccentPaletteState({ onOpen: () => {}, onClose: () => {} });
          for (const method of sequence) {
            m[method]();
          }
          expect(['closed', 'open', 'applying']).toContain(m.state());
        },
      ),
    );
  });

  it('closePalette always leads to closed state', () => {
    fc.assert(
      fc.property(
        fc.array(fc.constantFrom(...methods), { minLength: 0, maxLength: 50 }),
        (sequence: Method[]) => {
          const m = createAccentPaletteState({ onOpen: () => {}, onClose: () => {} });
          for (const method of sequence) {
            m[method]();
          }
          m.closePalette();
          // After closePalette, state must be closed (unless already closed)
          // The machine only transitions from 'open' or 'applying' to 'closed'
          // If we were in 'closed', closePalette is a no-op, still 'closed'
          expect(m.state()).toBe('closed');
        },
      ),
    );
  });

  it('no method ever throws', () => {
    fc.assert(
      fc.property(
        fc.array(fc.constantFrom(...methods), { minLength: 0, maxLength: 100 }),
        (sequence: Method[]) => {
          const m = createAccentPaletteState({ onOpen: () => {}, onClose: () => {} });
          for (const method of sequence) {
            m[method](); // should never throw
          }
        },
      ),
    );
  });
});

What Invariants to Assert

For state machines, the most useful invariants are:

1. State is always valid. After any sequence of events, state() returns one of the declared states. This catches off-by-one errors in state assignments, race conditions in callback-triggered transitions, and forgotten return statements that allow fall-through.

2. Terminal methods are idempotent. Calling closePalette() twice from open should leave the machine in closed and call onClose once (not twice). Property-based testing generates sequences that repeat the same method, revealing double-fire bugs.

3. No method throws. A pure state machine should never throw. Invalid transitions are no-ops, not exceptions. Property-based testing generates adversarial sequences — calling selectSwatch from closed, calling applyDone from open — and verifies that none of them throw.

4. Structural invariants. Some machines have cross-state invariants. For example, the TerminalDotsStateFsm has a compound state: when focusMode is true, sidebarMasked must also be true. Property-based testing verifies this invariant after every event in every generated sequence:

it('focusMode implies sidebarMasked', () => {
  fc.assert(
    fc.property(
      fc.array(fc.constantFrom(...terminalDotsMethods), { minLength: 0, maxLength: 50 }),
      (sequence) => {
        const m = createTerminalDotsState({ /* callbacks */ });
        for (const method of sequence) {
          m[method]();
          if (m.focusMode()) {
            expect(m.sidebarMasked()).toBe(true);
          }
        }
      },
    ),
  );
});

This test found a real bug: a sequence of maximize, toggleSidebar, minimize left the machine in a state where focusMode was false but sidebarMasked was still true — because minimize cleared focus mode but did not clear the sidebar mask. The fix was a one-line change in the minimize method. A human-written test would not have caught it because no developer would think to write that specific three-event sequence.

fast-check Integration

The property-based tests use fast-check, a JavaScript property-based testing library. It integrates with vitest through fc.assert, which runs the property 100 times by default with random inputs. On failure, it shrinks the input to the smallest failing case — a three-event sequence instead of a fifty-event sequence — making the failure easy to debug.

The tests run as part of the standard npx vitest run --coverage command. They are not a separate test suite. They coexist with the deterministic unit tests in the same test file. The coverage they generate counts toward the 98% threshold — in fact, property-based tests often push coverage above 99% because they exercise code paths that deterministic tests miss.

The Test Pyramid

[Diagram] The test pyramid for the typed event and FSM system. Unit tests cover 43 pure machines and the EventBus. Integration tests cover coordinators and adapters. E2E tests cover the full browser experience.

The pyramid has three layers:

Unit Tests (Base)

  • 43 machine tests — one test file per machine, 98%+ coverage per file
  • EventBus tests — emit, on, unsubscribe, void vs detail, multi-listener
  • Property-based tests — random event sequences, invariant assertions
  • Scanner tests — topology verification with in-memory source files
  • Renderer tests — SVG generation with stub layout engine
  • Cache tests — pure decideCacheAction with all boolean combinations

Count: ~280 test cases. Duration: 3-4 seconds.
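The cache test in that list can be sketched like this. The signature and decision rules of decideCacheAction here are assumptions, chosen only to show the exhaustive-enumeration shape:

```typescript
// Stand-in for the pure cache decision function (signature is an assumption).
type CacheAction = 'use' | 'refresh' | 'bypass';

function decideCacheAction(inputs: {
  hasEntry: boolean;
  isStale: boolean;
  forceRefresh: boolean;
}): CacheAction {
  if (inputs.forceRefresh) return 'bypass';
  if (!inputs.hasEntry) return 'refresh';
  return inputs.isStale ? 'refresh' : 'use';
}

// Three booleans means only 2^3 = 8 cases — enumerate them all.
const bools = [false, true];
for (const hasEntry of bools)
  for (const isStale of bools)
    for (const forceRefresh of bools) {
      const action = decideCacheAction({ hasEntry, isStale, forceRefresh });
      if (!['use', 'refresh', 'bypass'].includes(action)) {
        throw new Error(`invalid action for ${hasEntry}/${isStale}/${forceRefresh}`);
      }
    }
```

Because the function is pure and the input space is finite, the test is total: there is no combination left for a bug to hide in.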

Integration Tests (Middle)

  • Coordinator tests — real sub-machines, no mocks, verify orchestration
  • Adapter smoke tests — verify that adapter functions call the right DOM APIs
  • Build pipeline tests — run the full extract + render pipeline against fixture files

Count: ~40 test cases. Duration: 1-2 seconds.

E2E Tests (Top)

  • Playwright tests — run against the static build (TEST_TARGET=static)
  • Visual regression — screenshot comparison for the explorer, theme transitions
  • Accessibility — axe-core integration, four-theme pa11y sweep
  • Performance — lighthouse metrics, largest contentful paint thresholds

Count: ~30 test cases. Duration: 15-25 seconds.

The ratio is intentional. The base is wide (280 unit tests) because pure machines are cheap to test and cover the most logic. The middle is narrow (40 integration tests) because coordinators have less logic to verify — they mostly delegate. The top is small (30 E2E tests) because browser tests are slow, flaky, and expensive. The pyramid ensures that most defects are caught at the base, where feedback is fastest.

The Economics of Quality Gates

Seven gates. Twelve seconds total. That is the cost of knowing that every machine has a purpose, every event contract is verified, every test passes, every link resolves, and every diagram renders.

The alternative is manual verification. A developer opens a PR, a reviewer reads the code, and they both hope that the event contracts are correct, the feature links are valid, and the coverage is sufficient. That process takes 20-30 minutes per PR and catches maybe 60% of the issues that the automated gates catch in 12 seconds.

The gates are not a burden. They are a force multiplier. They free the developer to focus on the interesting problems — the state machine design, the event topology, the coordination logic — because the boring problems (typos, stale links, orphan machines, phantom events) are caught automatically.

What the Gates Cannot Catch

The gates verify structure, not semantics. They can verify that a machine declares feature: { id: 'PAGE-LOAD', ac: 'fullLifecycle' } and that Feature_PAGE_LOAD.fullLifecycle exists. They cannot verify that the machine actually implements the full page load lifecycle. That requires a human to read the code and understand the domain.

The gates verify coverage, not correctness. A test that asserts expect(true).toBe(true) satisfies the coverage threshold but verifies nothing. The 98% threshold ensures that code is exercised, not that the assertions are meaningful. Meaningful assertions come from developers who understand what the machine should do — not from tooling.

The gates verify syntax, not intent. The mermaid validator catches malformed diagrams but not misleading diagrams. The link validator catches broken links but not links that point to the wrong target. The topology scanner catches missing declarations but not declarations that are semantically wrong.

This is the division of labor. The tooling handles the tedious, mechanical, error-prone verification. The human handles the judgment calls. Neither can replace the other. Together, they cover more ground than either alone.

Putting It All Together

The traceability system has four layers:

  1. Declaration — FsmFeatureLink { id, ac } in the decorator links a machine to a business requirement.

  2. Definition — Feature abstract classes in requirements/features/ define acceptance criteria as abstract methods.

  3. Verification — @Verifies annotations in tests link test cases to acceptance criteria. The compiler catches AC name typos via keyof.

  4. Enforcement — audit-fsm-feature-links.ts checks that every machine has a valid link, that every Feature class and AC exist, and that no orphan machines slip through. The exit code blocks the deploy.
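How keyof catches an AC-name typo at compile time can be sketched in a few lines. The class and the verifies helper below are illustrative stand-ins, not the project's real requirements code:

```typescript
// Sketch: tying acceptance-criterion names to a Feature class via keyof.
// Feature_PAGE_LOAD and verifies() are hypothetical stand-ins.
abstract class Feature_PAGE_LOAD {
  abstract fullLifecycle(): void;
  abstract errorRecovery(): void;
}

function verifies<F>(_cls: abstract new () => F, ac: keyof F & string): string {
  return ac; // real tooling would record the test-to-AC link
}

const link = verifies(Feature_PAGE_LOAD, 'fullLifecycle'); // compiles
// verifies(Feature_PAGE_LOAD, 'fullLifecycl');            // compile-time error
```

The AC name is constrained to keyof the Feature class, so a misspelled criterion never survives tsc — the typo is caught before any audit script runs.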

The quality gate chain adds three more verification layers on top:

  1. Type safety — the TypeScript compiler catches structural errors in events, payloads, and bus configurations.

  2. Behavioral coverage — vitest with 98% per-file thresholds ensures every transition, guard, and callback is exercised.

  3. Topology integrity — the drift scanner ensures that declared events match actual AST sites.

Seven layers. Each catches a different class of defect. Each is automated. Each runs in under 12 seconds total. The result is a codebase where 40 of 43 machines are traceable to a business requirement, every event contract is compiler-verified and scanner-enforced, and every machine has 98%+ test coverage with property-based invariant checking.

The machines are pure. The events are typed. The topology is scanned. The requirements are linked. The tests are comprehensive. And the gates ensure that none of these properties can degrade without the build telling you — immediately, specifically, and with a non-zero exit code.

Continue to Part XII: Advanced Patterns and Retrospective →
