Part VI: Testing Pure State Machines

98% statement coverage. 600ms unit test suite. Zero flaky tests. Zero DOM dependencies. The state machines are the most-tested code in the project because they are the easiest code to test.

The Payoff

The entire point of extracting state machines with callback injection (Parts II-IV) was testability. The entire point of migrating to TypeScript (Part V) was to make the interfaces compile-time contracts. This part shows the payoff: a testing strategy that's fast, thorough, and -- crucially -- not fragile.

The testing pyramid:

Diagram — The testing pyramid for this site: unit tests do the heavy lifting on the machines, property tests cover cross-cutting invariants, and E2E only checks the wiring.

Each layer catches a different class of bugs:

Unit tests catch logic errors in state transitions, guard clauses, and pure helpers.
Property-based tests catch edge cases that example-based tests miss (empty strings, unicode, extreme values).
E2E tests catch integration errors where the wiring layer connects machines to the real DOM.

The Unit Test Setup Pattern

Every state machine test follows the same setup() pattern. Here's the canonical example from spa-nav-state.test.ts:

import { expect, vi } from 'vitest';
import { FeatureTest, Verifies } from '../../requirements/decorators';
import { SpaNavFeature } from '../../requirements/features/spa-nav';
import {
  createSpaNavMachine,
  classifyNavigation,
  type SpaNavCallbacks,
  type SpaNavState,
} from '../../src/lib/spa-nav-state';

function setup() {
  const stateChanges: SpaNavState[] = [];
  const callbacks: SpaNavCallbacks = {
    onStateChange: vi.fn((state: SpaNavState) => {
      stateChanges.push(state);
    }),
    scrollToHash: vi.fn(),
    toggleHeadings: vi.fn(),
    startFetch: vi.fn(),
    closeHeadings: vi.fn(() => false),
    swapContent: vi.fn(),
    updateActiveItem: vi.fn(),
    pushHistory: vi.fn(),
    postSwap: vi.fn(),
  };
  const machine = createSpaNavMachine(callbacks);
  return { machine, callbacks, stateChanges };
}

Three things returned:

machine — the state machine instance, ready to receive events.
callbacks — all callbacks wrapped in vi.fn() (Vitest mock functions). Every call is recorded: arguments, call count, call order.
stateChanges — an array that captures every state transition. Instead of checking machine.getState() (which only shows the current state), you can verify the full transition history.

This pattern gives tests complete visibility into the machine's behavior without touching the DOM.

Testing State Transitions

@FeatureTest(SpaNavFeature)
class SpaNavFullNavigationTests {
  @Verifies<SpaNavFeature>('fullNavLifecycle')
  'walks through idle → fetching → swapping → settled'() {
    const { machine, callbacks, stateChanges } = setup();

    machine.navigate('/skills', '/about', null, '/skills');

    expect(machine.getState()).toBe('fetching');
    expect(callbacks.startFetch).toHaveBeenCalledWith('/skills');

    machine.fetchComplete('<h1>Skills</h1>');

    // closeHeadings returned false, so skip closingHeadings
    expect(stateChanges).toEqual(['fetching', 'swapping', 'settled']);
    expect(callbacks.swapContent).toHaveBeenCalledWith('<h1>Skills</h1>', '/skills');
    expect(callbacks.pushHistory).toHaveBeenCalledWith('/skills', false);
    expect(callbacks.postSwap).toHaveBeenCalledWith('/skills', null);
  }
}

The test drives the machine through a complete navigation cycle by calling methods in sequence: navigate(), then fetchComplete(). After each call, it asserts the state and which callbacks were invoked.

The stateChanges array is key. It captures the full sequence: ['fetching', 'swapping', 'settled']. This verifies not just the final state but every intermediate transition. If the machine accidentally transitions through an unexpected state, the array will show it.
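To make the role of the recorded transitions concrete, here is a minimal sketch of a factory-function machine with callback injection. It is not the site's createSpaNavMachine: the names MiniNavState, MiniNavCallbacks, and createMiniNavMachine are hypothetical, and the model is reduced to four callbacks and a single-argument navigate().

```typescript
// Hypothetical, reduced model of a navigation machine (not the real spa-nav-state.ts).
type MiniNavState = 'idle' | 'fetching' | 'closingHeadings' | 'swapping' | 'settled';

interface MiniNavCallbacks {
  onStateChange(state: MiniNavState): void;
  startFetch(path: string): void;
  closeHeadings(): boolean; // true when a closing animation was started
  swapContent(html: string, path: string): void;
}

function createMiniNavMachine(cb: MiniNavCallbacks) {
  let state: MiniNavState = 'idle';
  let targetPath = '';
  let pending: { html: string; path: string } | null = null;

  // Every transition goes through here, so the callback sees the full history.
  const setState = (next: MiniNavState): void => {
    state = next;
    cb.onStateChange(next);
  };

  const swap = (): void => {
    const current = pending;
    if (current === null) return;
    pending = null;
    setState('swapping');
    cb.swapContent(current.html, current.path);
    setState('settled');
  };

  return {
    getState: (): MiniNavState => state,
    navigate(path: string): void {
      targetPath = path;
      setState('fetching');
      cb.startFetch(path);
    },
    fetchComplete(html: string): void {
      if (state !== 'fetching') return; // guard: ignore outside 'fetching'
      pending = { html, path: targetPath };
      if (cb.closeHeadings()) {
        setState('closingHeadings'); // wait for transitionEnd()
      } else {
        swap();
      }
    },
    transitionEnd(): void {
      if (state !== 'closingHeadings') return; // guard
      swap();
    },
  };
}

// Usage: a plain array plays the role of stateChanges.
const seen: MiniNavState[] = [];
const mini = createMiniNavMachine({
  onStateChange: (s) => { seen.push(s); },
  startFetch: () => {},
  closeHeadings: () => false,
  swapContent: () => {},
});
mini.navigate('/skills');
mini.fetchComplete('<h1>Skills</h1>');
// seen is now ['fetching', 'swapping', 'settled']
```

Because every state change funnels through one setState helper, a single recording callback is enough to observe the whole history.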

Testing the closingHeadings Branch

@FeatureTest(SpaNavFeature)
class SpaNavClosingHeadingsTests {
  @Verifies<SpaNavFeature>('closingHeadingsBranch')
  'includes closingHeadings when headings are open'() {
    const { machine, callbacks, stateChanges } = setup();
    vi.mocked(callbacks.closeHeadings).mockReturnValue(true);

    machine.navigate('/skills', '/about', null, '/skills');
    machine.fetchComplete('<h1>Skills</h1>');

    expect(machine.getState()).toBe('closingHeadings');
    expect(stateChanges).toEqual(['fetching', 'closingHeadings']);

    machine.transitionEnd();

    expect(stateChanges).toEqual(['fetching', 'closingHeadings', 'swapping', 'settled']);
  }
}

By mocking closeHeadings to return true, the test forces the machine into the closingHeadings branch. Then it calls transitionEnd() to simulate the CSS animation completing. The test verifies the full transition sequence including the animation state.

Testing Guard Clauses

Guard clauses are the most important thing to test. They ensure the machine rejects invalid transitions:

@FeatureTest(SpaNavFeature)
class SpaNavGuardClauseTests {
  @Verifies<SpaNavFeature>('ignoresFetchCompleteWhenNotFetching')
  'ignores fetchComplete when not fetching'() {
    const { machine, callbacks } = setup();

    // Machine is in 'idle' — fetchComplete should be a no-op
    machine.fetchComplete('<h1>Nope</h1>');

    expect(machine.getState()).toBe('idle');
    expect(callbacks.swapContent).not.toHaveBeenCalled();
  }

  @Verifies<SpaNavFeature>('ignoresTransitionEndWhenNotClosing')
  'ignores transitionEnd when not in closingHeadings'() {
    const { machine, callbacks } = setup();

    machine.navigate('/skills', '/about', null, '/skills');
    expect(machine.getState()).toBe('fetching');

    // transitionEnd should be ignored in 'fetching' state
    machine.transitionEnd();

    expect(machine.getState()).toBe('fetching');
    expect(callbacks.swapContent).not.toHaveBeenCalled();
  }
}

The pattern: call a method that shouldn't work in the current state, then verify that (a) the state didn't change and (b) no callbacks were fired. This is the negative testing that prevents the race conditions from Part I.

Guard Clause Testing for CopyFeedbackMachine

The CopyFeedbackMachine has an explicit transition table. Testing it:

@FeatureTest(CopyFeedbackFeature)
class CopyFeedbackInvalidTransitionTests {
  @Verifies<CopyFeedbackFeature>('noSucceedFromIdle')
  'does not allow succeed() from idle'() {
    const { machine } = setup();
    machine.succeed();
    expect(machine.getState()).toBe('idle');
  }

  @Verifies<CopyFeedbackFeature>('noFailFromIdle')
  'does not allow fail() from idle'() {
    const { machine } = setup();
    machine.fail();
    expect(machine.getState()).toBe('idle');
  }

  @Verifies<CopyFeedbackFeature>('noDoubleSucceed')
  'does not allow copy() → succeed() → succeed()'() {
    const { machine } = setup();
    machine.copy();
    machine.succeed();
    expect(machine.getState()).toBe('success');
    machine.succeed();  // Already in success — should be no-op
    expect(machine.getState()).toBe('success');
  }
}

Each test verifies that one invalid transition is rejected. The three shown here cover idle and success; repeating the pattern for each remaining state exercises the canTransition() guard function from every invalid starting state.
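For orientation, a transition table and its guard might look like the sketch below. The table contents, the 'copying' and 'error' state names, the reset() method, and createCopySketch itself are assumptions for illustration, not the real CopyFeedbackMachine.

```typescript
type CopyState = 'idle' | 'copying' | 'success' | 'error';

// Allowed target states for each source state (assumed, not copied from the real machine).
const TRANSITIONS: Record<CopyState, readonly CopyState[]> = {
  idle: ['copying'],
  copying: ['success', 'error'],
  success: ['idle'],
  error: ['idle'],
};

function canTransition(from: CopyState, to: CopyState): boolean {
  return TRANSITIONS[from].some((allowed) => allowed === to);
}

function createCopySketch() {
  let state: CopyState = 'idle';
  const go = (to: CopyState): void => {
    // Invalid requests are ignored rather than thrown, so callers never need try/catch.
    if (canTransition(state, to)) state = to;
  };
  return {
    getState: (): CopyState => state,
    copy: (): void => go('copying'),
    succeed: (): void => go('success'),
    fail: (): void => go('error'),
    reset: (): void => go('idle'),
  };
}
```

With the rules held in one data structure, each negative test corresponds to exactly one missing entry in the table.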

Guard Clause Testing for PageLoadMachine

The generation counter is a different kind of guard -- it checks identity, not state:

@FeatureTest(PageLoadFeature)
class PageLoadStaleGenerationTests {
  @Verifies<PageLoadFeature>('rejectsStaleGeneration')
  'rejects markRendering with wrong generation'() {
    const { machine, callbacks } = setup();
    const gen1 = machine.startLoad();
    const gen2 = machine.startLoad();  // Bumps generation

    const result = machine.markRendering(gen1);  // Stale!

    expect(result).toBe(false);
    expect(callbacks.onStale).toHaveBeenCalledWith(gen1, gen2);
    expect(machine.getState().state).toBe('loading');  // Unchanged
  }
}

The test starts two loads, then tries to advance the first one. The machine detects the stale generation, calls onStale, and returns false. The state stays in loading (for the second load).
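A generation guard of this kind can be sketched in a few lines. The version below is a hypothetical reduction (createLoadSketch, LoadState, and the shape returned by getState() are assumptions); it only shows how comparing a token against a counter rejects stale work.

```typescript
type LoadState = 'idle' | 'loading' | 'rendering';

interface LoadCallbacks {
  onStale(staleGeneration: number, currentGeneration: number): void;
}

function createLoadSketch(cb: LoadCallbacks) {
  let state: LoadState = 'idle';
  let generation = 0;

  return {
    getState: () => ({ state, generation }),
    // Each load gets a fresh token; older tokens become stale immediately.
    startLoad(): number {
      generation += 1;
      state = 'loading';
      return generation;
    },
    markRendering(token: number): boolean {
      if (token !== generation) {
        cb.onStale(token, generation); // identity check failed: report and do nothing
        return false;
      }
      if (state !== 'loading') return false; // ordinary state guard
      state = 'rendering';
      return true;
    },
  };
}
```

The caller keeps the number returned by startLoad() and passes it back with every later step, so a slow response from an earlier navigation cannot overwrite a newer one.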

Testing Pure Helpers

Pure functions get their own test blocks. No setup() needed:

@FeatureTest(NavigationHelpersFeature)
class ClassifyNavigationTests {
  @Verifies<NavigationHelpersFeature>('hashScroll')
  'returns hashScroll for same page with hash'() {
    expect(classifyNavigation('/about', '/about', '#education')).toBe('hashScroll');
  }

  @Verifies<NavigationHelpersFeature>('toggleHeadings')
  'returns toggleHeadings for same page without hash'() {
    expect(classifyNavigation('/about', '/about', null)).toBe('toggleHeadings');
  }

  @Verifies<NavigationHelpersFeature>('fullNavigationDifferentPage')
  'returns fullNavigation for different page'() {
    expect(classifyNavigation('/skills', '/about', null)).toBe('fullNavigation');
  }

  @Verifies<NavigationHelpersFeature>('fullNavigationWithHash')
  'returns fullNavigation for different page even with hash'() {
    expect(classifyNavigation('/skills', '/about', '#typescript')).toBe('fullNavigation');
  }
}

Four tests. Four inputs. Four expected outputs. No mocks. No setup. This is what pure functions buy you: trivially testable logic.
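One implementation consistent with those four tests is sketched below. The function name classifyNavigationSketch and the parameter names are assumptions; in particular, the tests above do not reveal which of the first two arguments is the target and which is the current path.

```typescript
type NavigationKind = 'hashScroll' | 'toggleHeadings' | 'fullNavigation';

// Sketch only: argument order (target, current, hash) is assumed.
function classifyNavigationSketch(
  targetPath: string,
  currentPath: string,
  hash: string | null,
): NavigationKind {
  // A different page always wins, whether or not a hash is present.
  if (targetPath !== currentPath) return 'fullNavigation';
  // Same page: a hash means scroll, no hash means toggle the headings panel.
  return hash ? 'hashScroll' : 'toggleHeadings';
}
```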

The detectByScroll function from ScrollSpyMachine:

@FeatureTest(ScrollSpyFeature)
class DetectByScrollTests {
  @Verifies<ScrollSpyFeature>('lastHeadingAboveThreshold')
  'returns last heading at or above threshold'() {
    const headings = [
      { id: 'h1', top: 0 },
      { id: 'h2', top: 100 },
      { id: 'h3', top: 200 },
    ];
    expect(detectByScroll(headings, 150)).toBe('h2');
  }

  @Verifies<ScrollSpyFeature>('firstHeadingFallback')
  'returns first heading when none are above threshold'() {
    const headings = [{ id: 'h1', top: 100 }];
    expect(detectByScroll(headings, 50)).toBe('h1');
  }

  @Verifies<ScrollSpyFeature>('nullForEmptyHeadings')
  'returns null for empty headings'() {
    expect(detectByScroll([], 100)).toBeNull();
  }
}

Geometry as plain objects. No DOM. No scroll events. The function receives { id, top }[] and returns a string, or null when there are no headings.
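A function with that behavior could be written as follows. This is a sketch that satisfies the three tests above, not the site's detectByScroll; it assumes the headings arrive sorted by top in ascending order, and the names HeadingPosition and detectByScrollSketch are hypothetical.

```typescript
interface HeadingPosition {
  id: string;
  top: number;
}

// Assumes `headings` is sorted by `top`, ascending.
function detectByScrollSketch(
  headings: readonly HeadingPosition[],
  threshold: number,
): string | null {
  const first = headings[0];
  if (first === undefined) return null; // nothing to highlight

  let active = first.id; // fallback when no heading has reached the threshold yet
  for (const heading of headings) {
    if (heading.top <= threshold) active = heading.id; // keep the last one that qualifies
  }
  return active;
}
```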

Property-Based Testing with fast-check

Example-based tests verify specific inputs. Property-based tests verify invariants -- statements that should hold for all possible inputs.

The property-based.test.ts file uses fast-check to generate random inputs and verify that invariants hold:

import fc from 'fast-check';
import { expect } from 'vitest';
import { FeatureTest, Verifies } from '../../requirements/decorators';
import { SlugFeature } from '../../requirements/features/slug';
import { slugify } from '../../src/lib/helpers';

@FeatureTest(SlugFeature)
class SlugifyInvariantTests {
  @Verifies<SlugFeature>('neverThrows')
  'never throws for any string input'() {
    fc.assert(
      fc.property(fc.string(), (input) => {
        expect(() => slugify(input)).not.toThrow();
      })
    );
  }

  @Verifies<SlugFeature>('alwaysLowercase')
  'always produces lowercase output'() {
    fc.assert(
      fc.property(fc.string(), (input) => {
        const result = slugify(input);
        expect(result).toBe(result.toLowerCase());
      })
    );
  }

  @Verifies<SlugFeature>('noConsecutiveDashes')
  'never produces consecutive dashes'() {
    fc.assert(
      fc.property(fc.string(), (input) => {
        const result = slugify(input);
        expect(result).not.toMatch(/--/);
      })
    );
  }

  @Verifies<SlugFeature>('idempotent')
  'is idempotent'() {
    fc.assert(
      fc.property(fc.string(), (input) => {
        const once = slugify(input);
        const twice = slugify(once);
        expect(twice).toBe(once);
      })
    );
  }
}

fast-check generates 100 random strings per test by default. Each one is checked against the invariant. If any string violates the invariant, fast-check shrinks it to the smallest counterexample and reports it.
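Conceptually, the generate-and-check loop can be sketched by hand as below. This is not fast-check and performs no shrinking; makeRandom, randomPrintableString, and findCounterexample are hypothetical helpers that only illustrate the idea of running one predicate against many generated inputs.

```typescript
// Small linear congruential generator so a run can be reproduced from its seed.
function makeRandom(seed: number): () => number {
  let s = seed % 4294967296;
  return () => {
    s = (s * 1664525 + 1013904223) % 4294967296;
    return s / 4294967296; // value in [0, 1)
  };
}

// Builds a string of 0..maxLength printable ASCII characters (' ' through '~').
function randomPrintableString(rand: () => number, maxLength = 10): string {
  const length = Math.floor(rand() * (maxLength + 1));
  let out = '';
  for (let i = 0; i < length; i++) {
    out += String.fromCharCode(32 + Math.floor(rand() * 95));
  }
  return out;
}

// Returns the first input that violates the predicate, or null if every run passes.
function findCounterexample(
  predicate: (input: string) => boolean,
  numRuns = 100,
  seed = 42,
): string | null {
  const rand = makeRandom(seed);
  for (let i = 0; i < numRuns; i++) {
    const input = randomPrintableString(rand);
    if (!predicate(input)) return input;
  }
  return null;
}

// A property that always holds: trimming never makes a string longer.
const trimNeverGrows = findCounterexample((s) => s.trim().length <= s.length);
```

A real property-testing library adds what this loop lacks: richer generators, shrinking of failing inputs, and replayable failure reports.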

These tests found a real bug: slugify(' ') (all spaces) produced an empty string, which broke URL generation. The fix: return a default slug for empty results.

Other properties tested:

matchScore() returns 0 for empty query (with arbitrary target strings)
parseFrontmatter() returns valid YAML for any input (never throws)
buildHierarchicalSlug() always produces a valid slug even from unicode input

Property-based tests complement example tests. Examples are readable -- they document the expected behavior. Properties are thorough -- they find edge cases humans miss.

More Property Invariants

The property-based test file covers more than just slugify. Here are other invariants tested:

@FeatureTest(SearchFeature)
class MatchScoreInvariantTests {
  @Verifies<SearchFeature>('zeroForEmptyQuery')
  'returns 0 for empty query'() {
    fc.assert(
      fc.property(fc.string(), (target) => {
        expect(matchScore('', target)).toBe(0);
      })
    );
  }

  @Verifies<SearchFeature>('nonNegativeScores')
  'returns non-negative scores'() {
    fc.assert(
      fc.property(fc.string(), fc.string(), (query, target) => {
        expect(matchScore(query, target)).toBeGreaterThanOrEqual(0);
      })
    );
  }

  @Verifies<SearchFeature>('exactMatchHigherThanPartial')
  'scores exact match higher than partial'() {
    fc.assert(
      fc.property(
        fc.string({ minLength: 3 }).filter(s => /[a-z]/i.test(s)),
        (str) => {
          const exact = matchScore(str, str);
          const partial = matchScore(str.slice(0, Math.ceil(str.length / 2)), str);
          expect(exact).toBeGreaterThanOrEqual(partial);
        }
      )
    );
  }
}

@FeatureTest(SlugFeature)
class HierarchicalSlugInvariantTests {
  @Verifies<SlugFeature>('validSlugFormat')
  'produces valid slug format (no triple dashes, no leading/trailing dashes)'() {
    fc.assert(
      fc.property(fc.string(), fc.string(), (parent, child) => {
        const slug = buildHierarchicalSlug(parent, child);
        expect(slug).not.toMatch(/^-/);     // No leading dash
        expect(slug).not.toMatch(/-$/);     // No trailing dash
        expect(slug).not.toMatch(/---/);    // No triple dashes (double is separator)
      })
    );
  }
}

The filter on fc.string() narrows generated strings to those that are meaningful for the test. Without the filter, fast-check would also generate strings with no letters at all (only digits, punctuation, or whitespace) -- valid inputs but not useful for testing "exact match scores higher than partial."

When Property Tests Caught Bugs

Bug 1: slugify(' ') returned ''. The property "never produces empty output for non-empty input" caught this. Empty input ('') producing empty output is fine, but a string of only spaces is non-empty input, and it also produced '', which broke URL generation. The fix: return 'untitled' for empty results after stripping.

Bug 2: buildHierarchicalSlug with unicode parents. The property "no triple dashes" caught that buildHierarchicalSlug('café', 'intro') produced 'caf---intro' because the é was replaced with a dash, leaving a trailing dash on the parent segment that merged with the -- separator. The fix: strip trailing dashes from each segment before joining.

These are exactly the kind of bugs that example-based tests miss because nobody thinks to test slugify(' ') or buildHierarchicalSlug('café', 'intro'). Property-based testing generates inputs that humans don't consider.
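Both fixes can be illustrated with a small sketch. The functions below are hypothetical stand-ins (slugifySketch, buildHierarchicalSlugSketch), not the site's helpers; they assume an ASCII-only slug alphabet, the 'untitled' fallback, and the -- separator mentioned above.

```typescript
function slugifySketch(input: string): string {
  const slug = input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // each run of other characters becomes a single dash
    .replace(/^-+|-+$/g, '');    // strip dashes at both ends
  // Fallback so a whitespace-only or symbol-only title still yields a usable slug.
  return slug === '' ? 'untitled' : slug;
}

function buildHierarchicalSlugSketch(parent: string, child: string): string {
  // Each segment is already trimmed of edge dashes, so the separator cannot grow to '---'.
  return `${slugifySketch(parent)}--${slugifySketch(child)}`;
}
```

Note that this sketch also maps the empty string to 'untitled'; whether the real helper does the same is not stated in the article.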

Coverage Thresholds as Quality Gates

The vitest.config.js enforces coverage thresholds:

coverage: {
  provider: 'v8',
  include: ['src/lib/**/*.ts', 'scripts/build-static.js'],
  thresholds: {
    'src/lib/**/*.ts': {
      statements: 98,
      branches: 95,
      functions: 98,
      lines: 99,
    },
    'scripts/build-static.js': {
      statements: 100,
      branches: 100,
      functions: 100,
      lines: 100,
    },
  },
},

If coverage drops below any configured threshold, npm test fails. This is a ratchet -- the thresholds can go up but never down.
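The gate itself is a simple comparison. The sketch below is not Vitest's implementation; CoverageNumbers and findThresholdFailures are hypothetical names, and the sample figures in the usage are made up to show one failing metric.

```typescript
interface CoverageNumbers {
  statements: number;
  branches: number;
  functions: number;
  lines: number;
}

// Returns one message per metric that falls below its required percentage.
function findThresholdFailures(actual: CoverageNumbers, required: CoverageNumbers): string[] {
  const metrics: Array<keyof CoverageNumbers> = ['statements', 'branches', 'functions', 'lines'];
  return metrics
    .filter((metric) => actual[metric] < required[metric])
    .map((metric) => `${metric}: ${actual[metric]}% is below the required ${required[metric]}%`);
}

// Illustrative numbers only: branches is the single metric under its threshold.
const failures = findThresholdFailures(
  { statements: 98.4, branches: 94.2, functions: 99, lines: 99.5 },
  { statements: 98, branches: 95, functions: 98, lines: 99 },
);
```

A non-empty result is what turns into a failing exit code in a gate like this.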

Why 98/95/98/99?

Statements at 98%: Some lines are truly unreachable in unit tests -- error callbacks that only fire in integration scenarios, TypeScript exhaustiveness default cases that can't actually execute.

Branches at 95%: TypeScript exhaustiveness checks generate unreachable branches in the compiled JavaScript output. V8 sees the generated default: throw new Error("unreachable") branch and reports it as uncovered. The 95% threshold accommodates these synthetic branches.

Functions at 98%: Every exported function must be tested. The 2% slack is for internal utility functions that are tested transitively.

Lines at 99%: Nearly every line must execute during tests. The 1% slack accounts for the same exhaustiveness issues as branches.
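The exhaustiveness default mentioned above typically comes from a pattern like the following. This is a generic sketch: FeedbackState, labelFor, and the labels for 'copying' and 'error' are assumptions, not code from the site.

```typescript
type FeedbackState = 'idle' | 'copying' | 'success' | 'error';

// Accepts only `never`, so the call below compiles only when all cases are handled.
function assertNever(value: never): never {
  throw new Error(`unreachable: ${String(value)}`);
}

function labelFor(state: FeedbackState): string {
  switch (state) {
    case 'idle':
      return '📋';
    case 'copying':
      return '…';
    case 'success':
      return '✅';
    case 'error':
      return '❌';
    default:
      // Adding a new member to FeedbackState without a case makes this line a type error.
      // At runtime it never executes, yet a coverage tool still counts the branch.
      return assertNever(state);
  }
}
```

The default branch is a compile-time safety net, which is why it shows up as permanently uncovered in a runtime coverage report.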

Why 100% for build-static.js?

The build script is critical infrastructure. A bug in the build script can silently corrupt the entire site output. 100% coverage means every code path in the build pipeline is exercised by tests -- including error handling, edge cases in frontmatter parsing, and the minification pipeline.

V8 ignore markers are used sparingly in the build script to exclude JavaScript artifacts that V8 counts as branches but aren't real decision points:

/* v8 ignore next 2 */
if (false) { /* unreachable, but V8 counts it */ }

These are documented and minimal. The goal is 100% real coverage, not 100% metric gaming.

E2E Tests with Playwright

Unit tests verify the state machines. E2E tests verify the wiring layer -- the integration of machines with the DOM, events, and CSS.

Dual Target

The Playwright config supports two test targets:

const servers = {
  dev: { command: 'npx serve . -p 4001', port: 4001, url: 'http://localhost:4001' },
  static: { command: 'npx serve public -p 4000', port: 4000, url: 'http://localhost:4000' },
};

TEST_TARGET=dev: tests against the development server (source files, unminified)
TEST_TARGET=static: tests against the static build (minified, production-like)

The same test suite runs against both targets. This catches bugs where the static build process changes behavior -- minification breaking variable names, CSS purging removing used classes, etc.
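Selecting a target from the environment can be sketched as a small pure function. The names pickServer, ServerConfig, and serversSketch are hypothetical, and falling back to the static build when TEST_TARGET is unset is an assumption; the article does not say which target is the default.

```typescript
interface ServerConfig {
  command: string;
  port: number;
  url: string;
}

type TestTarget = 'dev' | 'static';

const serversSketch: Record<TestTarget, ServerConfig> = {
  dev: { command: 'npx serve . -p 4001', port: 4001, url: 'http://localhost:4001' },
  static: { command: 'npx serve public -p 4000', port: 4000, url: 'http://localhost:4000' },
};

// `target` would normally be the value of the TEST_TARGET environment variable.
function pickServer(target: string | undefined): ServerConfig {
  if (target === undefined || target === '') return serversSketch.static; // assumed default
  if (target === 'dev' || target === 'static') return serversSketch[target];
  // Failing loudly avoids silently testing the wrong build after a typo.
  throw new Error(`Unknown TEST_TARGET "${target}"; expected "dev" or "static"`);
}
```

Keeping the choice in a pure function means the selection logic can be unit tested without starting either server.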

Representative Test: Navigation

test('should navigate between pages', async ({ page }) => {
  await page.goto('/');

  // Click a TOC item
  await page.click('[data-path="content/about.md"]');

  // Verify URL changed
  await expect(page).toHaveURL(/#content\/about\.md/);

  // Verify content loaded
  await expect(page.locator('#content h1')).toContainText('Stéphane');

  // Verify TOC highlight
  await expect(page.locator('[data-path="content/about.md"]'))
    .toHaveClass(/active/);
});

This test exercises the full navigation pipeline: click → SpaNavMachine → fetch → PageLoadMachine → DOM swap → TOC update. It doesn't test the state machines directly (that's what unit tests do). It tests that the wiring layer correctly connects the machines to the DOM.

Representative Test: Copy Button

test('should show success feedback after copy', async ({ page }) => {
  await page.goto('/#content/blog/this-website.md');

  // Find a code block with a copy button
  const copyBtn = page.locator('.copy-btn').first();
  await expect(copyBtn).toBeVisible();

  // Grant clipboard permissions
  await page.context().grantPermissions(['clipboard-write']);

  // Click copy
  await copyBtn.click();

  // Verify success feedback (✅)
  await expect(copyBtn).toContainText('✅');

  // Wait for auto-reset (2 seconds)
  await expect(copyBtn).toContainText('📋', { timeout: 3000 });
});

This test verifies the full CopyFeedbackMachine lifecycle through the DOM: click → copying → success → auto-reset. The timeout: 3000 accommodates the 2-second reset delay plus assertion overhead.

Test Configuration

export default defineConfig({
  workers: 4,
  fullyParallel: true,
  retries: 2,
  timeout: 15000,
  use: {
    headless: true,
    screenshot: 'only-on-failure',
  },
  projects: [
    { name: 'chromium', use: { browserName: 'chromium' } },
  ],
});

4 parallel workers. Chromium only (this is a portfolio site, not a cross-browser library). 2 retries for flake resilience. Screenshots on failure for debugging.

Representative Test: Scroll Spy

test('should highlight active TOC item on scroll', async ({ page }) => {
  await page.goto('/#content/blog/this-website.md');

  // Scroll down past the first heading
  await page.evaluate(() => {
    const content = document.getElementById('content');
    content?.scrollTo({ top: 800 });
  });

  // Wait for scroll spy to update
  await page.waitForTimeout(300);

  // Verify that a TOC item has the active class
  const activeItems = page.locator('.toc-item.active');
  await expect(activeItems).toHaveCount(1);

  // Scroll further
  await page.evaluate(() => {
    const content = document.getElementById('content');
    content?.scrollTo({ top: 2000 });
  });

  await page.waitForTimeout(300);

  // Verify exactly one item is active after the second scroll
  const newActive = page.locator('.toc-item.active');
  await expect(newActive).toHaveCount(1);
});

This test exercises the ScrollSpyMachine through the DOM. It doesn't test the machine's detectByScroll() function directly -- that's what unit tests do. It tests that the wiring layer correctly reads scroll positions, calls the machine, and applies the CSS class.

Representative Test: Keyboard Navigation

test('should toggle help modal with ? key', async ({ page }) => {
  await page.goto('/');

  // Press ? to open help
  await page.keyboard.press('Shift+/');  // ? is Shift+/

  // Verify help modal is visible
  const helpModal = page.locator('.help-modal');
  await expect(helpModal).toBeVisible();

  // Press Escape to close
  await page.keyboard.press('Escape');
  await expect(helpModal).not.toBeVisible();
});

This exercises the KeyboardNavState's handleKey() and handleEscape() through the real DOM. The unit test verifies that handleKey('?', { ctrl: false, alt: false }, new Set()) returns { type: 'toggleHelp' }. The E2E test verifies that the wiring layer actually opens and closes the modal.

Test Results and Reporting

Both unit and E2E tests produce structured output:

// vitest.config.js
reporters: ['default', 'html'],
outputFile: {
  html: `${runDir}/unit/index.html`,
},

// playwright.config.js
reporter: [
  ['list'],
  ['html', { outputFolder: `${runDir}/report`, open: 'never' }],
  ['json', { outputFile: `${runDir}/results.json` }],
],

Each test run gets a timestamped directory under test-results/. The HTML reports are browseable; the JSON results are machine-readable. The compliance scanner consumes the JSON to verify requirement coverage.

Requirements Tracking with Decorators

The test suite uses custom TypeScript decorators to link tests to feature requirements:

import { FeatureTest, Verifies } from '../../requirements/decorators';
import { CopyButtons } from '../../requirements/features/copy-buttons';

@FeatureTest(CopyButtons)
class CopyButtonTests {
  @Verifies<CopyButtons>('showsClipboardEmojiInIdle')
  'shows clipboard emoji in idle state'() {
    const machine = createCopyFeedback({ resetDelay: 2000 });
    expect(getLabel(machine.getState())).toBe('📋');
  }

  @Verifies<CopyButtons>('showsCheckmarkOnSuccess')
  'shows checkmark on success'() {
    const machine = createCopyFeedback({ resetDelay: 2000 });
    machine.copy();
    machine.succeed();
    expect(getLabel(machine.getState())).toBe('✅');
  }
}

The @FeatureTest(CopyButtons) decorator marks the class as a test suite for the CopyButtons feature. The @Verifies<CopyButtons>('acName') decorator links each test method to a specific acceptance criterion.

A compliance scanner (scripts/compliance-report.ts) reads the decorator metadata and generates a report:

Feature: CopyButtons
  ✓ shows clipboard emoji in idle state → CopyButtonTests.testIdleLabel
  ✓ shows checkmark on success → CopyButtonTests.testSuccessLabel
  ✓ shows X on error → CopyButtonTests.testErrorLabel
  ✓ auto-resets after delay → CopyButtonTests.testAutoReset
  ✗ handles rapid double-click → (no test found)

Coverage: 4/5 acceptance criteria (80%)

This is the traceability system described in the Typed Specs series. It ensures that every acceptance criterion has at least one test, and that orphan tests (tests that don't map to any requirement) are flagged for review.
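The core of such a report is a set comparison between declared criteria and the criteria that tests claim to verify. The sketch below works on plain data and does not read decorators; buildComplianceReport, Verification, and most of the criterion names in the usage are hypothetical.

```typescript
interface Verification {
  criterion: string; // acceptance criterion the test claims to verify
  test: string;      // test identifier, e.g. "Class.method"
}

interface ComplianceReport {
  covered: string[];
  missing: string[];
  orphans: string[]; // tests pointing at criteria that are not declared
  percent: number;
}

function buildComplianceReport(
  criteria: readonly string[],
  verifications: readonly Verification[],
): ComplianceReport {
  const isVerified = (c: string): boolean => verifications.some((v) => v.criterion === c);
  const covered = criteria.filter((c) => isVerified(c));
  const missing = criteria.filter((c) => !isVerified(c));
  const orphans = verifications
    .filter((v) => !criteria.some((c) => c === v.criterion))
    .map((v) => v.test);
  const percent =
    criteria.length === 0 ? 100 : Math.round((covered.length / criteria.length) * 100);
  return { covered, missing, orphans, percent };
}

// Usage with five criteria, four of which have a linked test.
const report = buildComplianceReport(
  ['showsClipboardEmojiInIdle', 'showsCheckmarkOnSuccess', 'showsXOnError', 'autoResets', 'handlesRapidDoubleClick'],
  [
    { criterion: 'showsClipboardEmojiInIdle', test: 'CopyButtonTests.idle' },
    { criterion: 'showsCheckmarkOnSuccess', test: 'CopyButtonTests.success' },
    { criterion: 'showsXOnError', test: 'CopyButtonTests.error' },
    { criterion: 'autoResets', test: 'CopyButtonTests.autoReset' },
  ],
);
// report.percent is 80 and report.missing is ['handlesRapidDoubleClick']
```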

What Testing Pure State Machines Enables

Speed

The entire unit test suite runs in ~600ms. No browser startup. No DOM initialization. No network requests. Each test creates a machine, calls methods, and asserts state. The V8 engine runs thousands of state transitions per second.

Compare this to E2E tests: ~30 seconds for 10 spec files. Playwright needs to launch a browser, navigate to pages, wait for animations, and verify DOM state. E2E tests are necessary but slow. Unit tests are fast and should catch most bugs.

No Flaky Tests

The state machines have zero flaky tests because they have zero dependencies on timing, network, or DOM rendering. A machine transition is synchronous. vi.fn() records calls synchronously. The test asserts synchronously. There's nothing to wait for, nothing to race, nothing to retry.

The E2E tests have occasional flakes (animation timing, network latency) -- that's why they have 2 retries. But the unit tests have 0 retries. They either pass or fail, deterministically, every time.

Regression Prevention

The coverage thresholds enforce that new machines get tested before merge. If someone adds a new state machine to src/lib/ without writing tests, the coverage for src/lib/**/*.ts drops below 98% and the build fails.

This is a structural enforcement, not a process enforcement. It doesn't require code review to catch missing tests. It doesn't require a CI bot to comment. The build breaks. The developer writes tests. The build passes.

Design Feedback

If a state machine is hard to test, it's probably poorly designed. Testing difficulty is a signal:

Too many states? → Decompose into smaller machines (like the TOC cluster in Part IV).
Can't inject a dependency? → Add it to the callbacks interface.
Need DOM to trigger a transition? → Extract the logic into a pure function.

The test suite is the first consumer of the state machine API. If the test is awkward, the API is awkward. Fix the API, not the test.

The Testing Strategy Summarized

Layer	Tool	What it tests	Speed	Flakiness
Unit	Vitest	State transitions, guard clauses, pure helpers	~600ms	Zero
Property	fast-check	Invariants across random inputs	~200ms	Zero
E2E	Playwright	Wiring layer + DOM integration	~30s	Low (2 retries)
Compliance	Custom scanner	Test ↔ requirement mapping	~100ms	Zero

TypeScript catches type errors at compile time. Unit tests catch logic errors at test time. Property tests catch edge cases at test time. E2E tests catch integration errors at test time. The compliance scanner catches coverage gaps at build time.

Each layer has a different cost/benefit ratio. TypeScript is free (no runtime cost). Unit tests are cheap (600ms). Property tests are cheap (200ms). E2E tests are expensive (~30s). The strategy allocates most testing to the cheapest layers and reserves the expensive layer for what the cheap layers can't test: real browser integration.

The Test Commands

The full test suite runs through npm scripts:

# Unit tests only (fast)
npm test                    # vitest run — ~600ms

# E2E against dev server
npm run test:e2e:dev        # playwright with TEST_TARGET=dev

# E2E against static build
npm run test:e2e:static     # playwright with TEST_TARGET=static

# Compliance report
npm run test:compliance     # scans @FeatureTest/@Verifies decorators

# Everything
npm run test:smoke          # unit + e2e combined
npm run test:all            # smoke + compliance + visual + a11y + perf

The CI pipeline runs test:all. Local development typically runs npm test (unit only) on every save and test:e2e:dev before committing.

Lessons Learned

1. Test the Machine, Not the Framework

The state machines are tested in isolation. The tests don't import DOM assertion matchers. They don't use jsdom. They don't mock window or document. They create a machine, call methods, and check state.

This is only possible because the machines have zero DOM dependencies. If the machines read document.getElementById() or called element.scrollIntoView(), every test would need DOM fixtures. The callback injection pattern (Parts II-IV) is what makes the testing strategy possible.

2. Coverage Is a Ratchet, Not a Target

The thresholds started at whatever the first test suite achieved (around 85%). Each time we added tests, we raised the threshold. The current 98/95/98/99 represents the high-water mark. It can go up but never down.

This prevents "coverage regression" where someone adds untested code and the percentage slowly declines. The threshold is enforced by the build, not by code review.

3. Property Tests Are Cheap Insurance

The property-based.test.ts file is one file with ~400 lines. It runs in ~200ms. It has found 2 real bugs that example-based tests missed (whitespace-only input in slugify, unicode handling in buildHierarchicalSlug). The cost/benefit ratio is excellent.

The key is choosing the right invariants. "Never throws" is always a good invariant. "Idempotent" is good for normalizing functions. "Output length ≤ input length" is good for string transformers that only remove characters. Start with obvious invariants and add more as you discover edge cases.

4. E2E Tests Are Integration Tests

E2E tests don't verify state machine logic -- unit tests do that. E2E tests verify that:

The wiring layer connects machines to the DOM correctly
CSS animations trigger transitionend events that the wiring layer forwards to machines
User events (clicks, scrolls, key presses) reach the correct machine methods
Multiple machines compose correctly through the wiring layer

This is a different testing responsibility. E2E tests are slower, flakier, and harder to debug. But they catch a class of bugs that unit tests can't: integration errors where the glue code is wrong.

5. The Test Suite Is Documentation

Reading the test file for a machine is often the fastest way to understand it. The test names describe the behavior:

✓ should start in idle state
✓ should transition to fetching on navigate
✓ should skip closingHeadings when no headings are open
✓ should wait for transitionEnd in closingHeadings
✓ should abort back to idle when fetch fails
✓ should ignore fetchComplete when not in fetching state
✓ should ignore transitionEnd when not in closingHeadings state

Each test is a sentence. Together, they form a specification. When the specification changes, the tests change first, then the implementation follows.

Series Conclusion

This series walked through 15 state machines, from a 47-line font size manager to a 163-line scroll orchestrator. The arc was deliberate: simple → complex, always the same pattern.

The core idea is not new. Finite state machines predate most of us. What's worth sharing is the specific implementation pattern -- factory functions, closure state, callback injection -- and the infrastructure that makes it work: TypeScript for type safety, esbuild for browser bundling, Vitest for fast tests, and Playwright for integration verification.

The site you're reading right now runs on these 15 machines. Every link click, every scroll, every tooltip, every copy button, every keyboard shortcut -- all driven by pure functions with no DOM dependencies, tested at 98%+ coverage, executing in under a second.

No framework. No xstate. Just closures, callbacks, and guard clauses.

Appendix: All Test Files

For reference, here is the complete list of test files that cover the state machines and related infrastructure:

Unit Tests (test/unit/)

File	Lines	Coverage target
`scroll-spy-machine.test.ts`	310	`src/lib/scroll-spy-machine.ts`
`copy-feedback-state.test.ts`	231	`src/lib/copy-feedback-state.ts`
`spa-nav-state.test.ts`	366	`src/lib/spa-nav-state.ts`
`headings-panel-machine.test.ts`	280	`src/lib/headings-panel-machine.ts`
`keyboard-nav-state.test.ts`	195	`src/lib/keyboard-nav-state.ts`
`property-based.test.ts`	400	Cross-cutting invariants
`build-static.test.ts`	890	`scripts/build-static.js`
`build-static-io.test.ts`	1,140	`scripts/build-static.js` (I/O)
`compliance.test.ts`	180	Requirements scanning
`frontmatter.test.ts`	220	Frontmatter parsing
`link-prefetch.test.ts`	150	Link prefetching

E2E Tests (test/e2e/)

File	Coverage target
`navigation.spec.ts`	SpaNavMachine wiring
`scroll-spy.spec.ts`	ScrollSpyMachine wiring
`copy-buttons.spec.ts`	CopyFeedbackMachine wiring
`font-size.spec.ts`	FontSizeManager wiring
`keyboard-nav.spec.ts`	KeyboardNavState wiring
`theme.spec.ts`	AccentPaletteMachine wiring
`overlays.spec.ts`	Modal priority (Escape chain)
`mobile.spec.ts`	Responsive sidebar behavior
`search.spec.ts`	Search interaction
`hire-modal.spec.ts`	Hire modal flow

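Each machine test file covers one machine; the remaining unit test files cover the build script, frontmatter parsing, link prefetching, compliance scanning, and cross-cutting invariants. Each E2E test file covers one user-facing behavior that exercises one or more machines through the DOM. Together, they form a complete verification of the architecture described in Parts I-IV.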
