Part III: Multi-Layer Quality
644 tests across six layers. Every one passes. But which ones verify the search feature? The testing phase built confidence in the code — and exposed a gap in the process.
The Testing Architecture
After the SPA and static generation phases, the site had features worth specifying and output worth testing. The next step was building a comprehensive test suite — not just unit tests, but tests at every level where bugs can hide.
Six layers, each catching different classes of bugs.
Layer 1: Unit Tests (Vitest)
Pure logic extracted into importable modules — slugify, frontmatter parsing, search scoring, mermaid config. 235 tests with enforced 100% coverage gates on every module:
```js
// vitest.config.js — coverage thresholds that only ratchet up
coverage: {
  include: ['js/lib/**'],
  thresholds: {
    branches: 100,
    functions: 100,
    lines: 100,
    statements: 100,
  },
},
```

Unit tests run in under 2 seconds. They catch logic errors immediately.
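As a sketch of the kind of pure, importable logic these modules contain — this frontmatter splitter is a hypothetical stand-in, not the site's actual `js/lib/` implementation:

```js
// Hypothetical frontmatter splitter — a stand-in for the kind of pure,
// dependency-free function that lives in js/lib/ and is easy to unit test.
function splitFrontmatter(text) {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text);
  if (!match) return { frontmatter: null, body: text };
  return { frontmatter: match[1], body: match[2] };
}

const doc = '---\ntitle: Hello\n---\n# Heading\n';
const { frontmatter, body } = splitFrontmatter(doc);
console.log(frontmatter); // "title: Hello"
console.log(body);        // "# Heading\n"
```

Because the function takes a string and returns a plain object, a test needs no DOM, no server, and no mocking — which is what keeps the whole layer under 2 seconds.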
Layer 2: Property-Based Tests (fast-check)
Hand-written tests check known cases. Property-based tests generate thousands of random inputs and verify invariants hold:
- Slugify always produces URL-safe strings
- Frontmatter parser handles any valid YAML
- Search scoring is monotonically ordered (exact > prefix > word > fuzzy)
- HTML escaping is idempotent
fast-check found a real bug: an orphaned h3 branch in the slug hierarchy that no hand-written test covered. That single discovery pushed branch coverage from 75% to 100%.
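A hand-rolled sketch of the idea behind the first invariant — fast-check adds smarter generators and automatic shrinking of failing cases, and the slugify here is a hypothetical stand-in, not the site's module:

```js
// Hypothetical slugify — stands in for the real js/lib/ module.
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // anything not URL-safe becomes a dash
    .replace(/^-+|-+$/g, '');    // strip leading/trailing dashes
}

// Generate arbitrary (including non-ASCII) strings.
function randomString(len) {
  let s = '';
  for (let i = 0; i < len; i++) {
    s += String.fromCharCode(Math.floor(Math.random() * 0xffff));
  }
  return s;
}

// Invariant: slugify always produces a URL-safe string.
for (let i = 0; i < 1000; i++) {
  const out = slugify(randomString(Math.floor(Math.random() * 40)));
  if (!/^[a-z0-9-]*$/.test(out)) {
    throw new Error(`invariant violated: ${JSON.stringify(out)}`);
  }
}
console.log('1000 random inputs: slug invariant holds');
```

The property states what must be true of *every* output, so inputs no human would think to type — control characters, emoji, lone surrogates — get thrown at the code for free.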
Layer 3: E2E Tests (Playwright)
54 tests running in real browsers with 6 parallel workers. These catch interaction bugs that unit tests can't:
- Clicking a TOC item loads the page with a fade transition
- The back button restores the previous page state
- Keyboard shortcuts work (`?` for help, `Ctrl+K` for search)
- The hire modal opens, validates, and submits
E2E tests run against both the dev server (runtime rendering) and the static build (pre-rendered HTML). Same tests, two targets — catching discrepancies between the rendering paths.
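One way to express the two-target setup is Playwright projects — a sketch assuming illustrative names and ports, not the repository's actual config:

```js
// playwright.config.js — one suite, two targets (names/ports are illustrative)
export default {
  projects: [
    // Runtime rendering: the dev server builds pages in the browser.
    { name: 'dev-server',   use: { baseURL: 'http://localhost:5173' } },
    // Pre-rendered HTML: a static file server over the build output.
    { name: 'static-build', use: { baseURL: 'http://localhost:4173' } },
  ],
};
```

Every test that navigates via `baseURL` then runs twice, and a behavior that only holds on one rendering path shows up as a failure in exactly one project's column.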
Layer 4: Visual Regression (Playwright Screenshots)
229 full-page stitched screenshots across:
- Desktop viewport
- 4 mobile devices (iPhone SE, iPhone 14, Pixel 5, iPad Mini)
- 4 themes (dark, light, high-contrast dark, high-contrast light)
Any pixel drift from the baseline triggers a failure. This catches CSS regressions that no assertion can describe — the kind of bugs where "it looks wrong" is the only specification.
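Mechanically, the gate reduces to a pixel-diff ratio check — a minimal sketch of the idea, not Playwright's actual comparator (which also tolerates anti-aliasing artifacts):

```js
// Fraction of RGBA pixels that differ between two same-sized buffers.
function diffRatio(baseline, candidate) {
  if (baseline.length !== candidate.length) return 1; // size change: fail
  let changed = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    // Compare one RGBA pixel at a time.
    if (baseline[i]     !== candidate[i]     ||
        baseline[i + 1] !== candidate[i + 1] ||
        baseline[i + 2] !== candidate[i + 2] ||
        baseline[i + 3] !== candidate[i + 3]) changed++;
  }
  return changed / (baseline.length / 4);
}

const base = new Uint8ClampedArray(400);       // 100 blank pixels
const next = Uint8ClampedArray.from(base);
next[0] = 255;                                 // one drifted pixel in 100
console.log(diffRatio(base, next));            // 0.01
console.log(diffRatio(base, next) <= 0.01 ? 'pass' : 'fail'); // "pass"
```

A 0.01 ratio means one changed pixel per hundred sits exactly at the boundary; anything beyond that fails the baseline.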
Layer 5: Accessibility (axe-core + Contrast Matrix)
121 tests including:
- axe-core scanner on every page (catches WCAG violations)
- Contrast matrix: 8 accent colors x 3 modes x sample pages = 72+ assertions
- ARIA role verification on interactive elements
- Skip-to-content link presence
The contrast matrix is particularly interesting — it's a combinatorial explosion that would be impossible to test by hand. Each of the 8 curated accent palettes must pass WCAG AA contrast against both dark and light backgrounds, plus high-contrast mode.
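The check behind each cell of that matrix is the standard WCAG relative-luminance formula — sketched here with illustrative colors, not the site's actual palettes:

```js
// WCAG 2.x relative luminance from an [r, g, b] triple (0–255 channels).
function luminance([r, g, b]) {
  const [R, G, B] = [r, g, b].map((c) => {
    c /= 255;
    return c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4;
  });
  return 0.2126 * R + 0.7152 * G + 0.0722 * B;
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging 1:1 to 21:1.
function contrast(fg, bg) {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// Every accent must clear WCAG AA for normal text (4.5:1) on every background.
const accents = [[255, 140, 0], [80, 200, 255]];     // illustrative accents
const backgrounds = [[18, 18, 18], [250, 250, 250]]; // dark, light
for (const accent of accents) {
  for (const bg of backgrounds) {
    const ratio = contrast(accent, bg);
    console.log(`${accent} on ${bg}: ${ratio.toFixed(2)}:1`,
                ratio >= 4.5 ? 'AA pass' : 'AA fail');
  }
}
```

Nesting the loops over accents, modes, and pages is exactly what generates the 72+ assertions — the code enumerates the combinatorial space that a human reviewer cannot.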
Layer 6: Performance
5 tests measuring:
- Page load time (first contentful paint)
- SPA navigation speed (page-to-page transition)
- Scroll spy latency (heading tracking responsiveness)
These prevent speed regressions as features are added.
Coverage Gates as Ratchets
Every coverage metric is a ratchet — it can only tighten, never loosen:
- Unit test line coverage: 100% (enforced)
- Unit test branch coverage: 100% (enforced)
- Visual regression baseline: pixel-perfect (0.01 threshold)
- Accessibility: zero critical axe violations (enforced)
If you add code, you add tests. If you break a visual baseline, you approve the new one explicitly. The gates prevent regression.
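The ratchet idea can be sketched as a check that compares fresh numbers against a stored floor and only ever raises it — illustrative, not the project's actual tooling:

```js
// Fail if any metric drops below its recorded floor; otherwise tighten
// the floor to the new value. Floors move in one direction only.
function ratchet(baseline, current) {
  const updated = { ...baseline };
  for (const [metric, floor] of Object.entries(baseline)) {
    const now = current[metric];
    if (now < floor) {
      throw new Error(`${metric} dropped: ${now}% < ratchet floor ${floor}%`);
    }
    updated[metric] = Math.max(floor, now); // tighten, never loosen
  }
  return updated;
}

const floors = { branches: 100, lines: 100 };
console.log(ratchet(floors, { branches: 100, lines: 100 }));
```

Once a metric reaches 100%, the only way to merge code is to keep it there — the gate encodes the policy so no reviewer has to.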
The Gap That Remained
644 tests. All passing. Six layers. Coverage gates enforced. By any conventional measure, this website is well-tested.
But one question remained unanswerable: "Is the search feature fully tested?"
The tests were organized by technical concern:
```
test/
├── unit/    ← logic tests
├── e2e/     ← interaction tests
├── a11y/    ← accessibility tests
├── visual/  ← screenshot tests
└── perf/    ← speed tests
```

Not by feature:
```
test/
├── search/         ← ??? doesn't exist
├── navigation/     ← ??? doesn't exist
├── accessibility/  ← ??? doesn't exist
└── ...
```

The search feature had tests scattered across unit/ (search scoring), e2e/ (search interaction), and a11y/ (search accessibility). But there was no single place that said "search has 5 acceptance criteria, and here are the tests for each one."
The link between features and tests was a markdown table in a blog post. Human-maintained. Already drifting.
What Was Needed
Not a reorganization of test directories — that would break the layered architecture. What was needed was a cross-cutting index that linked tests to features regardless of which directory they lived in.
That's what typed specifications provide. Not a replacement for test organization, but a layer on top that answers the feature question.
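As a sketch of what such an index looks like — plain data here rather than the typed specifications of Part IV, with hypothetical criteria and file names:

```js
// Illustrative cross-cutting index: each acceptance criterion lists the
// tests that verify it, regardless of which layer's directory they live in.
// Criteria and file names are hypothetical.
const searchFeature = {
  feature: 'search',
  criteria: [
    {
      criterion: 'results rank exact > prefix > word > fuzzy',
      tests: [{ layer: 'unit', file: 'test/unit/search-scoring.test.js' }],
    },
    {
      criterion: 'Ctrl+K opens the search overlay',
      tests: [{ layer: 'e2e', file: 'test/e2e/search.spec.js' }],
    },
    {
      criterion: 'overlay is operable with a screen reader',
      tests: [{ layer: 'a11y', file: 'test/a11y/search.spec.js' }],
    },
  ],
};

// "Is search fully tested?" becomes a query, not a guess:
const gaps = searchFeature.criteria.filter((c) => c.tests.length === 0);
console.log(gaps.length === 0 ? 'every criterion has tests' : 'gaps found');
```

The test directories stay layered; the index cuts across them, so an uncovered criterion is machine-detectable instead of living in a drifting markdown table.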
Previous: Part II: Building the Foundation
Next: Part IV: Features as Abstract Classes — the type system becomes the specification language.