Part III: Multi-Layer Quality

644 tests across six layers. Every one passes. But which ones verify the search feature? The testing phase built confidence in the code — and exposed a gap in the process.

The Testing Architecture

After the SPA and static generation phases, the site had features worth specifying and output worth testing. The next step was building a comprehensive test suite — not just unit tests, but tests at every level where bugs can hide.

[Diagram: six layers, each catching different classes of bugs.]

Layer 1: Unit Tests (Vitest)

Pure logic extracted into importable modules — slugify, frontmatter parsing, search scoring, mermaid config. 235 tests with enforced 100% coverage gates on every module:

// vitest.config.js — coverage thresholds that only ratchet up
coverage: {
  include: ['js/lib/**'],
  thresholds: {
    branches: 100,
    functions: 100,
    lines: 100,
    statements: 100,
  }
}

Unit tests run in under 2 seconds. They catch logic errors immediately.
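As an illustration of the kind of pure module those gates apply to, here is a minimal slugify sketch; the rules (lowercase, collapse non-alphanumerics to hyphens, trim) are assumptions, not the site's actual implementation:

```javascript
// Hypothetical slugify: the kind of small, pure, importable module the
// 100% coverage gates apply to. Rules here are assumptions, not the real code.
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, '');    // strip leading/trailing hyphens
}

slugify('Hello, World!'); // 'hello-world'
```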

Layer 2: Property-Based Tests (fast-check)

Hand-written tests check known cases. Property-based tests generate thousands of random inputs and verify invariants hold:

  • Slugify always produces URL-safe strings
  • Frontmatter parser handles any valid YAML
  • Search scoring is monotonically ordered (exact > prefix > word > fuzzy)
  • HTML escaping is idempotent

fast-check found a real bug: an orphaned h3 branch in the slug hierarchy that no hand-written test covered. That single discovery pushed branch coverage from 75% to 100%.
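The idempotence property can be sketched without fast-check, using a hand-rolled generator in place of fc.string() and a hypothetical escapeHtml that leaves existing entities alone:

```javascript
// Hypothetical escapeHtml designed to be idempotent: '&' is left untouched
// when it already starts a known entity, so escaping twice changes nothing.
function escapeHtml(s) {
  return s
    .replace(/&(?!(?:amp|lt|gt|quot|#39);)/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Poor man's property check over a hostile alphabet. fast-check would
// shrink a failing input to a minimal counterexample; this just reports it.
const alphabet = 'ab<>&"\'';
function randomString(len) {
  let out = '';
  for (let i = 0; i < len; i++) {
    out += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return out;
}

for (let i = 0; i < 1000; i++) {
  const s = randomString(20);
  const once = escapeHtml(s);
  if (escapeHtml(once) !== once) {
    throw new Error(`escapeHtml not idempotent for: ${s}`);
  }
}
```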

Layer 3: E2E Tests (Playwright)

54 tests running in real browsers with 6 parallel workers. These catch interaction bugs that unit tests can't:

  • Clicking a TOC item loads the page with a fade transition
  • The back button restores the previous page state
  • Keyboard shortcuts work (? for help, Ctrl+K for search)
  • The hire modal opens, validates, and submits

E2E tests run against both the dev server (runtime rendering) and the static build (pre-rendered HTML). Same tests, two targets — catching discrepancies between the rendering paths.
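One way to express the two-target setup, in the spirit of Playwright's projects option, is a single list the suite iterates; the names and ports here are assumptions, not the real config:

```javascript
// Hypothetical two-target list in the spirit of Playwright `projects`:
// the same specs run once per entry. Ports are assumptions.
const targets = [
  { name: 'dev',    baseURL: 'http://localhost:5173' }, // runtime rendering
  { name: 'static', baseURL: 'http://localhost:4173' }, // pre-rendered build
];

// Every spec resolves URLs against the active target's baseURL, so a
// discrepancy between the two rendering paths fails in exactly one project.
```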

Layer 4: Visual Regression (Playwright Screenshots)

229 full-page stitched screenshots across:

  • Desktop viewport
  • 4 mobile devices (iPhone SE, iPhone 14, Pixel 5, iPad Mini)
  • 4 themes (dark, light, high-contrast dark, high-contrast light)

Any pixel drift from the baseline triggers a failure. This catches CSS regressions that no assertion can describe — the kind of bugs where "it looks wrong" is the only specification.
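The matrix itself is just a cross product. A sketch of the enumeration, using the viewport and theme lists above (the per-page pairing is an assumption; across all pages the real suite totals 229 shots):

```javascript
// Cross product of viewports and themes: every page gets one stitched
// screenshot per combination. Lists come from the post; pairing is a sketch.
const viewports = ['desktop', 'iphone-se', 'iphone-14', 'pixel-5', 'ipad-mini'];
const themes = ['dark', 'light', 'hc-dark', 'hc-light'];

const shots = viewports.flatMap((v) => themes.map((t) => `${v}/${t}`));
shots.length; // 20 combinations per page
```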

Layer 5: Accessibility (axe-core + Contrast Matrix)

121 tests including:

  • axe-core scanner on every page (catches WCAG violations)
  • Contrast matrix: 8 accent colors x 3 modes x sample pages = 72+ assertions
  • ARIA role verification on interactive elements
  • Skip-to-content link presence

The contrast matrix is particularly interesting — it's a combinatorial explosion that would be impossible to test by hand. Each of the 8 curated accent palettes must pass WCAG AA contrast against both dark and light backgrounds, plus high-contrast mode.
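Every cell in that matrix reduces to the WCAG 2.x contrast formula. A self-contained sketch of the check each accent/background pair must pass (AA requires at least 4.5:1 for body text):

```javascript
// WCAG 2.x relative luminance and contrast ratio: the math behind each
// cell of the contrast matrix. sRGB channel values are 0-255.
function luminance([r, g, b]) {
  const chan = (c) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * chan(r) + 0.7152 * chan(g) + 0.0722 * chan(b);
}

function contrastRatio(fg, bg) {
  const [hi, lo] = [luminance(fg), luminance(bg)].sort((a, b) => b - a);
  return (hi + 0.05) / (lo + 0.05);
}

// Black on white is the maximum possible ratio, 21:1.
contrastRatio([0, 0, 0], [255, 255, 255]); // 21
```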

Layer 6: Performance

5 tests measuring:

  • Page load time (first contentful paint)
  • SPA navigation speed (page-to-page transition)
  • Scroll spy latency (heading tracking responsiveness)

These prevent speed regressions as features are added.
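In spirit, each of these is a budget assertion: measure an operation, compare against a threshold that only tightens. A minimal sketch with invented budget values (the real tests measure browser metrics via Playwright):

```javascript
// Hypothetical budget check: time an operation and compare against a
// millisecond budget. The real suite measures browser metrics (FCP,
// navigation, scroll-spy latency); the budget value here is invented.
function withinBudget(fn, budgetMs) {
  const start = Date.now();
  fn();
  return Date.now() - start <= budgetMs;
}

withinBudget(() => { /* e.g. render a page */ }, 100);
```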

Coverage Gates as Ratchets

Every coverage metric is a ratchet — it can only tighten, never loosen:

  • Unit test line coverage: 100% (enforced)
  • Unit test branch coverage: 100% (enforced)
  • Visual regression baseline: match within a 0.01 diff threshold (enforced)
  • Accessibility: zero critical axe violations (enforced)

If you add code, you add tests. If you break a visual baseline, you approve the new one explicitly. The gates prevent regression.

The Gap That Remained

644 tests. All passing. Six layers. Coverage gates enforced. By any conventional measure, this website is well-tested.

But one question remained unanswerable: "Is the search feature fully tested?"

The tests were organized by technical concern:

test/
├── unit/           ← logic tests
├── e2e/            ← interaction tests
├── a11y/           ← accessibility tests
├── visual/         ← screenshot tests
└── perf/           ← speed tests

Not by feature:

test/
├── search/         ← ??? doesn't exist
├── navigation/     ← ??? doesn't exist
├── accessibility/  ← ??? doesn't exist
└── ...

The search feature had tests scattered across unit/ (search scoring), e2e/ (search interaction), and a11y/ (search accessibility). But there was no single place that said "search has 5 acceptance criteria, and here are the tests for each one."

The link between features and tests was a markdown table in a blog post. Human-maintained. Already drifting.

What Was Needed

Not a reorganization of test directories — that would break the layered architecture. What was needed was a cross-cutting index that linked tests to features regardless of which directory they lived in.

That's what typed specifications provide. Not a replacement for test organization, but a layer on top that answers the feature question.
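A hypothetical shape for that index: one record per feature, with each acceptance criterion pointing at its tests wherever they live. The names and paths below are illustrative only, not the site's real spec:

```javascript
// Hypothetical cross-cutting index: criteria per feature, each linked to
// the tests that verify it, regardless of which layer/directory they live in.
const searchSpec = {
  feature: 'search',
  criteria: [
    { id: 'scoring-order',     tests: ['unit/search-scoring.test.js'] },
    { id: 'ctrl-k-opens',      tests: ['e2e/search.spec.js'] },
    { id: 'results-announced', tests: ['a11y/search.spec.js'] },
  ],
};

// "Is search fully tested?" becomes mechanical:
// does every criterion list at least one test?
const covered = searchSpec.criteria.every((c) => c.tests.length > 0);
```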


Previous: Part II: Building the Foundation
Next: Part IV: Features as Abstract Classes — the type system becomes the specification language.