Part V: Directory Conventions and Wiki Matrices
Two approaches sit at the bottom of the automation spectrum: organizing tests by folder name and maintaining a traceability table by hand. Both are familiar. Both require zero tooling. Both drift the moment someone skips a step.
Directory-Based Test Organization
What It Is
The simplest structural approach: test files live in directories that mirror the feature structure.
test/
├── navigation/
│   ├── toc-click.spec.ts
│   ├── back-button.spec.ts
│   ├── deep-links.spec.ts
│   └── direct-url.spec.ts
├── search/
│   ├── basic-search.spec.ts
│   ├── scoring.spec.ts
│   └── keyboard-nav.spec.ts
├── accessibility/
│   ├── heading-structure.spec.ts
│   └── keyboard-focus.spec.ts
├── theme/
│   └── mode-switching.spec.ts
└── scroll-spy/
    ├── heading-tracking.spec.ts
    └── active-indicator.spec.ts

The directory name IS the feature name. Finding tests for a feature means opening the right folder. The convention is intuitive and requires zero infrastructure.
What It Catches
- Discoverability. New team members can find tests for a feature by browsing the directory tree.
- Organizational clarity. Test files have a natural home. Code reviews can check "is this test in the right folder?"
- Test runner grouping. Most runners can filter by directory: `npx playwright test test/navigation/`.
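Directory filtering can also be pinned in the runner config so each feature folder runs as a named project. A minimal Playwright sketch, assuming one project per feature directory; the project names simply mirror the tree above:

```typescript
// playwright.config.ts — a sketch mapping each feature directory to a
// named project, so directory-based filtering survives into CI commands.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'navigation', testDir: 'test/navigation' },
    { name: 'search', testDir: 'test/search' },
    { name: 'accessibility', testDir: 'test/accessibility' },
    { name: 'theme', testDir: 'test/theme' },
    { name: 'scroll-spy', testDir: 'test/scroll-spy' },
  ],
});

// Then: npx playwright test --project=search
```

Note this inherits every weakness of the convention it encodes: the config lists folders, not features, and nothing checks that the list is complete.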
Where It Breaks
No formal link. The directory name `navigation/` is a convention, not a type reference. Nothing validates that a folder called `navigation` corresponds to a defined feature with specific ACs.
Cross-feature tests don't fit. A test that verifies keyboard navigation within search results touches both `search/` and `accessibility/`. Where does it live? You either duplicate it, pick one folder arbitrarily, or create a `cross-feature/` catch-all that defeats the purpose.
No canonical AC list. The directory tells you which tests exist for a feature. It cannot tell you which ACs are missing. Is `navigation/` complete with 4 test files? The feature has 8 ACs. Are the other 4 covered inside those files, or missing entirely? The directory structure can't answer this.
Wrong-folder tests compile and run. A test for theme switching placed in `navigation/` by mistake produces no error. It passes. The feature mapping is silently wrong.
No build integration. The directory structure is organizational, not enforceable. There's no way to fail a build because a directory is missing tests.
Flat vs. nested debates. Should it be `test/navigation/`, `test/e2e/navigation/`, or `test/features/navigation/e2e/`? Every team bikesheds this. The directory structure carries no semantic meaning beyond what humans agree on.
When Directories Are Enough
For small projects with a single developer who holds the mental model, directory conventions work fine. The developer knows what's tested and what isn't. The folder structure is a mnemonic, not a system.
They stop working when the team grows, turnover happens, or someone asks "is everything tested?" and the answer requires manual inspection of every folder.
Wiki and README Traceability Matrices
What It Is
A human-maintained table mapping features to tests. Stored in a wiki (Confluence, Notion), a README, or a blog post.
## Feature Traceability Matrix
| Feature | AC | Test File | Test Name | Status |
|---------|-----|-----------|-----------|--------|
| Navigation | TOC click loads page | test/e2e/nav.spec.ts | clicking TOC item loads page | Done |
| Navigation | Back button restores | test/e2e/nav.spec.ts | back button returns to previous page | Done |
| Navigation | Active item highlights | test/e2e/nav.spec.ts | active nav item has highlight class | Done |
| Navigation | Deep link loads | test/e2e/nav.spec.ts | direct URL with hash loads page | Done |
| Navigation | Direct URL loads | test/e2e/nav.spec.ts | direct URL loads correct page | Done |
| Navigation | Bookmarkable URL | — | — | **Missing** |
| Navigation | F5 reload preserves | — | — | **Missing** |
| Navigation | Anchor scrolls | test/unit/scroll.spec.ts | smooth scroll to anchor | Done |
| Search | Basic query | test/e2e/search.spec.ts | typing query shows results | Done |
| Search | Scoring | test/unit/scoring.spec.ts | BM25 scoring ranks correctly | Done |
| ... | ... | ... | ... | ... |

Why It Exists
This approach persists because it's the most natural thing to do. You have requirements, you have tests, you want to see the mapping — so you make a table. Every team has done this at some point. Many still do.
On this website, a hand-maintained table in a blog post was the exact approach that typed specifications replaced:
"Before typed specifications, the connection between 'what should the website do?' and 'which tests verify it?' was a table in a blog post. Human-maintained. Drift-prone. Unenforceable."
What It Catches
In theory: everything. A well-maintained traceability matrix provides complete visibility into what's tested and what's not. The "Missing" rows in the example above are genuinely useful information.
In practice: it catches whatever the human remembered to update.
Where It Breaks
Immediate drift. The matrix is accurate on the day it's written. A developer adds three tests the next day without updating the table. A week later, a test is renamed. Two weeks later, a test file is moved. The table is now wrong in three ways and nobody knows.
No validation. The table says `test/e2e/nav.spec.ts` contains "clicking TOC item loads page." Did someone delete that test? Rename it? The table doesn't know. It's a snapshot frozen at the moment of last edit.
Scale burden. This website has 112 acceptance criteria. That's 112 rows. Maintaining 112 rows by hand — checking each one when tests change, updating file paths, marking new gaps — is a full-time job that nobody will do.
False confidence. The matrix says "Done" for 108 of 112 ACs. Management sees 96% coverage. But three of those "Done" entries reference tests that were deleted two months ago. The actual coverage is 93.75%. The matrix gave the wrong number, and there's no way to detect this without manually checking every row.
No build integration. The matrix is documentation, not automation. It cannot fail a build. It cannot be queried programmatically. It cannot be diff'd automatically against the codebase.
Ownership decay. Someone creates the matrix. They leave the team. The matrix becomes "that document nobody updates." New tests are written without matrix entries. Old entries rot. Six months later, someone discovers the matrix and asks "is this still accurate?" The answer is always no.
How Typed Specs Differ
| Dimension | Directory Conventions | Wiki Matrices | Typed Specifications |
|---|---|---|---|
| Effort to create | None | Medium (write the table) | Medium (define abstract classes) |
| Effort to maintain | None | High (update every change) | None (scanner reads source) |
| Can drift | Yes (wrong folder) | Yes (stale entries) | No (generated from code) |
| Cross-feature support | No (one folder per test) | Yes (table rows can cross-reference) | Yes (stacked decorators) |
| Completeness check | No | Manual (visual scan) | Automated (scanner) |
| Build integration | No | No | Yes (exit code) |
| Canonical AC list | No | Human-defined | Abstract methods |
| Validation | None | None | Compile-time + scanner |
The compliance scanner IS the traceability matrix. It reads feature definitions and test references from source code and generates the matrix automatically. It cannot drift because it has no independent state — it recomputes from the codebase every time it runs.
── Feature Compliance Report ──
✓ NAV SPA Navigation + Deep Links 8/8 ACs (100%)
✓ SEARCH Search 5/5 ACs (100%)
✓ A11Y Accessibility 5/5 ACs (100%)
...
Coverage: 112/112 ACs (100%)

This output is what a traceability matrix looks like when a machine generates it instead of a human.
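The computation behind a report like this is small. A hedged sketch: a real scanner would extract `declared` from the abstract feature classes and `implemented` from the decorated tests; here both are inlined for clarity, and the names are illustrative:

```typescript
// Sketch of the core coverage computation a compliance scanner performs.
// Inputs are inlined here; a real scanner recomputes them from source
// on every run, which is why it cannot drift.

function coverage(declared: string[], implemented: string[]) {
  const missing = declared.filter((ac) => !implemented.includes(ac));
  return {
    covered: declared.length - missing.length,
    total: declared.length,
    missing,
  };
}

const report = coverage(
  ['tocClickLoadsPage', 'backButtonRestores', 'deepLinkLoads'],
  ['tocClickLoadsPage', 'deepLinkLoads'],
);
console.log(`NAV ${report.covered}/${report.total} ACs, missing: ${report.missing.join(', ')}`);
// Prints: NAV 2/3 ACs, missing: backButtonRestores
// Build integration: CI exits non-zero whenever report.missing is non-empty.
```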
The Upgrade Path
Directory conventions and wiki matrices aren't wasted work — they're the starting point for typed specs:
- Your directory names become feature IDs. `test/navigation/` → `NavigationFeature` with `id = 'NAV'`.
- Your wiki table's AC column becomes abstract methods. "TOC click loads page" → `abstract tocClickLoadsPage(): ACResult`.
- Your existing tests get decorated. Add `@FeatureTest(NavigationFeature)` and `@Implements` to what's already there.
- The wiki table is replaced by the scanner output. Delete the table. The scanner generates it from code.
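Concretely, the first two steps might look like this. A hedged sketch: `NavigationFeature`, `id = 'NAV'`, and `tocClickLoadsPage(): ACResult` are named in the upgrade path above, but the exact class shape and the `ACResult` type are assumptions:

```typescript
// Sketch of steps 1 and 2: a directory name becomes a feature class,
// and each wiki-table AC row becomes an abstract method.
// The ACResult shape is an assumption for illustration.

type ACResult = { covered: true };

abstract class NavigationFeature {
  static readonly id = 'NAV';

  // One abstract method per acceptance criterion: the canonical AC list.
  // A class extending this won't compile until every AC is accounted for,
  // which is the validation a directory name can never provide.
  abstract tocClickLoadsPage(): ACResult;
  abstract backButtonRestores(): ACResult;
  abstract deepLinkLoads(): ACResult;
  abstract directUrlLoads(): ACResult;
}

console.log(NavigationFeature.id); // NAV
```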
On this project, the migration from a hand-maintained table to typed specs took about an hour. The tests already existed. The features were already implicitly defined by the directory structure. The abstract classes and decorators formalized what was already there.
Previous: Part IV: Test Framework Tagging
Next: Part VI: API Specification — OpenAPI, contract testing, and why API specs are complementary, not competing.