
Building a File Watcher That Actually Understands Your Project

Most file watchers rebuild everything when anything changes. This one classifies every file into 12 categories, resolves the minimal pipeline, double-buffers changes during builds, and lets you trigger it with a single keypress. Zero dependencies beyond Node.js.


The Problem

The first version of this site's build system had a toggle-based interactive menu:

--- build options (toggle 1-7, Enter to build, q to cancel) ---
  1 [ ] Rebuild TOC
  2 [ ] Clean public/
  3 [x] Pages only (skip TS/CSS)
  4 [ ] Render Mermaid diagrams
  5 [ ] Generate sitemaps
  6 [ ] Copy fonts
  7 [ ] Copy images

Every rebuild was a manual decision tree. Edit a blog post? Toggle 3, Enter. Change a CSS variable? Toggle... wait, do I need 1 too? What about sitemaps? After the third time I rebuilt everything because I couldn't remember which toggles to set, I knew the build system needed to think for itself.

The goal: save a file, press Enter, and the system figures out the rest.


Why fs.watch() — No chokidar, No Dependencies

Most Node.js projects reach for chokidar the moment they need file watching. It's a fine library. But it's also 3 dependencies deep, and for this project — a static site generator with a well-known directory structure — native fs.watch() does the job.

The trick is handling the rough edges yourself:

export function setupWatchers(
  dirs: string[],
  files: string[],
  io: WatcherIO,
  onChange: (event: string, filename: string) => void,
): WatchHandle[] {
  const handles: WatchHandle[] = [];

  for (const dir of dirs) {
    try {
      handles.push(io.watchDir(dir, true, (event, filename) => {
        if (filename) onChange(event, path.join(dir, filename).replace(/\\/g, '/'));
      }));
    } catch {
      // Graceful degradation: recursive not supported, try non-recursive
      io.warn(`  ⚠ Recursive watch not supported for ${dir}, falling back`);
      try {
        handles.push(io.watchDir(dir, false, (event, filename) => {
          if (filename) onChange(event, path.join(dir, filename).replace(/\\/g, '/'));
        }));
      } catch {
        io.warn(`  ⚠ Could not watch ${dir}`);
      }
    }
  }

  for (const file of files) {
    try {
      handles.push(io.watchFile(file, (event, _filename) => {
        onChange(event, file);
      }));
    } catch {
      io.warn(`  ⚠ Could not watch ${file}`);
    }
  }

  return handles;
}

Two things to note:

  1. Recursive watch fallback. Windows and macOS support { recursive: true } natively. On Linux, recursive fs.watch() only arrived in newer Node releases, so older setups throw. The try/catch degrades gracefully instead of crashing.

  2. DI from day one. The function takes a WatcherIO interface, not raw fs. This isn't premature abstraction — it's the difference between testable and untestable. More on this later.

Filtering Editor Noise

Editors generate noise. Vim creates .swp files. Emacs creates ~ backups and .# lock files. VS Code writes .tmp files during save. Without filtering, a single Ctrl+S triggers multiple events:

const TEMP_PATTERNS = [
  /\.swp$/,      // Vim swap files
  /~$/,          // Emacs backup
  /\.tmp$/,
  /\.bak$/,
  /^\.#/,        // Emacs lock files
  /#$/,          // Emacs auto-save
];

function isTempFile(filename: string): boolean {
  const base = filename.includes('/')
    ? filename.slice(filename.lastIndexOf('/') + 1)
    : filename;
  return TEMP_PATTERNS.some(p => p.test(base));
}

Temp files are filtered out before classification. The watcher never sees them.
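A quick spot-check of the filter (re-declaring the patterns above so the snippet is self-contained):

```typescript
const TEMP_PATTERNS = [/\.swp$/, /~$/, /\.tmp$/, /\.bak$/, /^\.#/, /#$/];

function isTempFile(filename: string): boolean {
  const base = filename.includes('/')
    ? filename.slice(filename.lastIndexOf('/') + 1)
    : filename;
  return TEMP_PATTERNS.some(p => p.test(base));
}

// Vim swap files and Emacs backups are rejected; real content passes through.
console.log(isTempFile('content/blog/.post.md.swp')); // true
console.log(isTempFile('index.html~'));               // true
console.log(isTempFile('content/blog/post.md'));      // false
```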


The State Machine

The watcher has three states:

[Diagram: idle → accumulating → building]
The three-state watcher uses an accumulating state to batch bursts of file changes, so a single rename across five files never triggers five rebuilds.

Why three states instead of the obvious two (watching and building)?

The accumulating state is the key insight. When you rename a CSS class, you touch 5 files in quick succession. A naive two-state machine would trigger 5 builds. The accumulating state collects changes until you explicitly say "build now" (manual mode) or until the changes stop arriving (auto mode with debounce).

This is the same state machine pattern I use in .NET — the FiniteStateMachine framework with its three tiers. Here it's implemented as a closure-based factory:

export function createWatcherMachine(
  callbacks: WatcherCallbacks,
  options: WatcherOptions,
): WatcherMachine {
  let state: WatcherState = 'idle';
  let autoMode: boolean = options.autoMode;
  let changeSet: ChangeSet | null = null;
  let nextChangeSet: ChangeSet | null = null;
  let currentPlan: PipelinePlan | null = null;
  const history: BuildHistoryEntry[] = [];

  function transition(next: WatcherState): void {
    const prev = state;
    state = next;
    callbacks.onStateChange(state, prev);
  }

  // ... event handlers as closures over shared state
}

No classes. No inheritance. Just a function that returns an interface. The state is private, the transitions are guarded, and the side effects are injected via callbacks. Pure state machine logic, zero coupling to the file system.


File Classification — 12 Categories

Every file event hits classifyFile() first — a pure function that maps a relative path to a pipeline category:

export function classifyFile(relativePath: string): ChangeCategory {
  const p = relativePath.replace(/\\/g, '/');

  if (isTempFile(p)) return 'unknown';

  // Build artifacts — never watch
  if (p === 'toc.json') return 'unknown';
  if (p.startsWith('public/')) return 'unknown';
  if (p.startsWith('js/')) return 'unknown';

  // Test sources
  if (p.startsWith('test/') && p.endsWith('.ts')) return 'test-source';

  // Requirement/feature definitions
  if (p.startsWith('requirements/') && p.endsWith('.ts')) return 'requirement';

  // TypeScript sources
  if (p.startsWith('src/') && p.endsWith('.ts')) return 'ts-source';

  // CSS sources
  if (p.startsWith('css/') && p.endsWith('.css')) return 'css-source';

  // Content markdown
  if (p.startsWith('content/') && p.endsWith('.md')) return 'content-md';

  // Content images (non-.md in content/)
  if (p.startsWith('content/') && !p.endsWith('.md')) return 'content-image';

  // Font assets
  if (p.startsWith('fonts/') && p.endsWith('.woff2')) return 'font-asset';

  // Renderer lib
  if (p === 'scripts/lib/page-renderer.ts'
    || p === 'scripts/lib/page-renderer-worker.ts') return 'renderer-lib';

  // CV content
  if (p.startsWith('cv/content/') && p.endsWith('.md')) return 'cv-content';

  // Root files
  if (p === 'index.html') return 'template';
  if (p === 'site.yml') return 'site-config';

  return 'unknown';
}

The function is 40 lines of if-statements. No regex wizardry, no configuration files, no plugin system. Just pattern matching on paths. It's fast, it's obvious, and when I add a new category (requirement, for instance, came later), I add one line.

What Each Category Triggers

Category             TOC   TS→JS  CSS   Sitemaps  HTML      Mermaid      Images  Fonts  CV PDF
ts-source                  x                      all
css-source                        x               all
template                                          all
site-config                                       all
renderer-lib                                      all
content-md (modify)                     x         specific  if detected
content-md (create)  x                  x         specific  if detected
content-md (delete)  x                  x         all
content-image                                                            x
font-asset                                                                       x
cv-content                                                                              x
test-source
requirement

The key insight: a CSS change should never trigger the CV PDF generator, and a font change should never re-render mermaid diagrams. Classification makes this possible — the pipeline only does what's needed.


Pipeline Resolution — The Minimal Rebuild

Accumulated changes feed into resolvePipeline(), which produces a PipelinePlan — a flat object describing exactly what needs to happen:

export interface PipelinePlan {
  buildTs: number;
  tocRebuild: boolean;
  tsTranspile: boolean;
  jsBundle: boolean;
  cssBundle: boolean;
  regenAllHtml: boolean;
  regenSitemaps: boolean;
  regenSpecificPages: string[];
  mermaidPages: string[];
  copyImages: boolean;
  copyFonts: boolean;
  cleanDeletedHtml: string[];
  cleanOrphanedMermaid: string[];
  cvPdf: boolean;
}
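Resolution starts from a plan with every flag off and every list empty. A default factory might look like this (createEmptyPlan is hypothetical — the source doesn't show how the plan is initialized):

```typescript
interface PipelinePlan {
  buildTs: number;
  tocRebuild: boolean;
  tsTranspile: boolean;
  jsBundle: boolean;
  cssBundle: boolean;
  regenAllHtml: boolean;
  regenSitemaps: boolean;
  regenSpecificPages: string[];
  mermaidPages: string[];
  copyImages: boolean;
  copyFonts: boolean;
  cleanDeletedHtml: string[];
  cleanOrphanedMermaid: string[];
  cvPdf: boolean;
}

// Everything off, everything empty: resolution only ever turns things on.
function createEmptyPlan(buildTs: number): PipelinePlan {
  return {
    buildTs,
    tocRebuild: false, tsTranspile: false, jsBundle: false,
    cssBundle: false, regenAllHtml: false, regenSitemaps: false,
    regenSpecificPages: [], mermaidPages: [],
    copyImages: false, copyFonts: false,
    cleanDeletedHtml: [], cleanOrphanedMermaid: [],
    cvPdf: false,
  };
}
```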

The resolution logic walks the change set and flips flags:

for (const [filePath, change] of changeSet.files) {
  switch (change.category) {
    case 'ts-source':
      plan.tsTranspile = true;
      plan.jsBundle = true;
      plan.regenAllHtml = true;
      break;

    case 'css-source':
      plan.cssBundle = true;
      plan.regenAllHtml = true;
      break;

    case 'content-md':
      plan.regenSitemaps = true;
      if (change.kind === 'delete') {
        plan.tocRebuild = true;
        plan.cleanDeletedHtml.push(filePath.replace(/\.md$/, '.html'));
      } else {
        if (change.kind === 'create') plan.tocRebuild = true;
        plan.regenSpecificPages.push(filePath);

        // Mermaid detection — only if we can read the file
        if (readFile) {
          const content = readFile(filePath);
          if (content && /```mermaid/m.test(content)) {
            plan.mermaidPages.push(filePath);
          }
        }
      }
      break;

    case 'cv-content':
      plan.cvPdf = true;
      break;
    // ...
  }
}

Two cascade rules prevent redundant work:

// TOC rebuild means all pages need the new sidebar
if (plan.tocRebuild) plan.regenAllHtml = true;

// Full rebuild supersedes specific page list
if (plan.regenAllHtml) plan.regenSpecificPages = [];

Concrete Examples

Edit one blog post (content/blog/watcher.md modified):

Pipeline: 1 page → sitemaps

Edit a CSS file (css/theme.css modified):

Pipeline: CSS bundle → all HTML

Edit site.yml:

Pipeline: all HTML

Delete a blog post (content/blog/old-post.md deleted):

Pipeline: clean 1 deleted → TOC → all HTML → sitemaps

Create a new blog post with mermaid diagrams:

Pipeline: TOC → 1 page → mermaid SVGs → sitemaps

Edit TypeScript source + a blog post (both pending):

Pipeline: TS→JS → JS bundle → all HTML → sitemaps

The specific page is absorbed by regenAllHtml.

The pipeline is formatted as a human-readable string and displayed live in the TUI before you trigger the build. You always know what's about to happen.
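The formatter itself is straightforward — a sketch (hypothetical summarize; the real formatPipelineSummary may differ) that collects the active step labels in execution order and joins them with arrows:

```typescript
interface PlanLike {
  tocRebuild: boolean;
  cssBundle: boolean;
  regenAllHtml: boolean;
  regenSpecificPages: string[];
  regenSitemaps: boolean;
  copyImages: boolean;
}

// Collect labels for each active step, in execution order.
function summarize(plan: PlanLike): string {
  const parts: string[] = [];
  if (plan.tocRebuild) parts.push('TOC');
  if (plan.cssBundle) parts.push('CSS bundle');
  if (plan.regenAllHtml) parts.push('all HTML');
  else if (plan.regenSpecificPages.length > 0)
    parts.push(`${plan.regenSpecificPages.length} page(s)`);
  if (plan.copyImages) parts.push('copy images');
  if (plan.regenSitemaps) parts.push('sitemaps');
  return parts.join(' → ');
}

console.log(summarize({
  tocRebuild: false, cssBundle: true, regenAllHtml: true,
  regenSpecificPages: [], regenSitemaps: false, copyImages: true,
}));
// → CSS bundle → all HTML → copy images
```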


The Double-Buffer — Changes During Builds

Here's a problem most file watchers ignore: what happens when files change while a build is running?

If you're editing a blog post and save, the build starts. While it's running (maybe rendering mermaid SVGs, which takes seconds), you save another file. Without protection, that change is either lost or causes a race condition.

The watcher uses a double-buffer pattern: changes arriving during the building state go into a separate nextChangeSet:

function fileChanged(relativePath: string, eventType: 'rename' | 'change' = 'change'): void {
  const normalized = relativePath.replace(/\\/g, '/');
  const category = classifyFile(normalized);
  if (category === 'unknown') return;

  const exists = options.exists ? options.exists(normalized) : true;
  const kind = resolveFileEventKind(eventType, exists);
  const change: FileChange = { category, kind };

  if (state === 'building') {
    // Queue into next change set — don't touch current build
    if (!nextChangeSet) {
      nextChangeSet = { files: new Map(), firstChangeAt: Date.now(), lastChangeAt: Date.now() };
    }
    addToChangeSet(nextChangeSet, normalized, change);
    return;
  }

  // Normal accumulation
  if (!changeSet) {
    changeSet = { files: new Map(), firstChangeAt: Date.now(), lastChangeAt: Date.now() };
  }
  addToChangeSet(changeSet, normalized, change);

  if (state === 'idle') transition('accumulating');
}

When the build completes, the machine decides what to do with the buffer:

function buildComplete(ok: boolean, durationMs: number, error?: string): void {
  if (ok) {
    callbacks.onBuildComplete(ok, durationMs, currentPlan!);
    changeSet = null;

    if (nextChangeSet && nextChangeSet.files.size > 0) {
      // Promote next to current — another build is needed
      changeSet = nextChangeSet;
      nextChangeSet = null;
      transition('accumulating');
    } else {
      transition('idle');
    }
  } else {
    callbacks.onBuildFailed(error || 'Build failed', currentPlan!);

    // Merge next into current for retry — nothing is lost
    if (nextChangeSet) {
      for (const [path, change] of nextChangeSet.files) {
        addToChangeSet(changeSet!, path, change);
      }
      nextChangeSet = null;
    }
    transition('accumulating');
  }
}

Three outcomes:

  1. Build succeeded, no queued changes → back to idle. Clean slate.
  2. Build succeeded, queued changes exist → promote nextChangeSet to changeSet, transition to accumulating. Another build cycle starts automatically.
  3. Build failed → merge nextChangeSet back into changeSet. The developer sees all pending changes (original + new) and can retry with Enter.

No changes are ever lost. No races. No surprises.
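The promote-or-merge decision boils down to a small pure function — a simplified sketch of the buildComplete logic above (names are illustrative):

```typescript
type ChangeMap = Map<string, string>;

// On success, promote the queued buffer (if any) and accumulate again;
// on failure, merge it into the current set so nothing is lost.
function settle(ok: boolean, current: ChangeMap, next: ChangeMap | null):
    { state: 'idle' | 'accumulating'; changes: ChangeMap } {
  if (ok) {
    return next && next.size > 0
      ? { state: 'accumulating', changes: next }
      : { state: 'idle', changes: new Map() };
  }
  const merged = new Map(current);
  if (next) for (const [k, v] of next) merged.set(k, v);
  return { state: 'accumulating', changes: merged };
}
```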

Event Merging

When the same file is modified multiple times, events are merged by significance:

// Kind significance order: create > delete > modify
const KIND_SIGNIFICANCE: Record<FileEventKind, number> = {
  create: 3,
  delete: 2,
  modify: 1,
};

export function mergeFileEvent(existing: FileChange, incoming: FileChange): FileChange {
  if (KIND_SIGNIFICANCE[incoming.kind] > KIND_SIGNIFICANCE[existing.kind]) {
    return incoming;
  }
  return existing;
}

If a file is modified then deleted, the change set records it as deleted. If it's deleted then re-created (common with editors that do atomic saves), it records as created. The most significant event wins.
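Folding a burst of events for one file with this rule reproduces the atomic-save behavior:

```typescript
type FileEventKind = 'create' | 'delete' | 'modify';

const KIND_SIGNIFICANCE: Record<FileEventKind, number> = {
  create: 3, delete: 2, modify: 1,
};

// Reduce a burst of events down to the most significant kind.
function foldEvents(kinds: FileEventKind[]): FileEventKind {
  return kinds.reduce((a, b) =>
    KIND_SIGNIFICANCE[b] > KIND_SIGNIFICANCE[a] ? b : a);
}

// An editor doing an atomic save emits delete then create:
console.log(foldEvents(['delete', 'create'])); // create
// A modify followed by a delete records as deleted:
console.log(foldEvents(['modify', 'delete'])); // delete
```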


Manual vs Auto Mode

The watcher defaults to manual mode: changes accumulate, the pipeline preview updates live, but nothing builds until you press Enter.

Why not auto by default? Because developers rarely change just one file. A rename-refactor touches 5 files in 10 seconds. Auto mode with a 2-second debounce would trigger a build after the second file, then another after the fifth. Manual mode lets you accumulate everything, review the pipeline preview, and build once.

Auto mode is there for when you want it — writing prose, tweaking CSS values, iterating on a single file. Toggle it with a:

function toggleAutoMode(): void {
  autoMode = !autoMode;
  if (!autoMode) {
    callbacks.clearDebounceTimer();
  } else if (state === 'accumulating') {
    callbacks.clearDebounceTimer();
    callbacks.startDebounceTimer(options.debounceMs, () => autoBuild());
  }
}

The debounce timer resets on every file change. Once 2 seconds of silence pass, the build triggers automatically. The debounce is configurable:

npm run work -- watch --auto --debounce 5000
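The reset-on-change debounce can be sketched with plain setTimeout (illustrative; in the watcher the timer lives behind callbacks.startDebounceTimer, which may differ):

```typescript
// Each change cancels the pending timer and starts a new one, so the
// build fires only after debounceMs of silence.
function createDebouncer(debounceMs: number, onQuiet: () => void) {
  let timer: ReturnType<typeof setTimeout> | null = null;
  return {
    poke(): void {
      if (timer) clearTimeout(timer);
      timer = setTimeout(() => { timer = null; onQuiet(); }, debounceMs);
    },
    cancel(): void {
      if (timer) { clearTimeout(timer); timer = null; }
    },
  };
}
```

Three rapid pokes produce exactly one onQuiet call; cancel corresponds to switching back to manual mode.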

The TUI — A Build Cockpit

The watcher isn't a silent background process. It's an interactive terminal UI that shows you exactly what's happening.

Idle State

┌─ watch ──────────────────────────────────────────────────────┐
│ Watching: src/ css/ content/ fonts/ test/ requirements/      │
│           index.html site.yml                                │
│ Mode: manual                        Status: idle             │
│ Server: http://localhost:4000                                │
│                                                              │
│ [Enter] Build  [c] Clear  [a] Auto  [h] History  [q] Quit    │
│ [t] Unit tests  [e] E2E tests  [r] Compliance                │
└──────────────────────────────────────────────────────────────┘

Accumulating State

┌─ watch ──────────────────────────────────────────────────────┐
│ 3 change(s) pending                              1.2s ago    │
│                                                              │
│   [CSS]   css/theme.css                                      │
│   [MD+]   content/blog/watcher.md                            │
│   [IMG]   content/blog/watcher-screenshot.png                │
│                                                              │
│ Pipeline: CSS bundle → all HTML → copy images                │
│                                                              │
│ [Enter] Build  [c] Clear  [a] Auto  [q] Quit                 │
│ [t] Unit tests  [e] E2E tests  [r] Compliance                │
└──────────────────────────────────────────────────────────────┘

Each file gets a colored category label. Created files get a + suffix ([MD+]), deleted files a − suffix ([MD−]). The pipeline preview updates in real time as changes accumulate — you always know what's about to happen before pressing Enter.

Build Output

> Building: CSS bundle → all HTML → copy images

  [1/3] CSS bundle...
  [1/3] CSS bundle  done
  [2/3] Full build...
  [2/3] Full build  done
  [3/3] Copy images...
  [3/3] Copy images  done

> Build complete in 1.34s

Keyboard Controls

Key        Action
Enter / b  Trigger build
c          Clear accumulated changes
a          Toggle auto/manual mode
t          Run unit tests (vitest)
e          Run E2E tests (playwright)
r          Run compliance report
h          Show build history
q          Quit

The side commands (t, e, r) run without leaving the watcher. You don't need a second terminal for tests — press t, see the results, keep watching.

Build History

Press h to see the last 10 builds:

  Build History:
    14:23:15  1.34s  ok    CSS bundle → all HTML → copy images
    14:20:02  0.45s  ok    1 page → sitemaps
    14:18:30  3.21s  ok    TOC → all HTML → mermaid SVGs → sitemaps
    14:15:12  0.12s  fail  TS→JS → JS bundle → all HTML

Timestamps, durations, pass/fail, and the pipeline summary for each build. Enough to debug "what just happened" without scrolling through terminal output.


Build Execution — Phased Steps

The build executor translates a PipelinePlan into sequential steps:

async function executePlan(plan: PipelinePlan, io: ConsoleIO, config: WorkConfig) {
  const start = performance.now();
  const steps: Array<{ label: string; fn: () => Promise<void> | void }> = [];

  if (plan.tocRebuild) {
    steps.push({ label: 'TOC rebuild', fn: () => {
      const { buildToc } = require('../../build-toc');
      buildToc();
    }});
  }

  if (plan.cleanDeletedHtml.length > 0) {
    steps.push({ label: `Clean ${plan.cleanDeletedHtml.length} deleted`, fn: () => {
      for (const htmlPath of plan.cleanDeletedHtml) {
        const full = path.join(ROOT, 'public', htmlPath);
        try { fs.unlinkSync(full); } catch {}
      }
    }});
  }

  // ... copyFonts, copyImages, cvPdf ...

  const needsFullBuild = plan.tsTranspile || plan.cssBundle || plan.regenAllHtml;
  if (needsFullBuild || plan.regenSpecificPages.length > 0) {
    steps.push({ label: needsFullBuild ? 'Full build' : `${plan.regenSpecificPages.length} page(s)`,
      fn: async () => {
        const { main: buildMain, parseFlags } = require('../../build-static');
        const flagArgs: string[] = ['--no-clean', `--build-ts=${plan.buildTs}`];
        if (!plan.tsTranspile && !plan.cssBundle) flagArgs.push('--pages-only');
        if (plan.regenSpecificPages.length > 0)
          flagArgs.push(`--only=${plan.regenSpecificPages.join(',')}`);
        if (plan.mermaidPages.length > 0) flagArgs.push('--mermaid');
        const flags = parseFlags(flagArgs);
        await buildMain(flags);
      }
    });
  }

  // Execute sequentially with progress
  let stepIdx = 0;
  for (const step of steps) {
    stepIdx++;
    io.writeLine(`  [${stepIdx}/${steps.length}] ${step.label}...`);
    try {
      await step.fn();
      io.writeLine(`  [${stepIdx}/${steps.length}] ${step.label}  done`);
    } catch (err: any) {
      return { ok: false, durationMs: performance.now() - start, error: err?.message };
    }
  }

  return { ok: true, durationMs: performance.now() - start };
}

esbuild Persistent Watch

JavaScript bundling runs as a separate persistent watcher via esbuild. When the file watcher starts, it also starts esbuild in watch mode:

const esbuildWatch = await buildJsWatch();

Two watch loops cooperate: fs.watch() handles file classification and pipeline resolution, while esbuild handles the fast incremental JS bundling. When the watcher triggers a full build, esbuild has already done its job — the JS is ready.

--serve Flag

npm run work -- watch --serve

Starts a static file server on port 4000 alongside the watcher. Edit a file, press Enter, refresh the browser. The full inner loop without leaving the terminal.


Testing Pure Functions — DI for fs.watch

The hardest part of file watching isn't the watching — it's testing it. fs.watch() is callback-based, OS-dependent, and non-deterministic. How do you write reliable tests?

Step 1: Extract all I/O behind an interface.

export interface WatcherIO {
  watchDir(dir: string, recursive: boolean,
    cb: (event: string, filename: string | null) => void): { close: () => void };
  watchFile(file: string,
    cb: (event: string, filename: string | null) => void): { close: () => void };
  exists(path: string): boolean;
  readFile(path: string): string | null;
  log(...args: unknown[]): void;
  warn(...args: unknown[]): void;
}

Step 2: Provide a fake for tests.

export function createFakeWatcherIO(opts?: { throwOnRecursive?: boolean }): {
  io: WatcherIO;
  watchers: Array<{ path: string; recursive: boolean; type: 'dir' | 'file' }>;
  warnings: string[];
  simulateEvent(watchedPath: string, event: string, filename: string): void;
} {
  const watchers: Array<{ path: string; recursive: boolean; type: 'dir' | 'file' }> = [];
  const warnings: string[] = [];
  const callbacks = new Map<string, (event: string, filename: string | null) => void>();

  return {
    io: {
      watchDir(dir, recursive, cb) {
        if (opts?.throwOnRecursive && recursive)
          throw new Error('recursive not supported');
        watchers.push({ path: dir, recursive, type: 'dir' });
        callbacks.set(dir, cb);
        return { close: () => {} };
      },
      // ...
    },
    watchers,
    warnings,
    simulateEvent(watchedPath, event, filename) {
      callbacks.get(watchedPath)?.(event, filename);
    },
  };
}

With the fake, tests can simulateEvent('src', 'change', 'app.ts') and assert that the machine transitions to accumulating with the correct classification. No file system, no timers, no race conditions.

Step 3: Keep the logic pure.

The critical functions — classifyFile, resolveFileEventKind, mergeFileEvent, resolvePipeline, formatPipelineSummary — are all pure functions exported directly. They take data in, return data out. No I/O, no state, no callbacks. Test them with table-driven tests and never worry about flakiness.

The state machine itself uses the same pattern as the .NET FiniteStateMachine framework: side effects are injected as callbacks, not hardcoded. The machine doesn't know about fs.watch(), console.log, or setTimeout. It knows about states, transitions, and events. Everything else is someone else's problem.


From Toggle Menu to Single Keypress

The old workflow:

1. Save file
2. Alt-tab to terminal
3. Run npm run build
4. Remember which toggles to set
5. Toggle 3... no wait, also 1... and maybe 5
6. Press Enter
7. Wait
8. Repeat

The new workflow:

1. Save file
2. Press Enter

The watcher handles classification, pipeline resolution, incremental builds, mermaid detection, TOC updates, image copying, font syncing, and error recovery. It runs tests on demand. It keeps a build history. It double-buffers changes so nothing is lost. And it does all of this with native fs.watch(), a closure-based state machine, and a handful of pure functions.

npm run work -- watch --serve

That's the entire developer inner loop.
