
Part 03 — VirtFS, fixpoint, and the backward edge

The previous article toured the example. This article opens the engine. Three mechanics carry the entire multi-stage feedback property: a virtual filesystem that holds emissions in memory, a fixpoint loop that terminates when an iteration produces no new emissions, and the kind of dependency edge between stages that forces the loop to take more than one iteration to converge. All three are visible in roughly two hundred lines of code in packages/ts-codegen-pipeline/src/, and all three are exercised by the canonical example in exactly the configuration that gives the property its name: a backward edge. After this article the reader holds the engine's mental model.

A reader who has worked with Roslyn will recognise this material. The mapping is precise: virtFS is the TS analogue of Roslyn's Compilation as seen by generators — the single in-memory artefact that holds both the user's source and every stage's emissions, with reads and writes funnelled through one API. The fixpoint loop is the analogue of Roslyn's multi-pass driver. The backward-edge concept does not have a Roslyn-specific name, but it is the property that distinguishes a single-pass templating engine from a feedback codegen pipeline; if your generators have no backward edges, you do not need a fixpoint loop. The argument for keeping the loop anyway — even when most pipelines do not need it — is in the closing section.

VirtFS as an in-memory mirror

The runner constructs a ts-morph Project configured with useInMemoryFileSystem: true, seeds it with the user's source files (every TS file resolved by the consumer's tsconfig.json, minus anything ending in .generated.ts), and wraps it in a VirtFs handle. The handle is the only API a generator calls.

The relevant slice of virtfs/internal.ts:

const virtFs: VirtFs = {
  has(relPath: string): boolean {
    if (state.emissions.has(relPath)) return true;
    const abs = path.join(state.outDir, relPath);
    return state.project.getSourceFile(abs) !== undefined;
  },

  read(relPath: string): string {
    const emitted = state.emissions.get(relPath);
    if (emitted) return emitted.contents;
    const abs = path.join(state.outDir, relPath);
    const sf = state.project.getSourceFile(abs);
    if (!sf) throw new Error(`VirtFs: relPath not found: ${relPath}`);
    return sf.getFullText();
  },

  addSource(file: GeneratedFile): AddSourceResult { /* ... */ },
};

Three observations.

The first is that has and read look in two places: the emissions Map (in-memory writes by this run's generators) and the ts-morph Project's getSourceFile. From the perspective of a generator they are the same place — what the project contains, including what has been emitted in this run. The split is an implementation detail; the API exposes a single virtual filesystem in which user sources and generated files coexist.

The second is that there is no delete and no truncate. The only mutation is addSource. Strict additivity (Part 04) is not a convention enforced by code review; it is a property of the API surface. A generator cannot remove a file it doesn't like; it cannot empty a file someone else wrote; it can only emit. The contract is enforced by the absence of operations.
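The first two observations can be sketched without ts-morph at all. The block below is a minimal stand-in, not the library's code: the user's sources are modelled as a read-only Map, the type shapes are simplified, and createVirtFs and demoFs are hypothetical names introduced for illustration.

```typescript
// Simplified, hypothetical shapes; the real GeneratedFile and VirtFs carry more.
interface GeneratedFile {
  relPath: string;
  contents: string;
  producerId: string;
}

interface VirtFs {
  has(relPath: string): boolean;
  read(relPath: string): string;
  addSource(file: GeneratedFile): { newOrChanged: boolean };
  // Deliberately no delete(), truncate() or rename().
}

// Assumption: user sources are a plain Map standing in for the ts-morph Project.
function createVirtFs(userSources: ReadonlyMap<string, string>): VirtFs {
  // In-memory writes made by generators during this run.
  const emissions = new Map<string, GeneratedFile>();
  return {
    has(relPath) {
      // Two lookups, one answer: the caller cannot tell which store matched.
      return emissions.has(relPath) || userSources.has(relPath);
    },
    read(relPath) {
      const emitted = emissions.get(relPath);
      if (emitted !== undefined) return emitted.contents;
      const source = userSources.get(relPath);
      if (source === undefined) {
        throw new Error(`VirtFs: relPath not found: ${relPath}`);
      }
      return source;
    },
    addSource(file) {
      const previous = emissions.get(file.relPath);
      const newOrChanged =
        previous === undefined || previous.contents !== file.contents;
      emissions.set(file.relPath, file);
      return { newOrChanged };
    },
  };
}

const demoFs = createVirtFs(new Map([["User.ts", "export class User {}"]]));
demoFs.addSource({
  relPath: "User.repository.generated.ts",
  contents: "export class UserRepository {}",
  producerId: "20-entity.repository",
});
```

A generator holding demoFs sees User.ts and User.repository.generated.ts through the same two calls, and the type gives it nothing to call that would remove either one.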

The third is that addSource writes both to the emissions Map and to the ts-morph Project (virtfs/internal.ts:103-110):

state.emissions.set(file.relPath, file);
const absPath = path.join(state.outDir, file.relPath);
const existingSf = state.project.getSourceFile(absPath);
if (existingSf) {
  existingSf.replaceWithText(file.contents);
} else {
  state.project.createSourceFile(absPath, file.contents, { overwrite: true });
}

That dual write is what makes cross-stage feedback work. After stage 20 emits User.repository.generated.ts, that file is part of the ts-morph Project — the next generator's ctx.project.getSourceFile(...) resolves it, declarations within it are typed against the rest of the project, and downstream scans see it. The emissions Map is the journal of what was emitted; the Project is the live AST. Both must be consistent for a stage 50 like the mapper to read what stage 30 emitted in the same iteration.
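The role of the dual write is easier to see in a toy model. The sketch below is an illustration under stated assumptions, not the library's code: ToyProject is a hypothetical stand-in for the ts-morph Project (it is not the ts-morph API), and the addSource here omits every check the real one performs.

```typescript
interface GeneratedFile {
  relPath: string;
  contents: string;
  producerId: string;
}

// Hypothetical stand-in for the live project. NOT the ts-morph API:
// it only models "a view that later stages read from".
class ToyProject {
  private readonly files = new Map<string, string>();
  upsert(path: string, text: string): void {
    this.files.set(path, text);
  }
  getText(path: string): string | undefined {
    return this.files.get(path);
  }
}

const emissions = new Map<string, GeneratedFile>(); // the record of who emitted what
const project = new ToyProject();                   // the live view

function addSource(file: GeneratedFile): void {
  emissions.set(file.relPath, file);             // write 1: the record
  project.upsert(file.relPath, file.contents);   // write 2: the live view
}

// An earlier stage emits...
addSource({
  relPath: "User.repository.generated.ts",
  contents: "export class UserRepository {}",
  producerId: "20-entity.repository",
});

// ...and a later stage in the same pass reads it through the project,
// while the record still knows which producer wrote it.
const seenByLaterStage = project.getText("User.repository.generated.ts");
const producedBy = emissions.get("User.repository.generated.ts")?.producerId;
```

If addSource skipped the second write, seenByLaterStage would be undefined until some later synchronisation step, and same-iteration visibility between sibling stages would be lost.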

The fixpoint loop in fifty lines

The runner is in runner/run-fixpoint.ts. The loop body is fifty lines, with one circuit-breaker (MAX_ITERATIONS, default 16) and one termination condition (anyEmitted === false).

for (; iteration < max; iteration++) {
  handle.setCurrentIteration(iteration);
  const delta = handle.buildDelta(iteration);
  let anyEmitted = false;
  for (const gen of generators) {
    const ctx: GenerationContext = { /* project, virtFs, delta, … */ };
    const res = await safeExecute(gen, ctx);
    allDiagnostics.push(...res.diagnostics);
    if (res.emittedSomething) anyEmitted = true;
  }

  if (allDiagnostics.some(d => d.severity === 'error')) {
    return { ok: false, iterations: iteration + 1, diagnostics: allDiagnostics };
  }

  if (!anyEmitted) {
    /* fixpoint reached → commit, return ok: true */
  }
}

Three observations.

The first is the ordering invariant on line 60 of the runner: [...cfg.generators].sort((a, b) => a.id.localeCompare(b.id)). Generators are invoked in lexicographic order of their id. This is a deliberate choice — see Part 05 for why determinism in the order of invocation matters. It is also the choice that makes backward edges interesting: a generator with a lex-earlier id runs before a generator with a lex-later id, regardless of any data dependency between them. The backward edge in the example is exactly the case where data flows opposite to the lex order.

The second is the termination condition on line 87: if (!anyEmitted) { /* fixpoint */ }. The loop terminates when an entire iteration runs every generator and no generator's addSource returned newOrChanged: true. This is precisely what fixpoint means: a state in which any further iteration would change nothing. The iteration count is not part of the termination condition; MAX_ITERATIONS is only a circuit breaker that aborts the run with SG0030 if the loop fails to converge. The number of iterations to run is never set by hand, and no configuration knob exposes it.

The third is the failure model on line 82: if (allDiagnostics.some(d => d.severity === 'error')) { return { ok: false, ... } }. Any error from any generator at any iteration aborts the loop without committing to disk. The SG0020 divergence error from virtFS, the SG0040 path-escape error, and an exception thrown by a generator (caught and turned into SG0050 by safeExecute on line 122) all behave the same way: the run fails, the disk is left in its prior state, and the operator gets a diagnostic list pointing at the producer that broke. Part 06 walks the commit phase that this failure model gates.
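The three observations fit in a self-contained sketch. It is a simplified model, not the runner itself: the shared state is a plain Map instead of a VirtFs, execution is synchronous, safeExecute is omitted, and StageGenerator, emit and the SG0030 message text are assumptions made for illustration.

```typescript
interface Diagnostic { code: string; severity: "error" | "warning"; message: string }
interface GenResult { emittedSomething: boolean; diagnostics: Diagnostic[] }
interface StageGenerator { id: string; execute(state: Map<string, string>): GenResult }
interface Outcome { ok: boolean; iterations: number; diagnostics: Diagnostic[] }

function runFixpoint(generators: StageGenerator[], max = 16): Outcome {
  // Ordering invariant: lexicographic by id, regardless of data dependencies.
  const ordered = [...generators].sort((a, b) => a.id.localeCompare(b.id));
  const state = new Map<string, string>();
  const diagnostics: Diagnostic[] = [];
  for (let iteration = 0; iteration < max; iteration++) {
    let anyEmitted = false;
    for (const gen of ordered) {
      const res = gen.execute(state);
      diagnostics.push(...res.diagnostics);
      if (res.emittedSomething) anyEmitted = true;
    }
    // Failure model: any error aborts before anything would be committed.
    if (diagnostics.some(d => d.severity === "error")) {
      return { ok: false, iterations: iteration + 1, diagnostics };
    }
    // Termination: a full pass in which nothing was new or changed.
    if (!anyEmitted) {
      return { ok: true, iterations: iteration + 1, diagnostics };
    }
  }
  // Circuit breaker: the loop never reached a fixpoint.
  diagnostics.push({ code: "SG0030", severity: "error", message: `no fixpoint after ${max} iterations` });
  return { ok: false, iterations: max, diagnostics };
}

// Hypothetical helper: returns true only when the contents are new or changed.
function emit(state: Map<string, string>, path: string, contents: string): boolean {
  if (state.get(path) === contents) return false;
  state.set(path, contents);
  return true;
}

const stable: StageGenerator = {
  id: "10-stable",
  execute: s => ({ emittedSomething: emit(s, "a.generated.ts", "export const a = 1;"), diagnostics: [] }),
};

let tick = 0;
const divergent: StageGenerator = {
  id: "10-divergent", // emits different contents on every pass, so it never settles
  execute: s => ({ emittedSomething: emit(s, "b.generated.ts", `export const b = ${tick++};`), diagnostics: [] }),
};

const converged = runFixpoint([stable]);     // ok: true after 2 iterations
const runaway = runFixpoint([divergent], 4); // ok: false, SG0030 after 4 iterations
```

The stable generator needs two iterations, not one: the first emits, and the second is the pass that proves nothing changes any more.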

What ctx.delta carries — and why it exists

Every generator receives a GenerationContext that carries, among other things, a delta: VirtFsDelta. The delta is built once per iteration (virtfs/internal.ts:141-156):

buildDelta(currentIteration: number): VirtFsDelta {
  const journalCopy = state.journal.slice();
  const lastIter = currentIteration - 1;
  return {
    emittedSinceStart: Object.freeze(journalCopy),
    emittedLastIteration: Object.freeze(
      journalCopy.filter(r => r.emittedAtIteration === lastIter),
    ),
    filterEmittedSince(iteration: number) {
      return Object.freeze(journalCopy.filter(r => r.emittedAtIteration >= iteration));
    },
  };
}

A generator can ignore the delta entirely. Most generators in the example do — they do their work by scanning the project (via ctx.project) and calling ctx.virtFs.has/read for cross-stage dependencies. The delta is the time-aware view, available for the cases where a generator wants to know what changed last iteration: useful for diagnostics, for caching, for early termination of expensive work that has nothing new to consume. None of the ten generators in the example uses the delta non-trivially in the current revision; the API is in place for a future generator that needs it.

The journal itself is append-only, and re-emission is idempotent (virtfs/internal.ts:81-86): emitting the same relPath with byte-identical contents from the same producer is a silent no-op and does not append to the journal. That property keeps the delta clean across iterations: a stable producer that always emits the same content shows up once in the journal, not once per iteration.
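The interaction between the idempotent journal and the delta can be modelled in a few lines. This is a sketch under assumptions, not the library's code: JournalRecord is a simplified shape, record is a hypothetical helper, and cross-producer conflicts (the SG0020 case) are left out.

```typescript
interface JournalRecord {
  relPath: string;
  contents: string;
  producerId: string;
  emittedAtIteration: number;
}

const journal: JournalRecord[] = [];
const latest = new Map<string, JournalRecord>();

// Hypothetical helper: appends to the journal only when something is new or changed.
function record(relPath: string, contents: string, producerId: string, iteration: number): boolean {
  const prev = latest.get(relPath);
  if (prev !== undefined && prev.producerId === producerId && prev.contents === contents) {
    return false; // byte-identical re-emission by the same producer: silent no-op
  }
  const rec: JournalRecord = { relPath, contents, producerId, emittedAtIteration: iteration };
  journal.push(rec);
  latest.set(relPath, rec);
  return true;
}

function buildDelta(currentIteration: number) {
  const copy = journal.slice();
  const lastIter = currentIteration - 1;
  return {
    emittedSinceStart: Object.freeze(copy),
    emittedLastIteration: Object.freeze(copy.filter(r => r.emittedAtIteration === lastIter)),
  };
}

// Iteration 0: both producers emit for the first time.
record("User.repository.generated.ts", "repo v1", "20-entity.repository", 0);
record("index.generated.ts", "barrel v1", "99-index.barrel", 0);

// Iteration 1: the repository re-emits identical bytes, the barrel changes.
const repoAgain = record("User.repository.generated.ts", "repo v1", "20-entity.repository", 1);
const barrelAgain = record("index.generated.ts", "barrel v2", "99-index.barrel", 1);

// What a generator would see at the start of iteration 2.
const delta = buildDelta(2);
const changedLastIteration = delta.emittedLastIteration.map(r => r.relPath);
```

Here changedLastIteration contains only index.generated.ts, and the journal holds three records rather than four. A generator that wants to skip expensive work when the delta is empty has to exclude iteration 0 from that shortcut, because emittedLastIteration is necessarily empty on the first pass.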

The backward edge in the example

The example's backward edge lives at stage 00-entity.registry. The generator scans the project for entities (a normal forward action) but only emits if the partner repository file is present in virtFS:

async execute(ctx: GenerationContext): Promise<GenerationResult> {
  const allEntities = scanEntities(ctx.project);
  const ready = allEntities.filter(e =>
    ctx.virtFs.has(`${e.className}.repository.generated.ts`),
  );
  if (ready.length === 0) {
    return { emittedSomething: false, diagnostics: [] };
  }
  const body = renderRegistry(ready);
  const r = ctx.virtFs.addSource({
    relPath: 'entity-registry.generated.ts',
    contents: withBanner(body, { producerId: this.id }),
    producerId: this.id,
  });
  return { emittedSomething: r.newOrChanged, diagnostics: r.diagnostics ?? [] };
}

Three things make this a backward edge rather than a normal forward read.

The first is the lex order: 00-entity.registry < 20-entity.repository. The runner sorts generators by id and invokes them in order. At iteration 0, when 00-entity.registry runs, 20-entity.repository has not yet run this iteration. The repository files are not in virtFS; the ready filter returns empty; the registry emits nothing.

The second is the lifting of the dependency to the next iteration. Iteration 0 executes all ten generators in order; the registry, first in lex order, is a no-op, and the other nine emit. The runner then checks anyEmitted, which is true, so the loop runs iteration 1. When 00-entity.registry runs again, the repositories emitted in iteration 0 are in virtFS. The ready filter finds three entries. The registry emits.

The third is the stable second-order effect on the barrel. Stage 99 (99-index.barrel) emitted in iteration 0 without the registry — the registry didn't exist yet. At iteration 1, the registry now exists, and the barrel re-emits to absorb it. (The barrel is producer-id-stable, content-different — see virtfs/internal.ts:97-99: "Same producer updating its own emission with new contents — allowed.") anyEmitted is true again at the end of iteration 1.

Iteration 2 runs all ten generators a third time. The registry emits again with the same producer and the same content as in iteration 1, which is an idempotent no-op. The barrel does the same. Every other stage produces byte-identical output to its iteration-1 emission. anyEmitted is false. Fixpoint is reached after iterations 0, 1 and 2, so the iteration count is 3.

The iteration count of three is asserted by the test method fixpointReachedInThreeIterations() in requirements/features/industrial-weaving.feature.ts. The corresponding fit criterion in req-industrial-pipeline.ts reads: "outcome.iterations === 3 and outcome.ok === true after a run on the canonical project." Three is not arbitrary; it is the smallest iteration count that exercises both same-iteration ts-morph visibility (sibling stages see each other within iter N) and cross-iteration backward edges (a lex-earlier stage consumes a lex-later stage's output via virtFS, forcing iter N+1).
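The three-iteration walk can be reproduced with a toy model of just the three stages involved. Everything here is a simplification made for illustration: the files live in a plain Map, only three of the ten generators are modelled, the entity names other than User are invented, and index.generated.ts is an assumed name for the barrel's output.

```typescript
type Files = Map<string, string>;
interface Gen { id: string; run(files: Files): boolean }

// Hypothetical entity names; the article only names User.
const entities = ["Order", "Product", "User"];

// Returns true only when the contents are new or changed.
function emit(files: Files, path: string, contents: string): boolean {
  if (files.get(path) === contents) return false;
  files.set(path, contents);
  return true;
}

const registry: Gen = {
  id: "00-entity.registry",
  run(files) {
    // Backward edge: depends on output of a stage that sorts later.
    const ready = entities.filter(e => files.has(`${e}.repository.generated.ts`));
    if (ready.length === 0) return false;
    return emit(files, "entity-registry.generated.ts", ready.join(","));
  },
};

const repository: Gen = {
  id: "20-entity.repository",
  run(files) {
    let changed = false;
    for (const e of entities) {
      if (emit(files, `${e}.repository.generated.ts`, `export class ${e}Repository {}`)) changed = true;
    }
    return changed;
  },
};

const barrel: Gen = {
  id: "99-index.barrel",
  run(files) {
    const others = [...files.keys()].filter(p => p !== "index.generated.ts").sort();
    const body = others.map(p => `export * from "./${p.replace(/\.ts$/, "")}";`).join("\n");
    return emit(files, "index.generated.ts", body);
  },
};

function converge(gens: Gen[], max = 16): { iterations: number; trace: string[][] } {
  const ordered = [...gens].sort((a, b) => a.id.localeCompare(b.id));
  const files: Files = new Map();
  const trace: string[][] = []; // ids of the generators that emitted, per iteration
  for (let i = 0; i < max; i++) {
    const emitters: string[] = [];
    for (const g of ordered) {
      if (g.run(files)) emitters.push(g.id);
    }
    trace.push(emitters);
    if (emitters.length === 0) return { iterations: i + 1, trace };
  }
  throw new Error(`no fixpoint within ${max} iterations`);
}

const result = converge([barrel, registry, repository]);
// result.iterations === 3
// result.trace[0]: 20-entity.repository, 99-index.barrel
// result.trace[1]: 00-entity.registry, 99-index.barrel
// result.trace[2]: (empty)
```

The trace mirrors the prose: the registry is silent in iteration 0, it and the barrel are the only emitters in iteration 1, and iteration 2 is the empty pass that establishes the fixpoint.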

Why have a fixpoint loop at all?

A fair objection at this point: most pipelines do not have backward edges. If every generator's id happens to be assigned in dependency order, the pipeline converges in one iteration and the fixpoint loop is overhead. Why pay for it?

Three reasons, in increasing weight.

The first is that dependency order is not always known up-front. A pipeline that grows organically from three stages to ten will pick up backward edges by accident — a new stage gets inserted, lex-sorts in an inconvenient place, and the implicit expectation that "everyone's input exists by the time my id sorts" stops holding. A fixpoint loop tolerates this; a one-pass pipeline with hand-coded dependency order does not. The loop is robust to refactoring.

The second is that backward edges are sometimes the right design. The registry generator in the example could have been assigned id: '90-entity.registry' to run after the repositories. It was not, deliberately: the registry is conceptually upstream (it advertises which entities exist; consumers query it), and putting it lex-late conflicts with the mental model of stage zero as a fact establishment phase. A backward edge says "I am the upstream, even though my data dependency is downstream"; the fixpoint loop is what makes that legal.
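The cost of that design choice can be measured in a toy model. The sketch below is not the example project: it models three stages over a plain Map, uses a single entity, and iterationsFor is a hypothetical helper. It only compares how many passes the same three stages need when the registry's id sorts before or after the repository's.

```typescript
// Counts the passes needed to reach a fixpoint for a given registry id.
function iterationsFor(registryId: string): number {
  const files = new Map<string, string>();

  // Returns true only when the contents are new or changed.
  const emit = (path: string, contents: string): boolean => {
    if (files.get(path) === contents) return false;
    files.set(path, contents);
    return true;
  };

  const gens = [
    {
      id: registryId,
      // Emits only once the repository file is visible.
      run: (): boolean =>
        files.has("User.repository.generated.ts")
          ? emit("entity-registry.generated.ts", "User")
          : false,
    },
    {
      id: "20-entity.repository",
      run: (): boolean => emit("User.repository.generated.ts", "export class UserRepository {}"),
    },
    {
      id: "99-index.barrel",
      // Lists every other file currently present.
      run: (): boolean =>
        emit("index.generated.ts", [...files.keys()].filter(p => p !== "index.generated.ts").sort().join("\n")),
    },
  ].sort((a, b) => a.id.localeCompare(b.id));

  for (let i = 0; i < 16; i++) {
    let anyEmitted = false;
    for (const g of gens) {
      if (g.run()) anyEmitted = true;
    }
    if (!anyEmitted) return i + 1;
  }
  throw new Error("no fixpoint within 16 iterations");
}

const backwardCount = iterationsFor("00-entity.registry"); // 3 in this toy model
const forwardCount = iterationsFor("90-entity.registry");  // 2 in this toy model
```

In this model the backward edge costs exactly one extra pass. The loop is what lets an id express where a stage sits conceptually, at that price, instead of being forced to encode its data dependencies.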

The third is Roslyn parity. The Roslyn API ships with the multi-pass driver because it has to: in the Roslyn world, a generator can emit code that is itself decorated for another generator (or for itself) to pick up on a subsequent pass, and the C# compiler's iteration count is the same circuit breaker this library's MAX_ITERATIONS is. Inheriting the multi-pass shape is what lets the analogue table from Part 01 hold at the level of API contract rather than at the level of similar-looking syntax. A library called "Roslyn-style" without a fixpoint loop would be Roslyn-style in syntax only.

What virtFS deliberately is not

Three properties a reader might expect virtFS to have, and the reasons it does not.

First, no transactional write/rollback at the virtFS level. A generator's addSource either succeeds (and the emission lives in virtFS) or fails (and virtFS is unchanged); there is no "begin/commit/abort" for a group of emissions inside a single generator. The atomicity that matters is at the whole run level — see Part 06. Per-generator transactions would complicate the API for no benefit; if one of a generator's emissions fails, the run fails, period.

Second, no asynchronous coordination between generators. Generators within an iteration run sequentially, in lex order. Two generators could in principle run in parallel if their dependencies allowed, but the runner does not exploit that — sequential execution is simpler, deterministic, and fast enough for the example's scale. A future revision might introduce parallelism behind a config flag; the API permits it but does not require it.

Third, no streaming or incremental reads of user code. ts-morph is loaded once at startup with the full project; subsequent iterations re-use the same Project handle with whatever emissions have been added. There is no equivalent of Roslyn's IIncrementalGenerator (which caches at the level of individual syntax-node hashes). The trade-off is argued in Part 12: TS workloads at the scale this library targets do not yet need that complexity, and the absence keeps the engine simple.

Bridge

Part 04 returns to the additive contract. With ten producers writing into a shared virtual filesystem, the question becomes: what stops them from stepping on each other? The answer is a small set of explicit mechanisms (the SG0020 and SG0040 error codes, and prune-on-delete semantics) and the contract that producer identity is the unit of conflict resolution. Every behaviour walked in this article assumes that contract; Part 04 makes it explicit.

The Feature for this article is FEAT-TSGEN-03 in assets/features.ts. Acceptance criteria: virtFS as in-memory ts-morph project explained; fixpoint as termination criterion contrasted with counters; three-iteration convergence walked stage by stage; backward-edge lex-ordering mechanic explained. Each section above maps to one of those ACs.
