04 — EmitterRegistry + grammar emitter
Article 01 wrote the decorators. Article 02 wrote the extractor. Article 03 froze the IR into a contract. This article opens the production side of the chain: given a LanguageIR, how does @frenchexdev/ide-forge turn it into files that a VSCode extension can ship? The answer is a collection of small strategies — one per artefact kind — coordinated by a registry whose shape is deliberately identical to the one the @frenchexdev/requirements package already uses for its test scaffolders. Same posture, new domain. The first concrete emitter in the series, built in this article, produces the TextMate grammar JSON that gives a language its syntax highlighting; the second half of the article shows where TextMate runs out of road and semantic tokens pick up.
The argument worth stating up front: TextMate grammar is not a parser. It is a regex-based scanner that labels ranges of text with scope names like keyword.control.loop.mydsl or string.quoted.double.mydsl. VSCode's default theme maps those scopes to colours. For the 80% of highlighting a DSL needs — keywords, strings, numbers, identifiers, comments — the scanner is adequate, cheap, and works without any language-server cooperation. For the remaining 20% — a variable used as a type vs a variable used as a value, a function call vs a function definition, a keyword that is conditional on a preceding token — TextMate cannot express the distinction and the grammar starts to lie. That is where VSCode's semantic tokens API, stable since LSP 3.16, takes over. This article builds the grammar emitter and documents the handoff.
The design-series companion is 04 — SOLID in the monorepo patterns: that article argued that every registry in the @frenchexdev/* monorepo follows the same shape — port, register, list, domain-specific entrypoint — and that the shape is not an accident, it is the Open/Closed Principle made mechanical. This article applies the shape to emitters.
REQ-IDEDSL-EMITTER-REGISTRY — the Requirement
REQ-IDEDSL-EMITTER-REGISTRY: The @frenchexdev/ide-forge package shall expose an EmitterRegistry class that (a) holds a set of Emitter instances indexed by a unique kind string, (b) rejects double-registration of the same kind with a thrown Error, (c) supports register, unregister, list, and runAll(ir, fs, outDir) operations, (d) delegates all file-writing to the injected FileSystem port so no emitter touches node:fs directly, and (e) allows third-party packages to register new emitters at application startup without modifying ide-forge sources. The first concrete emitter, GrammarEmitter, shall consume IRToken[] from the LanguageIR and produce a TextMate grammar JSON conforming to the schema implied by VSCode's grammars contribution point.
Rationale: seven or more artefacts (the TextMate grammar, the package.json manifest, the snippets JSON, the LSP server scaffold, the extension host entry point, the client configuration, the .vsix packager) all read the same IR and write to the same output directory. If those emitters are spaghetti-wired in a single 600-line function, adding an eighth emitter means editing that function, which violates OCP and invites merge conflicts in the middle of the critical path. A registry pattern keeps each emitter a narrow strategy the size of a single file, keeps them independently testable against a fake FileSystem, and makes the set of artefacts an ide-dsl language produces a datum the monorepo can inspect rather than a control-flow fact buried in a build script. The kind uniqueness constraint mirrors the ScaffolderRegistry.byLevel map in packages/requirements/src/cli/scaffolders/registry.ts and exists for the same reason: conflicts between emitters targeting the same artefact are bugs, and should surface at registration time, not at write time.
Fit criteria: registry.register(e) throws on duplicate kind; registry.list() returns the registered emitters in insertion order; registry.runAll(ir, fs, outDir) awaits each emit() sequentially and resolves only after the last write; a fake FileSystem capturing writes to an in-memory map suffices to unit-test every emitter without node:fs; GrammarEmitter.emit(ir, fs, outDir) writes exactly one file named syntaxes/<languageId>.tmLanguage.json whose content parses as valid TextMate grammar JSON and whose patterns[] array contains one entry per IRToken in the input IR.
Verification: Test. Refines REQ-IDEDSL-IR-VERSIONED-CONTRACT.
The refinement chain is deliberate. REQ-IDEDSL-IR-VERSIONED-CONTRACT from article 03 said the IR is a versioned, schema-validated, branded-primitive contract. This Requirement refines that into the consumer side: the emitters that read the IR must do so through a uniform port, and the aggregate that coordinates them must be open to extension without modification. Without a registry, the contract on the data side would be accompanied by an ad-hoc soup on the production side, and the guarantees article 03 earned would leak away the moment a new artefact had to be emitted.
The fit criteria are load-bearing. "Throws on duplicate kind" catches the silent overwrite bug: a downstream package stealing the 'grammar' slot from the built-in emitter would go unnoticed until the generated .vsix shipped with an unexpected grammar. "In-memory map suffices" enforces the FileSystem port: if any emitter reaches for node:fs directly, the unit suite fails to compile under the fake injection. The operational difference is between a codebase where adding an emitter is a 50-line PR and one where it is a week of untangling.
FEAT-IDEDSL-04 — the satisfying Feature
// packages/ide-forge/requirements/features/emitter-registry.ts
import { Feature, Priority, Satisfies, type ACResult } from '@frenchexdev/requirements';
import { ReqIdeDslEmitterRegistryRequirement } from '../requirements/req-idedsl-emitter-registry.js';

@Satisfies(ReqIdeDslEmitterRegistryRequirement)
export abstract class IdeForgeEmitterRegistryFeature extends Feature {
  readonly id = 'FEAT-IDEDSL-04';
  readonly title = 'EmitterRegistry — strategy collection, FileSystem port, first concrete grammar emitter';
  readonly priority = Priority.Critical;

  // ── Registry shape ──
  abstract registryExposesRegisterUnregisterListRunAll(): ACResult;
  abstract registryRejectsDuplicateKindAtRegistrationTime(): ACResult;
  abstract registryPreservesInsertionOrderInList(): ACResult;

  // ── Port discipline ──
  abstract emittersReceiveAFileSystemPortAndNeverImportNodeFs(): ACResult;
  abstract runAllAwaitsEachEmitSequentially(): ACResult;

  // ── Grammar emitter ──
  abstract grammarEmitterProducesOneTmLanguageJsonPerLanguage(): ACResult;
  abstract grammarEmitterMapsEveryIrTokenToOnePatternsEntry(): ACResult;
  abstract grammarEmitterScopeNamesFollowTextMateConvention(): ACResult;

  // ── Semantic tokens fallback ──
  abstract semanticTokensLegendIsEmittedWhenIrRequestsIt(): ACResult;
  abstract grammarPatternAndSemanticScopeAreExclusivePerRange(): ACResult;
}
Ten ACs, four clusters. The registry-shape cluster nails the observable surface of EmitterRegistry. The port-discipline cluster enforces the FileSystem injection: it holds the test that would fail loudest if an emitter reached for node:fs. The grammar-emitter cluster drives the concrete strategy this article builds. The fallback cluster scopes the semantic-tokens handoff the second half of the article describes. The Test verification clause means each AC becomes a @FeatureTest/@Verifies method in the unit suite; article 12 writes them out in full.
The Emitter port
Every emitter in @frenchexdev/ide-forge implements the same two-property interface. Nothing more. The shape is small on purpose — smaller ports are easier to fake, easier to mock, and the Interface Segregation Principle rewards them.
// packages/ide-forge/src/emitters/port.ts
import type { LanguageIR } from '../ir.js';

export interface FileSystem {
  writeFile(path: string, content: string): Promise<void>;
  mkdir(path: string, opts?: { recursive?: boolean }): Promise<void>;
  exists(path: string): Promise<boolean>;
}

export interface Emitter {
  readonly kind: string; // unique stable name: 'grammar' | 'manifest' | 'snippets' | 'lsp-server' | 'extension-host' | ...
  emit(ir: LanguageIR, fs: FileSystem, outDir: string): Promise<void>;
}
Three methods on FileSystem, because those three are all any emitter in the series needs. The temptation to fatten the port with readFile, readdir, and the rest of node:fs should be resisted until some emitter actually calls them; Liskov substitutability gets cheaper the smaller the contract. Production injects a thin adapter over node:fs/promises; tests inject a Map<string, string> wrapped in five lines.
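A minimal sketch of the Map-backed fake described above. The name createFakeFileSystem and the writes property match what the unit test later in this article imports, but the body here is an assumption for illustration, not the project's actual helper; the dirs set is likewise hypothetical.

```typescript
// Port as declared in port.ts, repeated so the sketch is self-contained.
interface FileSystem {
  writeFile(path: string, content: string): Promise<void>;
  mkdir(path: string, opts?: { recursive?: boolean }): Promise<void>;
  exists(path: string): Promise<boolean>;
}

// Hypothetical fake: records writes and created directories in memory.
interface FakeFileSystem extends FileSystem {
  readonly writes: Map<string, string>;
  readonly dirs: Set<string>;
}

function createFakeFileSystem(): FakeFileSystem {
  const writes = new Map<string, string>();
  const dirs = new Set<string>();
  return {
    writes,
    dirs,
    // Last write wins, as on a real file system.
    async writeFile(path, content) {
      writes.set(path, content);
    },
    // The recursive option is accepted but irrelevant for an in-memory set.
    async mkdir(path) {
      dirs.add(path);
    },
    async exists(path) {
      return writes.has(path) || dirs.has(path);
    },
  };
}
```

Because the fake exposes writes directly, a test can assert on both the set of paths and their contents without touching disk.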
kind is a plain string rather than a union type on purpose. The built-ins ide-forge ships use a small fixed vocabulary — grammar, manifest, snippets, lsp-server, extension-host, vsix — but narrowing the type to that union would block a third-party package from registering a custom emitter without forking. String-keyed registries trade a small amount of type safety for the Open/Closed Principle; the runtime duplicate-detection is what makes the trade safe.
The EmitterRegistry class
The registry mirrors packages/requirements/src/cli/scaffolders/registry.ts almost line for line. That file has been in production in the monorepo for months, its shape has survived three refactors, and its unit tests pin the behaviour this article's tests will pin for emitters. Rather than invent a new coordination pattern, @frenchexdev/ide-forge copies the one that works.
// packages/ide-forge/src/emitters/registry.ts
import type { LanguageIR } from '../ir.js';
import type { Emitter, FileSystem } from './port.js';

export class EmitterRegistry {
  private readonly emitters = new Map<string, Emitter>();

  register(emitter: Emitter): void {
    if (this.emitters.has(emitter.kind)) {
      throw new Error(`Emitter kind '${emitter.kind}' already registered`);
    }
    this.emitters.set(emitter.kind, emitter);
  }

  unregister(kind: string): void {
    this.emitters.delete(kind);
  }

  list(): readonly Emitter[] {
    return [...this.emitters.values()];
  }

  async runAll(ir: LanguageIR, fs: FileSystem, outDir: string): Promise<void> {
    for (const e of this.emitters.values()) {
      await e.emit(ir, fs, outDir);
    }
  }
}
Four methods. unregister and list are one line each; register adds a duplicate check in front of its set. The fourth, runAll, is a three-line for...of loop that awaits each emit in turn (a plain await inside the loop, not the for await construct, which is for async iterables). The sequential await is deliberate: emitters that write to non-overlapping paths would be safe to run concurrently, but the performance gain on a six-emitter chain is negligible and concurrent writes make error messages harder to correlate when one fails. Linear-and-predictable beats clever; Promise.all is a five-line change away if a bottleneck shows up.
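For comparison, here is a sketch of what the concurrent variant might look like. runAllConcurrent is a hypothetical free function, not part of the registry shown above, and LanguageIR is reduced to a single field. It wraps each emit so that a failure is reported together with its emitter kind, which is one possible answer to the error-correlation concern, not the project's chosen one.

```typescript
// Minimal stand-ins for the article's types.
interface LanguageIR {
  readonly languageId: string;
}
interface FileSystem {
  writeFile(path: string, content: string): Promise<void>;
  mkdir(path: string, opts?: { recursive?: boolean }): Promise<void>;
  exists(path: string): Promise<boolean>;
}
interface Emitter {
  readonly kind: string;
  emit(ir: LanguageIR, fs: FileSystem, outDir: string): Promise<void>;
}

// Hypothetical: start every emitter at once, wait for all, report failures by kind.
async function runAllConcurrent(
  emitters: readonly Emitter[],
  ir: LanguageIR,
  fs: FileSystem,
  outDir: string,
): Promise<void> {
  const outcomes = await Promise.all(
    emitters.map(async (e) => {
      try {
        await e.emit(ir, fs, outDir);
        return null; // success
      } catch (err) {
        return `${e.kind}: ${String(err)}`; // failure, tagged with its kind
      }
    }),
  );
  const failures = outcomes.filter((o): o is string => o !== null);
  if (failures.length > 0) {
    throw new Error(`Emitters failed:\n${failures.join('\n')}`);
  }
}
```

Unlike the sequential loop, this variant lets the remaining emitters finish even when one fails, so a partial output directory is possible; whether that is acceptable depends on the build.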
The comparison to ScaffolderRegistry is worth making explicit. That file's public interface is { get, list, register }, keyed by TestLevel, a closed union over 'unit' | 'functional' | 'e2e' | 'a11y' | 'i18n' | 'visual' | 'perf'. EmitterRegistry keys by an open string instead, because the emitter vocabulary is not closed in the same way. Both registries share the register/list core, the Map-backed implementation, the "no global state unless you ask for it" factory posture, and the unit-test strategy of constructing a fresh registry in every test.
The grammar emitter
TextMate grammars are old, and their age shows. The format was invented by Allan Odgaard for the TextMate editor on macOS in the mid-2000s; it was shipped as a plist, absorbed by Sublime Text, ported to JSON by VSCode's engineering team, and wrapped by the grammars contribution point in package.json. Twenty years later it is still the primary mechanism by which any editor that inherits the VSCode stack — VSCodium, Cursor, Gitpod, many others — highlights syntax without running a language server. Its longevity is not a sign of perfection; it is a sign that nothing better has dislodged it.
The engine underneath is Oniguruma, a regex library written in C by K. Kosako, originally for Ruby. Its dialect overlaps heavily with PCRE without being a strict superset of it, and it supports lookbehind, named captures, and \G anchors, which is why TextMate patterns can do more than a naive reader would assume. It is also why TextMate patterns are a notorious source of catastrophic-backtracking bugs: an unbounded greedy quantifier in a begin/end pattern can lock the rendering thread for seconds on pathological input. The grammar emitter in this article sidesteps most of that complexity by sticking to single-line match patterns derived from the IRToken.pattern field, which article 01's decorator surface already constrains to regex literals the linter has validated.
VSCode's grammars contribution point is documented at https://code.visualstudio.com/api/language-extensions/syntax-highlight. The shape it expects is a JSON file referenced from package.json under contributes.grammars, with a scopeName that by convention starts with source. (or text. for markup languages) and a patterns[] array whose entries each carry either a match regex plus a name scope, or a begin/end pair for multi-line constructs. The scope naming convention (keyword.control, string.quoted.double, entity.name.function, and so on) is the TextMate scope selectors cheat sheet that every theme in the VSCode ecosystem implicitly depends on. A well-behaved grammar emitter produces scope names that match the convention; a misbehaving one invents funky.custom.scope names that no theme has opinions about, and the user sees their DSL rendered in the default foreground colour.
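To make the contribution shape concrete, here is a sketch of the package.json fragment that points VSCode at a grammar file. The language, scopeName, and path fields are the ones the grammars contribution point defines; the grammarContribution helper and the mydsl id are hypothetical, and in this series the manifest itself belongs to a later emitter.

```typescript
// One entry of contributes.grammars in an extension's package.json.
interface GrammarContribution {
  readonly language: string; // language id declared under contributes.languages
  readonly scopeName: string; // root scope of the grammar file
  readonly path: string; // grammar file path, relative to the extension root
}

// Hypothetical helper: derive the entry from a language id.
function grammarContribution(languageId: string): GrammarContribution {
  return {
    language: languageId,
    scopeName: `source.${languageId}`,
    path: `./syntaxes/${languageId}.tmLanguage.json`,
  };
}

// The fragment as it would appear once serialised into package.json.
const fragment = {
  contributes: { grammars: [grammarContribution('mydsl')] },
};
console.log(JSON.stringify(fragment, null, 2));
```

The scopeName in the manifest entry has to match the scopeName inside the grammar file itself, which is why both are derived from the same language id here.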
The emitter itself is straightforward. Read the IR. For every IRToken, produce a patterns[] entry whose match is the token's pattern and whose name is the token's scope. Wrap the list in a { scopeName, patterns } envelope. Serialise to JSON. Write one file.
// packages/ide-forge/src/emitters/grammar.ts
import type { LanguageIR } from '../ir.js';
import type { Emitter, FileSystem } from './port.js';

interface TmGrammar {
  readonly scopeName: string;
  readonly patterns: readonly TmPattern[];
}

interface TmPattern {
  readonly match: string;
  readonly name: string;
}

export const GrammarEmitter: Emitter = {
  kind: 'grammar',
  async emit(ir, fs, outDir) {
    const grammar: TmGrammar = {
      scopeName: `source.${ir.languageId}`,
      patterns: ir.tokens.map((t) => ({
        match: t.pattern,
        name: `${t.scope}.${ir.languageId}`,
      })),
    };
    const dir = `${outDir}/syntaxes`;
    await fs.mkdir(dir, { recursive: true });
    await fs.writeFile(
      `${dir}/${ir.languageId}.tmLanguage.json`,
      `${JSON.stringify(grammar, null, 2)}\n`,
    );
  },
};
Twenty lines, no control flow beyond a map. The scope-name suffixing, ${t.scope}.${ir.languageId}, is the TextMate convention that makes the DSL-specific scope nestable under the standard selectors: a theme that styles keyword.control styles the DSL's keyword.control.mydsl for free. The recursive mkdir is defensive; the trailing newline is a concession to Unix tooling (diff, git, prettier) that assumes one.
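A worked example may help. The sketch below factors the mapping into a pure function and runs it on a two-token IR. The buildGrammar name, the sample tokens, and the reduced IR types are assumptions for illustration; only the scopeName and patterns shape comes from the emitter above.

```typescript
// Reduced IR types: only the fields the grammar mapping reads.
interface IRToken {
  readonly name: string;
  readonly pattern: string;
  readonly scope: string;
}
interface LanguageIR {
  readonly languageId: string;
  readonly tokens: readonly IRToken[];
}

interface TmPattern {
  readonly match: string;
  readonly name: string;
}
interface TmGrammar {
  readonly scopeName: string;
  readonly patterns: readonly TmPattern[];
}

// Hypothetical pure helper mirroring the mapping inside GrammarEmitter.emit.
function buildGrammar(ir: LanguageIR): TmGrammar {
  return {
    scopeName: `source.${ir.languageId}`,
    patterns: ir.tokens.map((t) => ({
      match: t.pattern,
      name: `${t.scope}.${ir.languageId}`, // suffix the scope with the language id
    })),
  };
}

// Hypothetical sample IR with one keyword and one numeric literal.
const sampleIr: LanguageIR = {
  languageId: 'mydsl',
  tokens: [
    { name: 'While', pattern: '\\bwhile\\b', scope: 'keyword.control.loop' },
    { name: 'Number', pattern: '\\b\\d+\\b', scope: 'constant.numeric' },
  ],
};

const sampleGrammar = buildGrammar(sampleIr);
console.log(JSON.stringify(sampleGrammar, null, 2));
```

The first pattern comes out with the name keyword.control.loop.mydsl and the second with constant.numeric.mydsl, both under the root scope source.mydsl.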
Snapshot tests against the TextMate schema
The grammar emitter is small enough that a unit test against its output is more valuable than a unit test against its internals. The testing posture borrowed from @frenchexdev/requirements is snapshot-flavoured: feed a fixture IR through the emitter, capture the output, compare byte-for-byte against a committed JSON file in test/snapshots/grammar/. The snapshot lives next to the test; a regression in the emitter or a drift in the IR shape surfaces as a reviewable diff in the PR rather than as a runtime failure at .vsix build time.
What the snapshot test does not do is validate the output against the VSCode grammar JSON schema. That schema — maintained alongside the VSCode codebase and published as a versioned JSON Schema file under the vscode-textmate repository — catches a different class of bug: a missing scopeName, a patterns[] element with neither match nor begin/end, a captures block with non-numeric keys. A second test in the suite runs ajv against that schema over the emitter output for every fixture, and fails with a readable pointer when the emitter produces something the grammar loader will refuse. The two tests are complementary: the snapshot catches "did the output change?", the schema catches "is the output valid?". Neither replaces the other.
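The kind of check the schema test performs can be sketched without a schema library. The function below is a hand-rolled stand-in, not the ajv run described above and not the real schema: it covers only the three failure modes just named, and checkGrammarShape is a hypothetical name. It also accepts include-only pattern entries, which TextMate grammars allow.

```typescript
// Returns human-readable problems with JSON-pointer-like prefixes; empty means the shape passed.
function checkGrammarShape(grammar: unknown): string[] {
  if (typeof grammar !== 'object' || grammar === null) {
    return ['/: grammar is not an object'];
  }
  const problems: string[] = [];
  const g = grammar as Record<string, unknown>;

  const scopeName = g['scopeName'];
  if (typeof scopeName !== 'string' || scopeName.length === 0) {
    problems.push('/scopeName: missing or empty');
  }

  const rawPatterns = g['patterns'];
  if (!Array.isArray(rawPatterns)) {
    problems.push('/patterns: not an array');
    return problems;
  }

  rawPatterns.forEach((p: unknown, i: number) => {
    const entry = (typeof p === 'object' && p !== null ? p : {}) as Record<string, unknown>;
    const hasMatch = typeof entry['match'] === 'string';
    const hasBeginEnd =
      typeof entry['begin'] === 'string' && typeof entry['end'] === 'string';
    const hasInclude = typeof entry['include'] === 'string';
    if (!hasMatch && !hasBeginEnd && !hasInclude) {
      problems.push(`/patterns/${i}: needs match, begin/end, or include`);
    }
    // Capture groups are keyed by group number, so every key must be numeric.
    const captures = entry['captures'];
    if (typeof captures === 'object' && captures !== null) {
      for (const key of Object.keys(captures)) {
        if (!/^\d+$/.test(key)) {
          problems.push(`/patterns/${i}/captures/${key}: non-numeric key`);
        }
      }
    }
  });
  return problems;
}
```

A real schema run catches far more than this, but the pointer-style messages show the shape of failure output that makes such a test useful in review.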
// packages/ide-forge/test/unit/emitters/grammar.test.ts
import { expect } from 'vitest';
import { FeatureTest, Verifies } from '@frenchexdev/requirements';
import { IdeForgeEmitterRegistryFeature } from '../../../requirements/features/emitter-registry.js';
import { GrammarEmitter } from '../../../src/emitters/grammar.js';
import { createFakeFileSystem } from '../helpers/fake-fs.js';
import { fixtureIr } from '../fixtures/minimal-dsl.js';

@FeatureTest(IdeForgeEmitterRegistryFeature)
export class GrammarEmitterTests {
  @Verifies('grammarEmitterProducesOneTmLanguageJsonPerLanguage')
  async one_file_per_language(): Promise<void> {
    const fs = createFakeFileSystem();
    await GrammarEmitter.emit(fixtureIr, fs, '/out');
    const path = `/out/syntaxes/${fixtureIr.languageId}.tmLanguage.json`;
    // Exactly one write, at the expected path.
    expect(fs.writes.has(path)).toBe(true);
    expect(fs.writes.size).toBe(1);
  }

  @Verifies('grammarEmitterMapsEveryIrTokenToOnePatternsEntry')
  async one_patterns_entry_per_token(): Promise<void> {
    const fs = createFakeFileSystem();
    await GrammarEmitter.emit(fixtureIr, fs, '/out');
    const written = fs.writes.get(`/out/syntaxes/${fixtureIr.languageId}.tmLanguage.json`)!;
    const parsed = JSON.parse(written) as { patterns: unknown[] };
    // One patterns[] entry per IRToken in the fixture.
    expect(parsed.patterns.length).toBe(fixtureIr.tokens.length);
  }
}
Zero describe, zero it. Two ACs, two methods. The @FeatureTest decorator registers the class against the Feature, and @Verifies pins each method to a named AC, so the @frenchexdev/requirements compliance report can compute coverage on the spot. An AC that no @Verifies method points at is a compliance gap and surfaces in npx requirements compliance --strict as an unverified AC. That is the dog-fooding the project leans on: the same traceability tooling ide-forge ships is the tooling that keeps ide-forge honest.
Semantic tokens — the fallback
TextMate reaches its limit at the syntactic level. A DSL that distinguishes value expressions from type expressions using the same identifier grammar cannot express the distinction in a regex: the identifier Foo is labelled identically whether it occurs on the right of a : (type position) or on the left of an = (value position). The scope would be the same, the colour would be the same, and a user reading the code would have no visual cue about the syntactic role. This is a class of bug TextMate cannot solve, because solving it requires a parser.
VSCode's answer, stable since LSP 3.16 (December 2020), is the semantic tokens API. The language server produces a list of (line, startChar, length, tokenType, tokenModifiers) tuples — one per syntactic element whose role cannot be inferred from lexical shape alone — and VSCode renders them on top of the TextMate highlighting as a second layer. The layering order matters: TextMate runs first, produces a first pass, and semantic tokens override ranges the language server cares to override. Ranges the server does not mention keep their TextMate colour.
The LSP specification for semantic tokens introduces a SemanticTokensLegend: a pair of arrays mapping integer indices to tokenType names (variable, property, function, class, type, parameter, keyword, comment, and fifteen other predefined names as of LSP 3.17) and to tokenModifier names (declaration, readonly, static, deprecated, and six others). The server encodes its tokens as integers against the legend and sends them as a flat array, five integers per token, with each position expressed relative to the previous token; the client decodes, applies the theme, and renders. The compact encoding is the wire protocol's concession to the reality that a 10,000-line file may emit 50,000 semantic tokens, and sending them as JSON objects would swamp the IPC channel.
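The integer encoding is easier to see in code. In the LSP specification each token becomes five integers: line delta, start-character delta (relative to the previous token only when both sit on the same line), length, token-type index, and a bitset of modifier indices. The sketch below implements that relative encoding; the AbsoluteToken shape, the encodeSemanticTokens name, and the sample legend are assumptions for illustration.

```typescript
interface SemanticTokensLegend {
  readonly tokenTypes: readonly string[];
  readonly tokenModifiers: readonly string[];
}

// Hypothetical input shape: absolute positions, names instead of indices.
interface AbsoluteToken {
  readonly line: number;
  readonly startChar: number;
  readonly length: number;
  readonly tokenType: string;
  readonly tokenModifiers: readonly string[];
}

function encodeSemanticTokens(
  tokens: readonly AbsoluteToken[],
  legend: SemanticTokensLegend,
): number[] {
  // The relative encoding only works if tokens are in document order.
  const sorted = [...tokens].sort(
    (a, b) => a.line - b.line || a.startChar - b.startChar,
  );
  const data: number[] = [];
  let prevLine = 0;
  let prevChar = 0;
  for (const t of sorted) {
    const typeIndex = legend.tokenTypes.indexOf(t.tokenType);
    if (typeIndex < 0) throw new Error(`Unknown token type '${t.tokenType}'`);
    let modifierBits = 0;
    for (const m of t.tokenModifiers) {
      const bit = legend.tokenModifiers.indexOf(m);
      if (bit < 0) throw new Error(`Unknown token modifier '${m}'`);
      modifierBits |= 1 << bit; // one bit per modifier index
    }
    const deltaLine = t.line - prevLine;
    // Start is relative to the previous token only on the same line.
    const deltaStart = deltaLine === 0 ? t.startChar - prevChar : t.startChar;
    data.push(deltaLine, deltaStart, t.length, typeIndex, modifierBits);
    prevLine = t.line;
    prevChar = t.startChar;
  }
  return data;
}

const sampleLegend: SemanticTokensLegend = {
  tokenTypes: ['property', 'type', 'class'],
  tokenModifiers: ['private', 'static'],
};
const encoded = encodeSemanticTokens(
  [
    { line: 2, startChar: 5, length: 3, tokenType: 'property', tokenModifiers: ['private', 'static'] },
    { line: 2, startChar: 10, length: 4, tokenType: 'type', tokenModifiers: [] },
    { line: 5, startChar: 2, length: 7, tokenType: 'class', tokenModifiers: [] },
  ],
  sampleLegend,
);
console.log(encoded); // [2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]
```

Three tokens become fifteen small integers, and because most deltas are small the array stays compact however long the file gets.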
In @frenchexdev/ide-forge, the semantic-tokens fallback is declared in the IR rather than assumed. The IRLspFeature union from article 03 includes a 'semantic-tokens' variant whose payload is the legend the server will emit. A second emitter (in a later article, not this one) consumes that payload and generates the server-side handler; the grammar emitter need only respect the invariant that a range covered by a semantic-token scope should not also be covered by a conflicting TextMate scope, because the two would layer and produce inconsistent colouring on narrow rendering edges. The AC grammarPatternAndSemanticScopeAreExclusivePerRange in FEAT-IDEDSL-04 captures that invariant.
Proposal (write-in-public). The semantic-tokens legend is currently a free-form array in the IR. A stronger discipline would type it against the LSP 3.17 standard token types as a union, with an escape hatch for the server to declare custom types. That would catch at compile time the bug where a DSL author writes 'variable.readonly' as a type (it is a modifier, not a type), which today the emitter would pass through verbatim and which would produce a silent theme miss at runtime. The argument against typing it is that LSP versions will drift and the union will need to follow; the argument for is that the miss is silent and the cost of the drift is a single string-array update per LSP bump. I am inclined toward typing it, but the series has not yet committed. Article 07 (the LSP server scaffold) will close this.
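What the typed variant could look like, as a sketch only. The 23 names are the predefined token types of LSP 3.17; the StandardTokenType and CustomTokenType names and the custom: prefix convention are hypothetical and not something the series has committed to.

```typescript
// Predefined token types in LSP 3.17.
const STANDARD_TOKEN_TYPES = [
  'namespace', 'type', 'class', 'enum', 'interface', 'struct', 'typeParameter',
  'parameter', 'variable', 'property', 'enumMember', 'event', 'function',
  'method', 'macro', 'keyword', 'modifier', 'comment', 'string', 'number',
  'regexp', 'operator', 'decorator',
] as const;

type StandardTokenType = (typeof STANDARD_TOKEN_TYPES)[number];

// Hypothetical escape hatch: custom types must carry an explicit prefix.
type CustomTokenType = `custom:${string}`;
type LegendTokenType = StandardTokenType | CustomTokenType;

interface TypedLegend {
  readonly tokenTypes: readonly LegendTokenType[];
  readonly tokenModifiers: readonly string[];
}

const typedLegend: TypedLegend = {
  tokenTypes: ['variable', 'type', 'custom:unit'],
  tokenModifiers: ['readonly'],
};

// Rejected by the compiler: 'variable.readonly' is neither standard nor prefixed.
// const broken: TypedLegend = { tokenTypes: ['variable.readonly'], tokenModifiers: [] };

// Runtime guard for legends that arrive as untyped JSON.
function isLegendTokenType(value: string): value is LegendTokenType {
  return (
    value.startsWith('custom:') ||
    (STANDARD_TOKEN_TYPES as readonly string[]).includes(value)
  );
}
```

The drift cost mentioned in the proposal shows up here as a one-line change: a new LSP version means appending to STANDARD_TOKEN_TYPES.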
SOLID lens
Single Responsibility. The EmitterRegistry has one job: hold emitters by kind and dispatch to them. It does not decide what files to produce, how to format JSON, or how scopes map to colours. The GrammarEmitter has one job: turn IRToken[] into a TextMate JSON file. It does not decide how VSCode loads grammars or how themes render scopes. The port separation from FileSystem gives the emitter one more axis of single-responsibility: it decides what content to write, not how writing is implemented.
Open/Closed. A third-party package adding a documentation emitter calls registry.register({ kind: 'readme', emit: ... }) and is done. Zero edits to @frenchexdev/ide-forge. The ScaffolderRegistry pattern in packages/requirements/src/cli/scaffolders/registry.ts demonstrates the same property in production: the seven built-in scaffolders ship with the package, but the register() method on the registry lets a consumer add an eighth without forking the registry class. The OCP proof in both cases is the same: find the register method, read its body, observe that it does not enumerate known kinds — it accepts any Emitter whose shape matches the port. That is open for extension. The runAll method dispatches through the map without a switch statement on kind. That is closed for modification.
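The extension path can be sketched end to end. The registry below is a trimmed copy of the class shown earlier (unregister omitted); ReadmeEmitter, its output path, and the reduced LanguageIR are hypothetical.

```typescript
interface LanguageIR {
  readonly languageId: string;
}
interface FileSystem {
  writeFile(path: string, content: string): Promise<void>;
  mkdir(path: string, opts?: { recursive?: boolean }): Promise<void>;
  exists(path: string): Promise<boolean>;
}
interface Emitter {
  readonly kind: string;
  emit(ir: LanguageIR, fs: FileSystem, outDir: string): Promise<void>;
}

class EmitterRegistry {
  private readonly emitters = new Map<string, Emitter>();

  // Accepts any Emitter; never enumerates known kinds.
  register(emitter: Emitter): void {
    if (this.emitters.has(emitter.kind)) {
      throw new Error(`Emitter kind '${emitter.kind}' already registered`);
    }
    this.emitters.set(emitter.kind, emitter);
  }

  list(): readonly Emitter[] {
    return [...this.emitters.values()];
  }

  // Dispatches through the map in insertion order, no switch on kind.
  async runAll(ir: LanguageIR, fs: FileSystem, outDir: string): Promise<void> {
    for (const e of this.emitters.values()) {
      await e.emit(ir, fs, outDir);
    }
  }
}

// Hypothetical third-party emitter, defined outside ide-forge.
const ReadmeEmitter: Emitter = {
  kind: 'readme',
  async emit(ir, fs, outDir) {
    await fs.mkdir(outDir, { recursive: true });
    await fs.writeFile(`${outDir}/README.md`, `# ${ir.languageId}\n`);
  },
};

const registry = new EmitterRegistry();
registry.register(ReadmeEmitter); // no edit to the registry class required
```

Registering ReadmeEmitter a second time throws at registration, which is the duplicate-kind guard doing its job before any file is written.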
Liskov Substitution. Any object satisfying the Emitter interface can be registered. The registry does not introspect kind beyond its uniqueness. The GrammarEmitter, the ManifestEmitter from article 05, the .vsix emitter from article 06, and a hypothetical third-party LicenseReportEmitter are all substitutable — swapping one for another at the registry level does not affect correctness of the other emitters.
Interface Segregation. The FileSystem port has three methods. The Emitter port has two members. Neither forces its implementers to depend on capabilities they do not use. A future StreamingEmitter that writes via a Writable stream rather than through writeFile would justify a second narrower port — StreamingFileSystem — rather than fattening the existing one.
Dependency Inversion. @frenchexdev/ide-forge depends on the abstraction FileSystem, not on node:fs. Production wires a NodeFileSystem adapter at the composition root (the build.ts script); tests wire a fake Map-backed one. The emitters themselves have never heard of node:fs and would break the type checker if they imported it.
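A sketch of what the production adapter at the composition root might look like. The NodeFileSystem name comes from the paragraph above, but this body is an assumption, not the project's actual adapter; it uses only writeFile, mkdir, and access from node:fs/promises.

```typescript
import {
  access,
  mkdir as nodeMkdir,
  writeFile as nodeWriteFile,
} from 'node:fs/promises';

// Port as declared in port.ts, repeated so the sketch is self-contained.
interface FileSystem {
  writeFile(path: string, content: string): Promise<void>;
  mkdir(path: string, opts?: { recursive?: boolean }): Promise<void>;
  exists(path: string): Promise<boolean>;
}

// Hypothetical adapter: the only place in the chain that imports node:fs.
const NodeFileSystem: FileSystem = {
  async writeFile(path, content) {
    await nodeWriteFile(path, content, 'utf8');
  },
  async mkdir(path, opts) {
    // node's mkdir resolves to a path or undefined; the port discards it.
    await nodeMkdir(path, { recursive: opts?.recursive ?? false });
  },
  async exists(path) {
    // access rejects when the path is missing or not accessible.
    try {
      await access(path);
      return true;
    } catch {
      return false;
    }
  },
};
```

Wiring then reduces to passing NodeFileSystem where the tests pass the fake, for example as the second argument to runAll.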
DRY lens
The registry shape is copied from packages/requirements/src/cli/scaffolders/registry.ts. That is not accidental duplication; it is deliberate alignment. Extracting a generic Registry<Key, Entry> class the two packages could share is tempting in the abstract and consistently a bad idea in practice: the ScaffolderRegistry keys on a closed TestLevel union, offers get(level), and ships with a DEFAULT_REGISTRY constant; the EmitterRegistry keys on an open string, offers runAll(ir, fs, outDir), and defers defaults to the caller. A shared generic would push the two packages into a dance of type gymnastics to account for the differences, and every change in one package would churn the other. The DRY principle is about knowledge, not shape — and the knowledge in question, "a registry collects strategies by key and dispatches to them", is simple enough that it is cheaper to rewrite the Map-backed implementation in each consumer than to share a generic that tries to accommodate both.
What the IR-plus-registry combination does remove, cleanly, is the duplication the grammar emitter would otherwise share with the manifest emitter, the snippets emitter, and the LSP-server emitter. Without the IR, each emitter would have to read the spec file and parse it itself; without the registry, the build script would have to switch-statement over every emitter kind and call its entry point by name. The IR centralises the reading. The registry centralises the dispatch. Each emitter becomes a 20-line strategy against one well-typed input and one narrow output port. That is the duplication the chain removes, and it is why the chain repays its design cost in fewer than three emitters.
What the next article earns
Article 05 is the manifest emitter: the emitter that reads ir.languageId, ir.displayName, the aggregated lspFeatures, executors, and so on, and produces the package.json fragment VSCode needs to see in the extension manifest — contributes.languages, contributes.grammars, contributes.configuration, activationEvents. It is the second concrete emitter, it lives in the same registry as GrammarEmitter, it tests the same way, and it is where the "one artefact per kind" discipline pays off: the manifest emitter never touches the grammar file; the grammar emitter never touches the manifest. Two narrow strategies, one registry, one IR, one output directory. That is the chain from article 03 producing its second downstream artefact under the same contract.
The cross-links back: article 01 defined the decorators whose @Token output article 04's grammar emitter ingests; article 02's extractor produces the exact IRToken[] the emitter maps over; article 03's contract is what makes the mapping total — every IRToken has a name, a pattern, and a scope, because the IR schema says so, and the emitter does not have to defend against the absence of any of them. The chain is starting to show its shape: each article stands on the Requirement the previous articles earned.
The design-series anchor for this article, once more, is 04 — SOLID in the monorepo patterns. If this article felt like the mechanical application of a pattern rather than an original design, that is because it is — and that is the point. The monorepo has four registries already (styles, scaffolders, FSM transitions, renderers). The emitter registry is the fifth. They all look the same, and they all look the same for the same reasons.