Part 06 — Language micro-DSL: identity, scope, extensions, comments, brackets

Part III opens here, with the smallest of the fourteen micro-DSLs. The smallness is intentional: starting with Language fixes the eight-section anatomy that the next thirteen articles follow without restating it, and demonstrates that small and load-bearing are not in tension. Every other micro-DSL (Syntax, Completion, Hover, Diagnostics, all of them) references the language identity declared here when it asks "which language are we serving?" The Language micro-DSL has one job, does it well, ships in fewer lines of code than its decorator file's documentation, and gets out of the way.

The eight sections (Concern, The Surface, Kernel boundary, Emitted artefacts, Composition with peers, MPS aspect referent, Boundary justification, Requirements) are fixed for the rest of Part III. A reader who works through article 06 carefully learns them by name; subsequent articles lean on the same shape so the reader can scan with practised eyes.

Concern

The VSCode editor needs to know, for any open file, "which language is this?" The answer drives a chain of downstream behaviours: which TextMate grammar to apply for highlighting, which language server to start, which snippets to expose, which when clauses to satisfy in the command palette, which file icon to render. The mechanism is the contributes.languages block in package.json plus, optionally, a language-configuration.json file declaring brackets, comments, indentation patterns, and on-enter rules.

Hand-writing this declaration for every new DSL is mechanical, error-prone, and disconnected from the DSL's metamodel. Mechanical, because the structure is fixed. Error-prone, because the language id must agree exactly with the language referenced by the syntax grammar's contribution, the onLanguage activation events in package.json, and the vscode.languages.registerXxx calls in the extension entry. Disconnected, because the metamodel already has all the information: it knows the file extensions are .req.ts, knows the language is named "Requirements", and knows the comments are // and /* */ because the host language is TypeScript. The Language micro-DSL reuses that information through the kernel and emits the declarations.

The Surface

import { Language } from '@frenchexdev/ide-dsl-language';

@Language({
  id: 'requirements',
  displayName: 'Requirements',
  scopeName: 'source.ts.requirements',
  extensions: ['.req.ts'],
  aliases: ['Requirements', 'req'],
  configuration: {
    comments: { line: '//', block: ['/*', '*/'] },
    brackets: [['{', '}'], ['[', ']'], ['(', ')']],
    autoClosingPairs: [
      { open: '{', close: '}' },
      { open: '[', close: ']' },
      { open: '(', close: ')' },
      { open: '"', close: '"', notIn: ['string'] },
      { open: "'", close: "'", notIn: ['string', 'comment'] },
    ],
    surroundingPairs: [['{', '}'], ['[', ']'], ['(', ')'], ['"', '"'], ["'", "'"]],
    indentation: { increaseIndentPattern: '^.*\\{[^}]*$', decreaseIndentPattern: '^\\s*\\}' },
  },
})
export class RequirementsLanguage {}

One decorator, one class, one object literal. The class has no fields, no methods — it is a marker. Everything the Language micro-DSL needs to know is in the decorator options. The decorator is pure metadata; calling new RequirementsLanguage() is legal but vacuous. The information lives in the registry the decorator populates at module-load time, picked up by the micro-DSL's extractor at build time.
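
How a metadata-only decorator can populate a registry at module-load time may be easier to see in code. The following is a minimal sketch, not the package's actual implementation: the languageDeclarations map, the trimmed LanguageOptions shape, and the extractLanguages helper are all hypothetical names introduced for illustration. The decorator is applied as a plain function call so the sketch runs regardless of which decorator mode the compiler is configured for.

```typescript
// Hypothetical option shape: only a subset of the fields shown above.
interface LanguageOptions {
  id: string;
  displayName: string;
  scopeName: string;
  extensions: string[];
  aliases?: string[];
}

// Assumed design: a module-level registry keyed by the decorated class.
const languageDeclarations = new Map<Function, LanguageOptions>();

// Decorator factory: records the options and leaves the class untouched.
// The optional second parameter lets the same function serve both the
// legacy and the standard decorator calling conventions.
function Language(options: LanguageOptions) {
  return (target: Function, _context?: unknown): void => {
    languageDeclarations.set(target, Object.freeze({ ...options }));
  };
}

// The marker class: no fields, no methods.
class RequirementsLanguage {}

// Equivalent in effect to writing `@Language({...})` above the class.
Language({
  id: 'requirements',
  displayName: 'Requirements',
  scopeName: 'source.ts.requirements',
  extensions: ['.req.ts'],
  aliases: ['Requirements', 'req'],
})(RequirementsLanguage);

// What a build-time extractor might do: read the registry and never
// instantiate the marker class.
function extractLanguages(): LanguageOptions[] {
  return Array.from(languageDeclarations.values());
}
```

The point of the sketch is that importing the module is enough: registration is a side effect of evaluation, and the class itself never needs to be constructed.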

The single sub-detail worth noting: scopeName follows the TextMate convention (source.<host>.<language>). The Syntax micro-DSL (article 07) emits a TextMate grammar whose root scope must match this name; cross-checking is part of the suite's integration tests. Choosing source.ts.requirements documents that the requirements DSL is hosted inside TypeScript — its files are valid .ts files — so editor injections can reference TypeScript scopes.
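
The cross-check between the declared scopeName and the grammar's root scope could look roughly like the sketch below. The function name, the two interfaces, and the exact pattern used to test the source.<host>.<language> shape are assumptions made for illustration; the suite's real integration tests are not shown in this article.

```typescript
// Hypothetical minimal views of the two artefacts being compared.
interface LanguageIdentity {
  id: string;
  scopeName: string;
}

interface EmittedGrammar {
  scopeName: string; // root scope of the emitted TextMate grammar
}

// Returns a list of human-readable problems; an empty list means agreement.
function checkScopeAgreement(
  language: LanguageIdentity,
  grammar: EmittedGrammar,
): string[] {
  const problems: string[] = [];

  // Assumed convention check: source.<host>.<language>, lower-case segments.
  if (!/^source\.[a-z0-9_-]+\.[a-z0-9_-]+$/.test(language.scopeName)) {
    problems.push(
      `scopeName "${language.scopeName}" does not follow source.<host>.<language>`,
    );
  }

  // The grammar's root scope must match the declared scope exactly.
  if (grammar.scopeName !== language.scopeName) {
    problems.push(
      `grammar root scope "${grammar.scopeName}" differs from "${language.scopeName}"`,
    );
  }

  return problems;
}
```

Returning a list rather than throwing on the first mismatch lets a test report every disagreement in one run.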

Kernel boundary

What this micro-DSL takes from the kernel:

  • One root @Concept declaration, marked as the language root by convention (the Concept's id is referenced by other micro-DSLs to scope their own contributions).
  • Nothing else. The Language micro-DSL does not consult the AST, the PatchBus, the EditLog, or the Banner.

What this micro-DSL gives back to the kernel:

  • The language identity is registered in the kernel's typed LanguageRegistry, an in-memory map exposed for read by every other micro-DSL. When the Snippets micro-DSL declares snippet scope requirements, it consults this registry to validate that the scope corresponds to a registered language. When the Completion micro-DSL filters its suggestions to "language === requirements", the registry resolves the predicate.

The boundary is deliberately one-way: Language publishes identity; everyone else reads it. Nothing the kernel does mutates a @Language declaration; the micro-DSL's contribution is purely additive. The reciprocal: the Language micro-DSL never reads any other micro-DSL's contributions. It does not know what tokens Syntax declared, what snippets Snippets emitted, what commands Commands registered. It declares identity; identity is downstream-consumable; that is the entire interaction.
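
The one-way boundary can be sketched as a registry with a single write path and a read-only view for everyone else. This is an illustration under assumptions: the article names LanguageRegistry but does not show its API, so publish, get, has, the LanguageRegistryReader interface, and validateSnippetScope are hypothetical.

```typescript
interface LanguageIdentity {
  readonly id: string;
  readonly scopeName: string;
  readonly extensions: readonly string[];
}

// The only surface other micro-DSLs would see: reads, no writes.
interface LanguageRegistryReader {
  get(id: string): LanguageIdentity | undefined;
  has(id: string): boolean;
}

class LanguageRegistry implements LanguageRegistryReader {
  private readonly byId = new Map<string, LanguageIdentity>();

  // Write path, intended for the Language micro-DSL only. Additive:
  // an identity is published once and never mutated afterwards.
  publish(identity: LanguageIdentity): void {
    if (this.byId.has(identity.id)) {
      throw new Error(`language "${identity.id}" is already registered`);
    }
    this.byId.set(identity.id, Object.freeze({ ...identity }));
  }

  get(id: string): LanguageIdentity | undefined {
    return this.byId.get(id);
  }

  has(id: string): boolean {
    return this.byId.has(id);
  }
}

// How a peer such as Snippets might validate a declared scope.
// Returns an error message, or undefined when the scope is valid.
function validateSnippetScope(
  registry: LanguageRegistryReader,
  scope: string,
): string | undefined {
  return registry.has(scope)
    ? undefined
    : `snippet scope "${scope}" is not a registered language`;
}
```

Handing peers the reader interface rather than the class keeps the publication path out of their reach at the type level.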

Emitted artefacts

Two outputs:

// package.json (excerpt, generated)
{
  "contributes": {
    "languages": [
      {
        "id": "requirements",
        "aliases": ["Requirements", "req"],
        "extensions": [".req.ts"],
        "configuration": "./language-configuration.requirements.json"
      }
    ]
  }
}

// language-configuration.requirements.json (generated)
{
  "comments": { "lineComment": "//", "blockComment": ["/*", "*/"] },
  "brackets": [["{", "}"], ["[", "]"], ["(", ")"]],
  "autoClosingPairs": [
    { "open": "{", "close": "}" },
    { "open": "[", "close": "]" },
    { "open": "(", "close": ")" },
    { "open": "\"", "close": "\"", "notIn": ["string"] },
    { "open": "'", "close": "'", "notIn": ["string", "comment"] }
  ],
  "surroundingPairs": [["{", "}"], ["[", "]"], ["(", ")"], ["\"", "\""], ["'", "'"]],
  "indentationRules": {
    "increaseIndentPattern": "^.*\\{[^}]*$",
    "decreaseIndentPattern": "^\\s*\\}"
  }
}
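
Comparing the decorator options with the generated file shows that the translation is mostly pass-through, with three renames: comments.line becomes lineComment, comments.block becomes blockComment, and indentation becomes indentationRules. A minimal sketch of that mapping follows; the function name and both interfaces are hypothetical and derived only from the two listings in this article.

```typescript
type Pair = [string, string];

interface AutoClosingPair {
  open: string;
  close: string;
  notIn?: string[];
}

interface IndentPatterns {
  increaseIndentPattern: string;
  decreaseIndentPattern: string;
}

// Shape of the `configuration` option on the decorator.
interface LanguageConfigOptions {
  comments?: { line?: string; block?: Pair };
  brackets?: Pair[];
  autoClosingPairs?: AutoClosingPair[];
  surroundingPairs?: Pair[];
  indentation?: IndentPatterns;
}

// Shape of the generated language-configuration file.
interface LanguageConfigurationJson {
  comments?: { lineComment?: string; blockComment?: Pair };
  brackets?: Pair[];
  autoClosingPairs?: AutoClosingPair[];
  surroundingPairs?: Pair[];
  indentationRules?: IndentPatterns;
}

function toLanguageConfiguration(
  options: LanguageConfigOptions,
): LanguageConfigurationJson {
  const out: LanguageConfigurationJson = {};
  if (options.comments) {
    // Renamed keys: line -> lineComment, block -> blockComment.
    out.comments = {
      lineComment: options.comments.line,
      blockComment: options.comments.block,
    };
  }
  if (options.brackets) out.brackets = options.brackets;
  if (options.autoClosingPairs) out.autoClosingPairs = options.autoClosingPairs;
  if (options.surroundingPairs) out.surroundingPairs = options.surroundingPairs;
  // Renamed key: indentation -> indentationRules.
  if (options.indentation) out.indentationRules = options.indentation;
  return out;
}
```

Serialising the result with JSON.stringify(config, null, 2) would produce a file of the same shape as the listing above.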

Both files carry a Banner (article 05), so regeneration is idempotent: the kernel detects an unchanged output by hash and skips the write. The package.json change is a structural merge: the Language micro-DSL's output is a JSON fragment that the Extension host (the suite-level packager) merges into the contributes.languages array of the project's package.json. The language-configuration.*.json file lands at the path declared in the merged configuration field.
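
One way to picture the structural merge is as an upsert keyed by language id: an entry with the same id is replaced in place, anything else in package.json is left alone. The sketch below is an assumption about how such a merge could work, not the packager's actual code; mergeLanguage and the interfaces are hypothetical, and the string comparison at the end is a simple stand-in for the hash-based no-op detection.

```typescript
interface LanguageContribution {
  id: string;
  aliases?: string[];
  extensions?: string[];
  configuration?: string;
}

interface PackageJson {
  contributes?: {
    languages?: LanguageContribution[];
    [key: string]: unknown;
  };
  [key: string]: unknown;
}

// Upsert by id: replace a matching entry in place, otherwise append.
// Returns a new object; the input is not mutated.
function mergeLanguage(
  pkg: PackageJson,
  fragment: LanguageContribution,
): PackageJson {
  const existing = pkg.contributes?.languages ?? [];
  const languages = existing.some((l) => l.id === fragment.id)
    ? existing.map((l) => (l.id === fragment.id ? fragment : l))
    : [...existing, fragment];
  return { ...pkg, contributes: { ...pkg.contributes, languages } };
}

const fragment: LanguageContribution = {
  id: 'requirements',
  aliases: ['Requirements', 'req'],
  extensions: ['.req.ts'],
  configuration: './language-configuration.requirements.json',
};

// Merging the same fragment twice leaves the result unchanged.
const once = mergeLanguage({ name: 'example-extension' }, fragment);
const twice = mergeLanguage(once, fragment);
const isNoOp = JSON.stringify(once) === JSON.stringify(twice);
```

Because a matching entry is replaced at its existing position, repeated generation does not reorder the array, which keeps diffs on package.json quiet.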

Composition with peers

Read by:

  • Syntax (article 07) — references scopeName to scope its emitted TextMate grammar; references id to populate contributes.grammars.
  • Snippets (article 09) — references id to scope snippet entries (scope: 'requirements').
  • Commands (article 13) — when clauses reference the language id (when: 'editorLangId == requirements').
  • Views (article 14) — view contributions can filter by language id when listing files.

Read by, but never modified by, in every case. The Language micro-DSL is a publisher with N subscribers, all coordinated through the kernel's LanguageRegistry. The publisher does not know the subscribers exist; the subscribers do not know about one another.
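
To make the read-only relationship concrete, here is a hedged sketch of how three of the peers listed above might each derive their own contribution from the published identity. The helper names and the grammar path are hypothetical; the output fields (language, scopeName, and path on a grammar contribution, the editorLangId context key, the snippet scope string) are the standard VSCode ones.

```typescript
interface LanguageIdentity {
  readonly id: string;
  readonly scopeName: string;
}

// Syntax: an entry for contributes.grammars in package.json.
function grammarContribution(lang: LanguageIdentity, path: string) {
  return { language: lang.id, scopeName: lang.scopeName, path };
}

// Commands: a when clause that restricts a command to this language.
function whenForLanguage(lang: LanguageIdentity): string {
  return `editorLangId == ${lang.id}`;
}

// Snippets: the scope string for a snippet entry.
function snippetScope(lang: LanguageIdentity): string {
  return lang.id;
}

const requirements: LanguageIdentity = {
  id: 'requirements',
  scopeName: 'source.ts.requirements',
};

// Each peer reads the identity independently; none of them writes to it
// or needs to know the others exist.
const grammar = grammarContribution(
  requirements,
  './syntaxes/requirements.tmLanguage.json', // hypothetical path
);
const when = whenForLanguage(requirements);
const scope = snippetScope(requirements);
```

Each function takes the identity as its only shared input, which is what lets the id be declared once and stay consistent everywhere it appears.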

MPS aspect referent

There is no direct MPS aspect for "language identity": MPS does not produce a VSCode-shaped artefact and does not need to declare scope names against TextMate. The closest equivalent is the Module Descriptor of an MPS language module, where the module name, the file extension association (when MPS imports/exports text), and the identity for cross-language references are declared. We adopt the "one place where identity lives" shape and translate it to the VSCode contribution model.

Boundary justification

Two boundaries to defend:

Why is identity not in the kernel? Because the kernel must not import VSCode-shaped types (article 02 criterion 3). Language identity, as declared here, includes scopeName, extensions, aliases, autoClosingPairs — all of these are VSCode-shaped concepts. Putting them in the kernel would either pull vscode types in (forbidden) or invent kernel-shaped equivalents that translate to VSCode in a host (overkill: the data is small, the translation is mechanical). The right home is a thin VSCode-aware micro-DSL.

Why is identity not in Syntax? Because the natural unit of evolution is different. The language identity changes once: a language has one id, one set of file extensions, one set of bracket pairs. The syntax grammar changes often: new tokens, new scopes, new injection rules. Coupling identity to syntax means every grammar change forces a re-declaration of the identity (and risks accidental id changes). Splitting them lets the identity stabilise after the first declaration while syntax evolves freely.

Requirements

This micro-DSL's contract is verified by FEAT-MICRODSL-06 in assets/features.ts:

  • surfaceDecoratorsListed — the Surface section shows the single @Language decorator with all option fields named.
  • kernelConceptsConsumedNamed — the Kernel boundary section names the one read (root @Concept) and the publication path (LanguageRegistry).
  • emittedArtefactsEnumerated — the Emitted artefacts section shows both output files with their structural shape.
  • boundaryAgainstSyntaxJustified — the Boundary justification section names the unit-of-evolution argument.

Article 07 picks up with the Syntax micro-DSL, which reads scopeName from this one and emits the TextMate grammar that VSCode applies for highlighting.
