Phase 1 — Inferring Transitions from Source Code

Parts II through VII built the type system, the decorator, the machine catalog, the adapter layer, the topology map, and the drift scanner. All of those assume one thing: that the @FiniteStateMachine decorator on each machine has an explicit transitions array. But what happens when it does not?

The transitions field is optional. A developer can write a perfectly valid decorator with states, events, emits, listens, and a description — and leave transitions empty. The machine still works. The type system still checks the bus. The decorator still registers metadata. But the build-time extractor that produces state-machines.json has no state graph for that machine. The interactive explorer cannot render a statechart. The topology scanner cannot verify that every declared event triggers a known transition. The machine becomes a black box — documented in name but opaque in structure.

Phase 1 of the three-phase build pipeline exists to fill that gap. It reads the source code of every machine file, infers the transition graph using three complementary strategies, selects the best result, and patches the decorator with an explicit transitions array — all before the extractor (Phase 2) or the renderer (Phase 3) ever runs.

This part documents the three strategies, the priority selection algorithm, the code generator, and the source-preserving patcher. Every function described here lives in a single module: scripts/lib/fsm-transition-inferrer.ts, 309 lines of pure code that takes strings and ASTs as input and returns strings and data structures as output. No filesystem access, no process.argv, no side effects.

Why Infer Transitions?

Consider the @FiniteStateMachine decorator on page-load-state.ts as it exists today:

@FiniteStateMachine({
  states: ['idle', 'loading', 'rendering', 'postProcessing', 'done', 'error'] as const,
  events: ['startLoad', 'markRendering', 'markPostProcessing', 'markDone', 'markError'] as const,
  description: 'Tracks SPA page-load lifecycle from fetch to post-processing.',
  transitions: [
    { from: 'idle',           to: 'loading',        on: 'startLoad' },
    { from: 'loading',        to: 'rendering',      on: 'markRendering' },
    { from: 'rendering',      to: 'postProcessing', on: 'markPostProcessing' },
    { from: 'postProcessing', to: 'done',           on: 'markDone' },
    { from: '*',              to: 'error',          on: 'markError' },
  ] as const,
  emits:   ['app-ready', 'toc-headings-rendered'] as const,
  listens: [] as const,
  guards:  ['staleGeneration'] as const,
  feature: { id: 'PAGE-LOAD', ac: 'fullLifecycle' } as const,
})
export class PageLoadStateFsm {}

That transitions array is explicit, complete, and hand-written. But it was not always there. The first version of this decorator had no transitions at all. The developer wrote the machine, documented the states in a JSDoc diagram at the top of the file, implemented the transition logic in the factory function, and moved on. The transitions existed in three places — the JSDoc comment, the canTransition function (or equivalent guards), and the state = 'literal' assignments inside each method — but not in the decorator.

The inference engine reads all three places and produces the transitions array automatically. The developer then reviews the diff and commits the patched decorator. After that, the explicit array is the source of truth and inference is skipped for that machine.

This matters for three downstream consumers:

The extractor (Phase 2) reads transitions from the decorator AST. Without them, the machine's node in state-machines.json has an empty edge list.
The interactive explorer (Phase 3) renders each machine as a statechart. Without transitions, the statechart is a set of disconnected state boxes — useless.
The topology scanner (Part VII) uses transitions to verify that declared events correspond to actual state changes. Without them, the scanner can only check emits/listens, not the internal graph.

The inference pipeline is a bridge. It connects the developer's existing documentation and logic (JSDoc, canTransition, assignments) to the decorator's structured metadata format. Once the bridge has been crossed — once the developer has reviewed and committed the inferred transitions — it is no longer needed for that machine.

The Three Strategies

The inferrer uses three strategies to extract transitions from source code. Each strategy reads a different representation of the state graph:

Priority	Strategy	What It Reads	Confidence	Weakness
1 (highest)	canTransition switch	`case 'idle': return to === 'copying'`	High: the explicit state graph function	Does not know which method triggers each transition
2	JSDoc diagram	Unicode arrows in header comments	High: the developer drew the graph	May omit edge cases; label arrows produce `on='?'`
3 (lowest)	AST walking	`state = 'literal'` assignments	Medium: finds targets but not always sources	`from='*'` when no guard detected

The strategies are not merged. The priority picker selects one winning strategy and discards the rest. This is deliberate: mixing strategies would produce duplicates, contradictions, and transitions with mixed confidence levels. A clean selection — winner takes all — produces a consistent, reviewable result.

There is one exception. When the JSDoc strategy produces transitions with bare arrows (on='?' — the developer drew idle -> copying without naming the method), the picker attempts to resolve the unknown method name using AST entries. This is a controlled merge: it only adds information (the method name) to an existing JSDoc transition. It never adds new transitions from the AST strategy.

Similarly, when the canTransition strategy produces (from, to) pairs without method names, those pairs are combined with AST entries to find the triggering method. Again, this is resolution, not merging — the canTransition pair provides the from and to, the AST entry provides the on.

The rest of this part examines each strategy in detail, starting with JSDoc diagram parsing.

Strategy 1: JSDoc Diagram Parsing

Many machine files in this codebase begin with a JSDoc comment that draws the state graph using Unicode box-drawing characters and arrows. Here is the real diagram from page-load-state.ts:

/**
 * Page Load State Machine — pure, event-driven, testable.
 * No DOM, no fetch, no history — those are injected via callbacks.
 *
 * Uses a generation counter to detect stale loads when rapid navigation
 * causes concurrent fetches.
 *
 * States:
 *   idle ──startLoad()──→ loading ──markRendering()──→ rendering
 *                           │             │
 *                           │             └──markPostProcessing()──→ postProcessing ──markDone()──→ done
 *                           │
 *                           └──markError()──→ error
 *
 * Any method called with a stale generation id is a no-op (returns false).
 * Starting a new load while one is in-flight bumps the generation, making
 * the previous load stale.
 */

This is not a formal notation. There is no grammar. There is no parser specification. It is a convention that developers in this codebase follow because it is visually clear and requires no tooling to write. The inference engine reads it with a regex.

The parseJsDocDiagram Function

export function parseJsDocDiagram(
  text:   string,
  states: ReadonlySet<string>,
): InferredTransition[] {

The function takes the full file text (not just the comment — the regex is liberal enough to match anywhere) and a set of known state names. It returns an array of InferredTransition objects, each tagged with source: 'jsdoc'.

Three regex passes handle three arrow styles:

Pass 1 — Method arrows. The most common style. A state name, followed by dashes or box-drawing characters, a method name (with or without parentheses), more dashes, and a target state name.

idle ──copy()──> copying       method with parens
idle ──copy──→ copying         method without parens
idle ──copy()──▶ copying       black-triangle arrow head
idle --copy()--> copying       hyphen variant

The regex:

(\w[\w-]*)\s*[─\-]{1,4}(\w+)(?:\([^)]*\))?[─\-→>▶]{1,4}\s*([\w-]+)

Breaking it down:

Fragment	Matches
`(\w[\w-]*)`	Source state: one word char followed by any number of word chars or hyphens
`\s*`	Optional whitespace
`[─\-]{1,4}`	1-4 dashes (box-drawing horizontal line `─`, U+2500, or ASCII hyphen)
`(\w+)`	Method name
`(?:\([^)]*\))?`	Optional parentheses with any content
`[─\-→>▶]{1,4}`	Arrow head: dashes, right arrow, greater-than, or black triangle
`\s*`	Optional whitespace
`([\w-]+)`	Target state

The match produces three groups: from, method, to. If both from and to exist in the known state set, the transition is recorded with on: method.
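As a rough illustration of Pass 1, the sketch below applies the method-arrow regex to a single line and filters the matches against the known state set. The function name parseMethodArrows, the MethodArrow type, and the exec loop are hypothetical. In particular, rewinding lastIndex so that the target of one arrow can serve as the source of the next is an assumption about how a chained line could be handled; the source does not show the real loop.

```typescript
// Sketch only: the exec loop and the lastIndex rewind are assumptions made for
// this example, not code taken from fsm-transition-inferrer.ts.
interface MethodArrow { from: string; to: string; on: string }

function parseMethodArrows(text: string, states: ReadonlySet<string>): MethodArrow[] {
  // The method-arrow regex quoted above, with the global flag for repeated exec.
  const re = /(\w[\w-]*)\s*[─\-]{1,4}(\w+)(?:\([^)]*\))?[─\-→>▶]{1,4}\s*([\w-]+)/g;
  const found: MethodArrow[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const whole = m[0], from = m[1], on = m[2], to = m[3];
    // State filtering: both endpoints must be declared states.
    if (states.has(from) && states.has(to)) found.push({ from, to, on });
    // Assumed: resume scanning at the start of the target, so that
    // "a ──x()──> b ──y()──> c" yields both a→b and b→c.
    re.lastIndex = m.index + whole.length - to.length;
  }
  return found;
}

const demo = parseMethodArrows(
  'idle ──copy()──> copying ──succeed()──> success',
  new Set(['idle', 'copying', 'success']),
);
console.log(demo); // two entries: idle → copying (copy), copying → success (succeed)
```

Without the rewind, a plain global scan would consume copying as part of the first match and miss the second arrow on the same line.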

Pass 2 — Label arrows. Sometimes the developer writes a descriptive label instead of a method name, enclosed in parentheses:

typing ──(tick per char)──▶ idle
success ──(timeout)──> idle

The label regex captures from and to but discards the label text. The transition is recorded with on: '?' — a placeholder that the priority picker may later resolve using AST entries.

(\w[\w-]*)\s*[─\-]{1,4}\([^)]+\)[─\-→>▶]{1,4}\s*([\w-]+)

This pass runs after Pass 1. If a label arrow matches the same (from, to) pair that Pass 1 already found with a method name, the label arrow is suppressed — the method-arrow version wins.

Pass 3 — Bare arrows. The simplest form: a state name, an arrow, a state name. No method, no label, and at most two dashes before the arrow head.

idle → copying
idle > copying
idle ──▶ copying

The bare regex:

(\w[\w-]*)\s+[─\-]{0,2}[→>▶]\s+([\w-]+)

Bare arrows always produce on: '?'. Like label arrows, they are suppressed if a more specific arrow already covers the same pair.
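The suppression rule for Passes 2 and 3 can be sketched as follows. This is a minimal illustration, not the module's code: the function name placeholderArrows and the Arrow type are hypothetical, and keying the "already covered" check on the (from, to) pair is an assumption drawn from the description above.

```typescript
// Sketch only: names and loop structure are assumptions for illustration.
interface Arrow { from: string; to: string; on: string }

function placeholderArrows(
  text: string,
  states: ReadonlySet<string>,
  alreadyFound: ReadonlyArray<Arrow>, // e.g. the method arrows from Pass 1
): Arrow[] {
  // The label and bare regexes quoted above.
  const patterns = [
    /(\w[\w-]*)\s*[─\-]{1,4}\([^)]+\)[─\-→>▶]{1,4}\s*([\w-]+)/g,
    /(\w[\w-]*)\s+[─\-]{0,2}[→>▶]\s+([\w-]+)/g,
  ];
  // Pairs that a more specific arrow has already covered.
  const covered = new Set<string>(alreadyFound.map(a => `${a.from}→${a.to}`));
  const out: Arrow[] = [];
  for (const re of patterns) {
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      const from = m[1], to = m[2];
      const key = `${from}→${to}`;
      if (!states.has(from) || !states.has(to) || covered.has(key)) continue;
      covered.add(key);
      out.push({ from, to, on: '?' }); // placeholder, resolved later if possible
    }
  }
  return out;
}

const extra = placeholderArrows(
  'success ──(timeout)──> idle\nidle → copying',
  new Set(['idle', 'copying', 'success']),
  [{ from: 'idle', to: 'copying', on: 'copy' }],
);
console.log(extra); // only success → idle with on: '?'; idle → copying is suppressed
```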

Concrete Example: page-load-state.ts

Given the JSDoc diagram at the top of page-load-state.ts and the known states {idle, loading, rendering, postProcessing, done, error}, the three passes produce:

Pass	From	To	On	Source
1	idle	loading	startLoad	jsdoc
1	loading	rendering	markRendering	jsdoc
1	rendering	postProcessing	markPostProcessing	jsdoc
1	postProcessing	done	markDone	jsdoc
1	loading	error	markError	jsdoc

All five transitions are extracted with method names, so no label arrows or bare arrows are needed and nothing is left for AST resolution. Provided the canTransition strategy returns nothing for this file, the priority picker returns this result as is.

Concrete Example: copy-feedback-state.ts

The JSDoc diagram for the copy feedback machine:

*   idle ──copy()──> copying ──succeed()──> success ──(timeout)──> idle
*                       │
*                       └────fail()──────> error ────(timeout)──> idle

The passes produce:

Pass	From	To	On
1	idle	copying	copy
1	copying	success	succeed
1	copying	error	fail
2	success	idle	?
2	error	idle	?

Two transitions have on: '?' because (timeout) is a label, not a method name. The priority picker will attempt to resolve these using AST entries. If the AST contains a method named reset that assigns state = 'idle', the ? entries will be resolved to on: 'reset'.

Deduplication

The dedup helper ensures that identical (from, to, on) triples are not emitted twice. This can happen when a JSDoc diagram repeats a transition across multiple lines (e.g., the developer drew the same arrow twice for visual clarity):

function dedup(transitions: InferredTransition[]): InferredTransition[] {
  const seen = new Set<string>();
  return transitions.filter(r => {
    const k = `${r.from}→${r.to}→${r.on}`;
    if (seen.has(k)) return false;
    seen.add(k);
    return true;
  });
}

The key uses the arrow character to avoid collisions — idle→copying→copy is a single unique transition regardless of how many times the regex matches it.

State Filtering

Both from and to must exist in the known state set. This prevents the regex from matching incidental words in the comment text. For instance, in the sentence "the machine transitions from ready to active mode," the word "ready" would not be captured unless ready is a declared state. Without this filter, the regex would produce phantom transitions from natural-language descriptions.

Strategy 2: canTransition Switch Analysis

Some machines in this codebase implement an explicit canTransition function that encodes the complete state graph as a switch statement. Here is the real function from copy-feedback-state.ts:

function canTransition(from: CopyFeedbackState, to: CopyFeedbackState): boolean {
  switch (from) {
    case 'idle':    return to === 'copying';
    case 'copying': return to === 'success' || to === 'error';
    case 'success': return to === 'idle' || to === 'copying';
    case 'error':   return to === 'idle' || to === 'copying';
  }
}

This is a data source that the JSDoc regex cannot read. The JSDoc diagram for this machine only has five arrows (three methods + two (timeout) labels), but the canTransition switch reveals seven edges — including success → copying and error → copying (the re-entry transitions for rapid clicks).

The parseCanTransition Function

export function parseCanTransition(
  sf:     ts.SourceFile,
  states: ReadonlySet<string>,
): Array<{ from: string; to: string }> {

This function uses the TypeScript Compiler API to walk the AST. It does not use regex. The input is a ts.SourceFile — a parsed TypeScript AST — not raw text. This is necessary because the switch statement has structure that regex cannot reliably parse (nested expressions, multi-line cases, comments between clauses).

The algorithm:

Find the function. Walk the AST looking for a function declaration or function expression named canTransition.
Find the switch. Inside the function body, find the switch statement (there should be exactly one; the convention in this codebase is a single switch on the from parameter).
Extract case labels. For each case clause, read the string literal. If the literal is not in the known state set, skip the clause.
Collect targets. Inside each clause's statements, find binary expressions of the form to === 'literal'. Each literal that exists in the known state set becomes a target. The || chaining is handled naturally by recursive AST walking — the binary expression to === 'success' || to === 'error' is a BinaryExpression with two children, each of which is also a BinaryExpression with ===.

The result is an array of { from, to } pairs — no on field. The canTransition function knows which states connect to which other states, but it does not know which method triggers each transition. That information must come from a different source.
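The recursive target collection can be sketched without the compiler. The example below uses a hand-rolled stand-in tree (Expr, CaseClause) instead of real ts.Node types, so every type and function name in it is hypothetical. In the real AST the corresponding nodes would be BinaryExpression nodes whose operator token is `||` or `===`.

```typescript
// Sketch only: a simplified stand-in for the TypeScript AST, not the compiler API.
type Expr =
  | { kind: 'StrictEquals'; left: string; literal: string } // e.g. to === 'copying'
  | { kind: 'Or'; left: Expr; right: Expr }                  // a || b
  | { kind: 'Other' };                                       // anything else is ignored

interface CaseClause { label: string; returns: Expr }

// Walk an expression, collecting every literal compared against `to`.
function collectTargets(expr: Expr, states: ReadonlySet<string>, out: string[] = []): string[] {
  if (expr.kind === 'Or') {
    collectTargets(expr.left, states, out);
    collectTargets(expr.right, states, out);
  } else if (expr.kind === 'StrictEquals' && expr.left === 'to' && states.has(expr.literal)) {
    out.push(expr.literal);
  }
  return out;
}

// Turn the clauses of the switch into (from, to) pairs, skipping unknown labels.
function pairsFromSwitch(
  clauses: ReadonlyArray<CaseClause>,
  states: ReadonlySet<string>,
): Array<{ from: string; to: string }> {
  const pairs: Array<{ from: string; to: string }> = [];
  for (const clause of clauses) {
    if (!states.has(clause.label)) continue;
    for (const to of collectTargets(clause.returns, states)) {
      pairs.push({ from: clause.label, to });
    }
  }
  return pairs;
}

const eq = (literal: string): Expr => ({ kind: 'StrictEquals', left: 'to', literal });
const or = (left: Expr, right: Expr): Expr => ({ kind: 'Or', left, right });

const copyFeedbackPairs = pairsFromSwitch(
  [
    { label: 'idle', returns: eq('copying') },
    { label: 'copying', returns: or(eq('success'), eq('error')) },
    { label: 'success', returns: or(eq('idle'), eq('copying')) },
    { label: 'error', returns: or(eq('idle'), eq('copying')) },
  ],
  new Set(['idle', 'copying', 'success', 'error']),
);
console.log(copyFeedbackPairs.length); // 7
```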

Concrete Example: copy-feedback-state.ts

Given the canTransition function above and the known states {idle, copying, success, error}, the parser produces:

From	To
idle	copying
copying	success
copying	error
success	idle
success	copying
error	idle
error	copying

Seven pairs. But no method names. To produce complete InferredTransition entries, these pairs must be combined with AST entries.

Combining canTransition Pairs with AST Entries

The combineCanTransitionWithAst function takes the (from, to) pairs from the switch parser and the method-to-target entries from the AST walker, and produces full transitions:

export function combineCanTransitionWithAst(
  ctPairs:    ReadonlyArray<{ from: string; to: string }>,
  astEntries: ReadonlyArray<AstTransitionEntry>,
): InferredTransition[] {

For each (from, to) pair, the combiner looks for AST entries where the to matches. Three outcomes:

Exactly one AST entry reaches the target. Unambiguous: the method name from the AST entry becomes on. Example: { from: 'idle', to: 'copying' } + { method: 'copy', to: 'copying' } = { from: 'idle', to: 'copying', on: 'copy' }.
Multiple AST entries reach the target. Ambiguous: emit one transition per method. Example: if both start and restart can produce state running, the combiner emits two transitions for the same (from, to) pair — one for each method.
No AST entry reaches the target. Unresolvable: emit with on: '?'. This happens when the method that triggers the transition does not contain a direct state = 'literal' assignment (e.g., it calls a helper function, or uses a variable instead of a literal).

The combined result is tagged with source: 'canTransition' — the source of the structural information (the from and to) determines the strategy label, not the source of the method name.
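The three outcomes can be sketched as a small pure function. This is an illustration under assumptions rather than the module's implementation: combineSketch is a hypothetical name, and the source union type is inferred from the tags mentioned in this part.

```typescript
// Sketch only: a minimal combiner that mirrors the three outcomes described above.
interface AstTransitionEntry { method: string; from: string; to: string }
interface InferredTransition {
  from: string;
  to: string;
  on: string;
  source: 'jsdoc' | 'canTransition' | 'ast'; // assumed union of source tags
}

function combineSketch(
  ctPairs: ReadonlyArray<{ from: string; to: string }>,
  astEntries: ReadonlyArray<AstTransitionEntry>,
): InferredTransition[] {
  const out: InferredTransition[] = [];
  for (const pair of ctPairs) {
    // Every distinct method whose assignment reaches this pair's target.
    const methods: string[] = [];
    for (const entry of astEntries) {
      if (entry.to === pair.to && methods.indexOf(entry.method) === -1) {
        methods.push(entry.method);
      }
    }
    if (methods.length === 0) {
      // Unresolvable: keep the edge, mark the trigger as unknown.
      out.push({ from: pair.from, to: pair.to, on: '?', source: 'canTransition' });
      continue;
    }
    // One method is the unambiguous case; several methods emit one transition each.
    for (const on of methods) {
      out.push({ from: pair.from, to: pair.to, on, source: 'canTransition' });
    }
  }
  return out;
}

const combined = combineSketch(
  [
    { from: 'idle', to: 'copying' },
    { from: 'success', to: 'idle' },
    { from: 'idle', to: 'loading' },
  ],
  [
    { method: 'copy', from: '*', to: 'copying' },
    { method: 'reset', from: '*', to: 'idle' },
  ],
);
console.log(combined.map(t => `${t.from} -> ${t.to} on ${t.on}`));
// idle -> copying on copy, success -> idle on reset, idle -> loading on ?
```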

Why canTransition Is High Priority

The canTransition function is the developer's explicit encoding of the state graph. It lists every valid transition, including edge cases that the JSDoc diagram might omit. In the copy-feedback example, the JSDoc diagram shows five arrows; the canTransition switch shows seven. The two extra transitions (success → copying and error → copying) are the re-entry paths for rapid clicks — important for correctness but easy to omit from a hand-drawn diagram.

For that reason, the current implementation gives canTransition the highest priority in selectBestTransitions. When canTransition results are available, they win. When they are not (because the machine does not have a canTransition function), the picker falls through to JSDoc.

Strategy 3: AST Inference — Walking State Assignments

The fallback strategy. When a machine has no JSDoc diagram and no canTransition function, the inferrer walks the TypeScript AST looking for assignments to state variables.

What It Finds

The AST walker looks for patterns like:

function startLoad(): number {
  const gen = ++generation;
  transition('loading');       // calls a helper that sets state
  return gen;
}

Or more directly:

function markRendering(gen: number): boolean {
  if (checkStale(gen)) return false;
  if (state !== 'loading') return false;
  state = 'rendering';        // direct assignment
  return true;
}

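In both cases, the walker records the method name and the target state. For markRendering, before guard detection is applied, the entry is: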

{ method: 'markRendering', from: '*', to: 'rendering' }

The from is the wildcard '*' because the walker cannot always determine the source state. It knows that markRendering produces state 'rendering', but without guard detection it does not know which state the machine must be in for that method to succeed.

Guard Detection

When the AST walker encounters a guard condition before the assignment, it can sometimes infer the from state:

function markRendering(gen: number): boolean {
  if (checkStale(gen)) return false;
  if (state !== 'loading') return false;  // guard: state must be 'loading'
  transition('rendering');
  return true;
}

The pattern if (state !== 'literal') return false or if (state === 'literal') tells the walker that this method is only valid when the machine is in a specific state. When detected, the walker records from: 'loading' instead of from: '*'.

Not all guards are this simple. Some methods check multiple conditions, call helper functions, or use destructured state. The walker does not attempt to resolve complex guards — it falls back to from: '*' whenever the pattern does not match the simple state === 'literal' or state !== 'literal' form.

The AstTransitionEntry Type

export interface AstTransitionEntry {
  method: string;
  from:   string;
  to:     string;
}

This is the intermediate type — not yet an InferredTransition. The difference: AstTransitionEntry does not have a source tag. It is converted to InferredTransition by the CLI shell that orchestrates the strategies:

const astTransitions: InferredTransition[] = astEntries.map(t => ({
  from: t.from, to: t.to, on: t.method, source: 'ast' as const,
}));

Why AST Is the Lowest Priority

AST inference has two structural weaknesses:

Wildcard from states. When the walker cannot detect a guard, it records from: '*'. A transition from * is less useful than a transition from a specific state: it tells the explorer "this method can produce state X" but not "this method transitions from state Y to state X." When AST is the only strategy left, the priority picker filters out from: '*' entries entirely.
Indirect assignments. Many machines in this codebase do not assign to state directly. They call a transition() helper function that performs the assignment, the state change callback, and the guard check. The walker can follow one level of indirection (recognizing transition('rendering') as a state assignment) but not arbitrary call chains.

Despite these weaknesses, AST inference is valuable as a fallback. It can produce partial graphs where the other strategies produce nothing. And even with wildcard from states, the entries provide method-to-target mappings that the canTransition combiner uses to resolve method names.

selectBestTransitions — The Priority Picker

The three strategies produce three arrays of InferredTransition. The picker selects one:

export function selectBestTransitions(
  jsdoc:         InferredTransition[],
  canTransition: InferredTransition[],
  ast:           InferredTransition[],
): InferredTransition[] {
  if (canTransition.length > 0) return canTransition;

  const jsdocKnown = jsdoc.filter(t => t.on !== '?');
  const jsdocBare  = jsdoc.filter(t => t.on === '?');

  // Try to resolve bare-arrow jsdoc pairs with AST method names.
  const resolved: InferredTransition[] = jsdocBare.length > 0 && ast.length > 0
    ? combineCanTransitionWithAst(
        jsdocBare.map(t => ({ from: t.from, to: t.to })),
        ast.map(t => ({ method: t.on, from: t.from, to: t.to })),
      ).filter(t => t.on !== '?')
    : [];

  const combined = dedup([...jsdocKnown, ...resolved]);
  if (combined.length > 0) return combined;

  return ast.filter(t => t.from !== '*');
}

The logic, step by step:

Tier 1: canTransition Wins

If the canTransition parser produced any results, return them immediately. No further consideration. This is the highest-confidence source — the developer wrote an explicit state graph function.

Tier 2: JSDoc With Resolution

If canTransition is empty, check JSDoc. Split the JSDoc results into two groups:

Known — transitions where on is a method name (not '?').
Bare — transitions where on is '?' (bare arrows or label arrows without a method name).

For the bare group, attempt resolution: use the same combineCanTransitionWithAst function (reused across strategies) to find AST entries that reach the same target state. If found, replace on: '?' with the actual method name. Pairs that remain unresolved are filtered out at this point.

Merge the known and resolved groups. If the result is non-empty, return it.

Tier 3: AST Fallback

If both canTransition and JSDoc produced nothing usable, fall back to AST entries. Filter out from: '*' wildcard entries — they are too imprecise to be useful as standalone transitions. Return whatever remains.

Why Winner Takes All

The picker never merges Strategy 1 and Strategy 2, or Strategy 2 and Strategy 3. Each strategy has its own assumptions and its own failure modes. Merging would produce contradictions — for instance, the JSDoc diagram might show 5 transitions while canTransition shows 7, and the 2 extra transitions from canTransition might have on: '?' because the combiner could not resolve the method name. Adding 5 fully-resolved JSDoc entries and 2 unresolved canTransition entries to the same array would be confusing to review.

Winner-takes-all means the developer sees a consistent set from a single source. If the result is incomplete, the developer can write a better JSDoc diagram or add a canTransition function — knowing exactly which strategy the inferrer will use next time.

The one controlled exception — resolving JSDoc bare arrows with AST data — is not a merge. It is gap-filling within a single strategy's output. The structural information (from, to) still comes from JSDoc. Only the method name comes from AST.

renderTransitionsArray — The Code Generator

Once the best transitions are selected, they need to become TypeScript source code. The renderTransitionsArray function produces the inner lines of a transitions: [...] property:

export function renderTransitionsArray(transitions: InferredTransition[]): string {
  return transitions
    .filter(t => t.on !== '?')
    .map(t => `    { from: '${t.from}', to: '${t.to}', on: '${t.on}' },`)
    .join('\n');
}

Three things to note:

Unresolved transitions are dropped. Any transition with on: '?' is silently removed. The developer should see only actionable, complete transitions in the decorator. An on: '?' entry in the decorator would cause a type error (the extractor expects a real event name, not the '?' placeholder), and it conveys no useful information.
Consistent formatting. Four-space indent, single quotes, trailing comma. This matches the hand-written style in existing decorators across the codebase.
No as const in the line items. The as const assertion is applied to the array as a whole by the patcher, not to individual entries. The generated lines are plain object literals.

Example Output

Given these transitions:

[
  { from: 'idle', to: 'loading', on: 'startLoad', source: 'jsdoc' },
  { from: 'loading', to: 'rendering', on: 'markRendering', source: 'jsdoc' },
  { from: 'rendering', to: 'postProcessing', on: 'markPostProcessing', source: 'jsdoc' },
  { from: 'postProcessing', to: 'done', on: 'markDone', source: 'jsdoc' },
  { from: 'loading', to: 'error', on: 'markError', source: 'jsdoc' },
]

The function produces:

    { from: 'idle', to: 'loading', on: 'startLoad' },
    { from: 'loading', to: 'rendering', on: 'markRendering' },
    { from: 'rendering', to: 'postProcessing', on: 'markPostProcessing' },
    { from: 'postProcessing', to: 'done', on: 'markDone' },
    { from: 'loading', to: 'error', on: 'markError' },

Ready to be placed inside the transitions: [...] block by the patcher.

patchDecorator — Rewriting the Decorator In-Place

The patcher takes the original source file text and the inferred transitions, and produces a new source file text with the transitions property inserted into (or replaced within) the @FiniteStateMachine decorator.

export function patchDecorator(src: string, transitions: InferredTransition[]): string {
  const body = renderTransitionsArray(transitions);
  if (!body.trim()) return src;

  const transitionsBlock = `  transitions: [\n${body}\n  ] as const,`;

  const decStart = src.indexOf('@FiniteStateMachine({');
  if (decStart === -1) return src;

  // Walk to the matching closing `}` of the decorator argument object.
  let depth = 0;
  let i = decStart + '@FiniteStateMachine('.length;
  let decEnd = -1;
  while (i < src.length) {
    if (src[i] === '{') depth++;
    else if (src[i] === '}') {
      depth--;
      if (depth === 0) { decEnd = i; break; }
    }
    i++;
  }
  if (decEnd === -1) return src;

The algorithm:

Render the transitions block. If all transitions have on: '?', the body is empty and the function returns the original source unchanged — no patch needed.
Find the decorator. Search for the literal string @FiniteStateMachine({. If not found, return unchanged. This handles files that do not have a decorator (e.g., pure utility modules).
Find the matching closing brace. A simple depth counter walks from the opening { to the matching }. This handles nested objects (like feature: { id: 'X', ac: 'Y' }) correctly — the counter increments on every { and decrements on every }, stopping when depth returns to zero.
Remove existing transitions. If the decorator already has a transitions property, it is stripped out with a regex before the new one is inserted. This handles replacement as well as insertion.
Insert the new block. The transitions block is appended after the last existing property, with proper comma handling. If the last property ends with a comma, the block is appended directly. If not, a comma is added first.
Reassemble. The source is sliced into three parts (before decorator inner, new inner, after decorator), concatenated, and returned.
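Because the listing above stops after the brace walk, here is one way the six steps could fit together end to end. It is a sketch under assumptions, not the real patchDecorator: the function name patchDecoratorSketch, the removal regex, and the exact comma and whitespace handling are all guesses made for illustration.

```typescript
// Sketch only: the removal regex and comma handling are assumptions.
interface Transition { from: string; to: string; on: string }

function patchDecoratorSketch(src: string, transitions: ReadonlyArray<Transition>): string {
  // Step 1: render the block; nothing to do if every transition is unresolved.
  const body = transitions
    .filter(t => t.on !== '?')
    .map(t => `    { from: '${t.from}', to: '${t.to}', on: '${t.on}' },`)
    .join('\n');
  if (!body.trim()) return src;

  // Step 2: find the decorator.
  const marker = '@FiniteStateMachine({';
  const decStart = src.indexOf(marker);
  if (decStart === -1) return src;

  // Step 3: find the matching closing brace with a depth counter.
  const open = decStart + marker.length - 1; // index of the opening `{`
  let depth = 0;
  let close = -1;
  for (let i = open; i < src.length; i++) {
    if (src[i] === '{') depth++;
    else if (src[i] === '}' && --depth === 0) { close = i; break; }
  }
  if (close === -1) return src;

  // Step 4: strip an existing transitions property (assumed pattern).
  let inner = src
    .slice(open + 1, close)
    .replace(/\n?[ \t]*transitions:\s*\[[\s\S]*?\](?:\s*as const)?\s*,?/, '');

  // Step 5: append the new block after the last property, adding a comma if needed.
  inner = inner.replace(/\s+$/, '');
  if (inner.length > 0 && inner[inner.length - 1] !== ',') inner += ',';
  inner += `\n  transitions: [\n${body}\n  ] as const,\n`;

  // Step 6: reassemble the text before, inside, and after the decorator object.
  return src.slice(0, open + 1) + inner + src.slice(close);
}

const original =
  "@FiniteStateMachine({\n  states: ['a', 'b'] as const,\n  feature: { id: 'X' } as const\n})\nexport class M {}\n";
const patched = patchDecoratorSketch(original, [{ from: 'a', to: 'b', on: 'go' }]);
console.log(patched);
```

Running the sketch a second time on its own output strips the block it inserted and writes it again, so the result is unchanged.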

Source Preservation

The patcher does not reformat the decorator. It does not reorder properties. It does not change indentation. It does not touch anything outside the @FiniteStateMachine({ ... }) call. The only change is the addition (or replacement) of the transitions property.

This matters for code review. When a developer runs infer-fsm-transitions --patch and then git diff, they should see exactly one change per machine: the new transitions block. If the patcher reformatted the entire decorator, the diff would be noisy and hard to review.

No-Op Safety

The function returns the original source unchanged in three cases:

All transitions have on: '?' — the body is empty after filtering.
The @FiniteStateMachine({ string is not found — no decorator to patch.
The matching } is not found — malformed decorator (should not happen in practice).

This makes the patcher safe to run on every file in src/lib/ without worrying about accidental modifications.

Before/After: copy-feedback-state.ts

Before — the decorator without explicit transitions:

@FiniteStateMachine({
  states: ['idle', 'copying', 'success', 'error'] as const,
  events: ['copy', 'succeed', 'fail', 'reset'] as const,
  description: 'Tracks the lifecycle of a clipboard copy operation for UI feedback.',
  emits:   [] as const,
  listens: [] as const,
  guards:  [] as const,
  feature: { id: 'COPY', ac: 'codeCopyShowsCheckmark' } as const,
})
export class CopyFeedbackStateFsm {}

After — patched with inferred transitions from the canTransition switch:

@FiniteStateMachine({
  states: ['idle', 'copying', 'success', 'error'] as const,
  events: ['copy', 'succeed', 'fail', 'reset'] as const,
  description: 'Tracks the lifecycle of a clipboard copy operation for UI feedback.',
  emits:   [] as const,
  listens: [] as const,
  guards:  [] as const,
  feature: { id: 'COPY', ac: 'codeCopyShowsCheckmark' } as const,
  transitions: [
    { from: 'idle', to: 'copying', on: 'copy' },
    { from: 'copying', to: 'success', on: 'succeed' },
    { from: 'copying', to: 'error', on: 'fail' },
    { from: 'success', to: 'idle', on: 'reset' },
    { from: 'success', to: 'copying', on: 'copy' },
    { from: 'error', to: 'idle', on: 'reset' },
    { from: 'error', to: 'copying', on: 'copy' },
  ] as const,
})
export class CopyFeedbackStateFsm {}

The diff is exactly one block: the transitions property. Every other line in the decorator is untouched.

Why AST Instead of Regex

The first version of the canTransition parser was regex-based. It lasted two days before three failures killed it.

Failure 1: Nested Objects in Transition Entries

When the extractor reads back the patched decorator, the transitions array can contain entries with nested objects — for instance, a when guard field:

transitions: [
  { from: 'idle', to: 'loading', on: 'startLoad' },
  { from: 'loading', to: 'error', on: 'markError', when: { guard: 'notStale' } },
] as const,

A regex looking for { from: '...' } entries would break on the nested { guard: 'notStale' } — it would match the inner closing brace as the end of the entry, producing a malformed parse.

The AST parser handles this naturally because the TypeScript compiler resolves brace matching as part of parsing. The AST node for the transition entry contains the nested object as a child property, not as raw text.

Failure 2: `as const` on the Array

The as const assertion after the array closing bracket is critical — it preserves literal types so the extractor can read 'idle' instead of string. A regex that looks for transitions: [...] would need to handle the optional as const suffix, the optional trailing comma, and the optional whitespace. Each optional element doubles the regex complexity.

In the AST, as const is an AsExpression node wrapping the array literal. Once that wrapper is unwrapped, the parser reads the array literal contents regardless of the assertion.

Failure 3: Multiline Decorators Spanning 20+ Lines

The @FiniteStateMachine decorator on a machine like page-load-state.ts spans 20 lines. A regex matching the full decorator needs to handle newlines, varying indentation, interleaved comments, and properties in any order. The patchDecorator function avoids this by using a simple brace-depth counter instead of a regex for the outer structure, and a targeted regex only for the transitions: [...] block removal.

The brace-depth approach is not a full parser — it does not handle braces inside string literals or template literals. But in practice, @FiniteStateMachine decorators in this codebase contain only single-quoted string literals (no braces) and object literals (which the counter handles). The approach has patched 43 decorators without a single misparse.
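A brace-depth counter of the kind described above can be sketched in a few lines. This is a minimal illustration, not the actual patchDecorator code: the function names findMatchingBrace and decoratorObjectRange, and the way the decorator is located by searching for its literal text, are assumptions.

```typescript
// Returns the index of the "}" that closes the object literal opened at
// openIndex, or -1 if the braces never balance. Like the approach described
// above, it deliberately ignores string and template literals.
function findMatchingBrace(src: string, openIndex: number): number {
  if (src.charAt(openIndex) !== '{') return -1;
  let depth = 0;
  for (let i = openIndex; i < src.length; i++) {
    const ch = src.charAt(i);
    if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

// Hypothetical helper: finds the decorator's argument object by searching
// for the text "@FiniteStateMachine(" and scanning from the next "{".
function decoratorObjectRange(src: string): { start: number; end: number } | null {
  const marker = '@FiniteStateMachine(';
  const at = src.indexOf(marker);
  if (at === -1) return null;
  const start = src.indexOf('{', at + marker.length);
  if (start === -1) return null;
  const end = findMatchingBrace(src, start);
  return end === -1 ? null : { start: start, end: end };
}

// Usage on a small, made-up machine file.
const demoSource = [
  "@FiniteStateMachine({",
  "  states: ['idle', 'done'] as const,",
  "  transitions: [{ from: 'idle', to: 'done', on: 'finish' }] as const,",
  "})",
  "export class DemoFsm {}",
].join('\n');

const demoRange = decoratorObjectRange(demoSource);
```

The nested transition entry raises the depth to two and drops it back to one, so the scan stops only at the brace that closes the decorator argument. A brace inside a string literal would end the scan early, which is the limitation noted above.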

The Lesson

Regex is the right tool for reading JSDoc diagrams, which are unstructured text with a loose convention. AST is the right tool for reading TypeScript syntax, which has a formal grammar and a production-quality parser (the TypeScript compiler) available as a library. Matching the wrong tool to a structure is a common source of brittle, unmaintainable code.

The fsm-transition-inferrer.ts Module

The entire inference engine lives in one file: scripts/lib/fsm-transition-inferrer.ts. Here is the shape:

fsm-transition-inferrer.ts — 309 lines
├── Types
│   ├── InferredTransition { from, to, on, source }
│   └── AstTransitionEntry { method, from, to }
├── Strategy 1: parseJsDocDiagram(text, states)
├── Strategy 2: parseCanTransition(sourceFile, states)
├── Combiner: combineCanTransitionWithAst(pairs, entries)
├── Priority: selectBestTransitions(jsdoc, ct, ast)
├── Generator: renderTransitionsArray(transitions)
├── Patcher: patchDecorator(src, transitions)
└── Helpers: dedup, dedupPairs

Seven exported functions. Two types. Two private helpers. Every function is pure — takes input, returns output, touches no global state, performs no side effects. This makes the module trivially unit-testable:

// From test/unit/fsm-transition-inferrer.test.ts

'extracts unicode em-dash arrow transitions'() {
  const text = 'idle ──copy()──> copying';
  const result = parseJsDocDiagram(text, states('idle', 'copying'));
  expect(result).toHaveLength(1);
  expect(result[0]).toMatchObject({
    from: 'idle', to: 'copying', on: 'copy', source: 'jsdoc'
  });
}

'extracts a single case with a single target'() {
  const src = `
    function canTransition(from: string, to: string): boolean {
      switch (from) {
        case 'idle': return to === 'copying';
      }
      return false;
    }
  `;
  const result = parseCanTransition(sf(src), states('idle', 'copying'));
  expect(result).toHaveLength(1);
  expect(result[0]).toEqual({ from: 'idle', to: 'copying' });
}

The tests build a ts.SourceFile in memory from a literal source string (via parseSource, which the sf helper in the excerpt appears to wrap) and pass literal text to parseJsDocDiagram. No filesystem mocking. No temp files. No cleanup. The tests are fast (sub-millisecond per case) and deterministic.
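The same purity applies to the private helpers. As an illustration, a dedup helper could look like the sketch below. The field names of InferredTransition come from the module outline; the field types, the 'ast' source value, and the choice to key on the (from, to, on) triple are assumptions.

```typescript
// Field names follow the module outline; the types are assumed.
interface InferredTransition {
  from: string;
  to: string;
  on: string;
  source: 'jsdoc' | 'canTransition' | 'ast';
}

// Keeps the first occurrence of each (from, to, on) triple and preserves
// the input order. The input array is not modified.
function dedup(transitions: InferredTransition[]): InferredTransition[] {
  const seen: { [key: string]: boolean } = {};
  const out: InferredTransition[] = [];
  for (const t of transitions) {
    // A NUL separator cannot appear in a state or event name.
    const key = [t.from, t.to, t.on].join('\u0000');
    if (!seen[key]) {
      seen[key] = true;
      out.push(t);
    }
  }
  return out;
}

// Usage: the repeated idle-to-copying arrow collapses to a single entry.
const dedupInput: InferredTransition[] = [
  { from: 'idle', to: 'copying', on: 'copy', source: 'jsdoc' },
  { from: 'idle', to: 'copying', on: 'copy', source: 'jsdoc' },
  { from: 'copying', to: 'success', on: 'succeed', source: 'jsdoc' },
];
const dedupOutput = dedup(dedupInput);
```

Because the helper takes an array and returns a new array, a test needs nothing beyond literal data and an assertion on the result.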

The CLI Shell

The pure library is wired to the filesystem by scripts/infer-fsm-transitions.ts — a CLI shell that:

1. Reads all .ts files from src/lib/.
2. Filters to machine files using filterMachineSources (files containing a @FiniteStateMachine decorator).
3. For each machine, extracts states from the decorator, runs all three strategies, picks the best, and reports.
4. In --patch mode, writes the patched source back to the file.

The shell is 153 lines, most of which is wiring and reporting. The pure library does the actual work.
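The filtering step is a good example of how thin the wiring can be. The sketch below shows one way such a filter could work on in-memory text; the NamedSource type, the function name filterMachineSourcesSketch, and the line-anchored pattern are assumptions, and the real filterMachineSources may have a different signature.

```typescript
// Hypothetical input shape: a file name plus its already-loaded text.
interface NamedSource {
  file: string;
  text: string;
}

// Matches a line that begins (after optional whitespace) with the decorator.
// Anchoring to the line start skips mentions inside "//" or " * " comments.
const DECORATOR_LINE = /^\s*@FiniteStateMachine\s*\(/m;

// Keeps only the sources that carry a @FiniteStateMachine decorator.
function filterMachineSourcesSketch(sources: NamedSource[]): NamedSource[] {
  return sources.filter(function (s) {
    return DECORATOR_LINE.test(s.text);
  });
}

// Usage with three made-up files; only the first is a machine file.
const candidateFiles: NamedSource[] = [
  {
    file: 'copy-feedback-state.ts',
    text: "@FiniteStateMachine({ states: ['idle'] as const })\nexport class CopyFeedbackStateFsm {}",
  },
  {
    file: 'notes.ts',
    text: "// mentions @FiniteStateMachine( in a comment only\nexport const x = 1;",
  },
  { file: 'utils.ts', text: "export function noop() {}" },
];
const machineFiles = filterMachineSourcesSketch(candidateFiles);
```

Taking loaded text rather than paths keeps the filter itself free of filesystem access, in line with the split between the shell and the pure library.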

Dry-Run Mode

Without --patch, the CLI outputs a JSON report to stdout:

[
  {
    "file": "copy-feedback-state.ts",
    "name": "CopyFeedbackStateFsm",
    "states": ["idle", "copying", "success", "error"],
    "existing": false,
    "inferred": [
      { "from": "idle", "to": "copying", "on": "copy", "source": "canTransition" },
      { "from": "copying", "to": "success", "on": "succeed", "source": "canTransition" }
    ],
    "coverage": "4/4 from-states covered"
  }
]

The developer can inspect the report, verify that the inferred transitions are correct, and then run --patch to apply them. The two-step workflow prevents accidental rewrites.
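The coverage field is the quickest thing to check in a report. The article does not define how it is computed, so the sketch below assumes one plausible reading: a state counts as covered when it is the from state of at least one inferred transition. The names ReportTransition and coverageSummary are hypothetical.

```typescript
// Minimal transition shape for this sketch.
interface ReportTransition {
  from: string;
  to: string;
  on: string;
}

// Counts the declared states that appear as the "from" of at least one
// transition and formats the result in the style of the report above.
function coverageSummary(states: string[], inferred: ReportTransition[]): string {
  let covered = 0;
  for (const s of states) {
    const hasOutgoing = inferred.some(function (t) {
      return t.from === s;
    });
    if (hasOutgoing) covered++;
  }
  return covered + '/' + states.length + ' from-states covered';
}

// Usage with made-up data: 'done' has no outgoing transition.
const demoStates = ['idle', 'loading', 'done'];
const demoInferred: ReportTransition[] = [
  { from: 'idle', to: 'loading', on: 'startLoad' },
  { from: 'loading', to: 'done', on: 'markDone' },
];
const demoCoverage = coverageSummary(demoStates, demoInferred);
// demoCoverage is "2/3 from-states covered"
```

Under this reading, a terminal state with no outgoing transitions lowers the ratio, so a value below full coverage is a prompt to look closer rather than proof of an error.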

The Full Phase 1 Pipeline

Putting it all together — the complete flow from source files to patched decorators:

1. Load. Read all .ts files from src/lib/. Filter to machine files.
2. Extract states. For each machine, read the states array from the decorator AST.
3. Infer. Run all three strategies against the file text and AST.
4. Select. Pick the best transitions using the priority algorithm.
5. Render. Convert the selected transitions to a TypeScript code string.
6. Patch. Insert the rendered block into the decorator. Write the file.

Steps 1-2 use the extractor (which will be detailed in Part IX). Steps 3-6 use the inferrer. The CLI shell connects them.

After Phase 1 completes, every machine file in src/lib/ that has a @FiniteStateMachine decorator now has an explicit transitions array — either hand-written (the developer wrote it originally) or inferred (Phase 1 patched it). Phase 2 can now extract the full state graph with confidence.

Edge Cases and Limitations

Machines Without JSDoc or canTransition

Some machines have neither a JSDoc diagram nor a canTransition function. They rely entirely on AST inference. If the AST walker can detect guards (the if (state !== 'literal') return false pattern), the result has specific from states and is usable. If not, the result has wildcard from: '*' entries, which the priority picker filters out. These machines end up with an empty transitions array — and the CLI reports them with a warning:

  ! no transitions found: SomeMachineFsm

The developer's remedy: add a JSDoc diagram or a canTransition function. Both take less than a minute to write, and the next inference run will pick them up.
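The wildcard filtering that leads to this warning can be sketched as follows. The AstTransitionEntry field names come from the module outline; the field types and the two function names (usableAstEntries, missingTransitionsWarning) are assumptions, and the real priority picker does more than this.

```typescript
// Field names follow the module outline; the types are assumed.
interface AstTransitionEntry {
  method: string;
  from: string;
  to: string;
}

// Drops entries whose source state could not be determined (from === '*').
function usableAstEntries(entries: AstTransitionEntry[]): AstTransitionEntry[] {
  return entries.filter(function (e) {
    return e.from !== '*';
  });
}

// Returns the warning line in the style shown above, or null when at least
// one entry with a concrete from state survives the filter.
function missingTransitionsWarning(
  machineName: string,
  entries: AstTransitionEntry[],
): string | null {
  return usableAstEntries(entries).length === 0
    ? '  ! no transitions found: ' + machineName
    : null;
}

// Usage: no guard was detected, so every entry carries the wildcard.
const unguardedEntries: AstTransitionEntry[] = [
  { method: 'start', from: '*', to: 'loading' },
  { method: 'finish', from: '*', to: 'done' },
];
const unguardedWarning = missingTransitionsWarning('SomeMachineFsm', unguardedEntries);
```

Returning the message instead of printing it keeps the check pure; the CLI shell can decide where the line goes.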

Re-entry Transitions

Machines that allow re-entry (transitioning from a terminal state back to an earlier state, like success → copying in the copy-feedback machine) are fully supported by all three strategies. The JSDoc parser matches arrows in any direction. The canTransition parser follows the switch cases wherever they lead. The AST walker records every assignment regardless of the source state.

Nested State Types

Some machines use union types that mix string literals with object shapes for state:

type MyState = 'idle' | 'loading' | { kind: 'error'; code: number };

The inferrer does not handle object-typed states such as { kind: 'error'; code: number }. It only recognizes string literal states. Machines with complex state types must declare their transitions explicitly in the decorator.

Multiple canTransition Functions

If a file contains multiple canTransition functions (e.g., one for the machine and one for a nested sub-machine), the parser finds the first one and ignores subsequent ones. This is a known limitation — the convention in this codebase is one machine per file, so multiple canTransition functions should not occur.

What Comes Next

Phase 1 fills the gap: machines that lacked explicit transitions now have them. The decorator is the single source of truth. Phase 2 reads that truth — using the TypeScript Compiler API to walk every decorator, extract every property, and build the graph data structure that feeds the interactive explorer.

The inference engine is a bootstrap tool. Once all 43 machines have explicit transitions (as they do today), Phase 1 becomes a safety net — it runs on every build, verifies that no machine has regressed to an empty transitions array, and reports any gaps. The developer never has to write transitions by hand if they have already drawn the state graph in a JSDoc comment or implemented a canTransition function. The inferrer reads what the developer has already written and translates it into the structured format that the rest of the pipeline consumes.

Continue to Part IX: Phase 2 — Extraction via TypeScript Compiler API →
