
Part XI: The Final State

"We shall not cease from exploration, and the end of all our exploring will be to arrive where we started and know the place for the first time." -- T.S. Eliot, Four Quartets

Ten phases. Six months. Four teams. 500 new tests. Zero production outages during the migration.

We started with SubscriptionHub -- a 6-year-old SaaS billing platform that made money, shipped features, and resisted every change. 80,000 lines of C# across 7 projects. A BillingService with 2,400 lines and 12 dependencies. A Common project that weighed more than the domain. Three proration implementations, two of which were wrong, and nobody knew which one ran in production. Twelve tests, six decorated with [Ignore]. A flat AppDbContext with 31 entities and navigation properties that let you traverse from a Customer to a TaxRate to an InvoiceLineItem to an AnalyticsSnapshot in a single LINQ query that EF Core would cheerfully translate into a seven-table join.

That system is gone. Not rewritten. Migrated. Incrementally, under tests, without ever stopping production. No big-bang deploy. No "we're dark for two weeks while we switch over." No moment where every developer held their breath and watched the dashboards.

The migration happened in the margins. In the first hour of the morning, before stand-up. In the PR that adds a feature and extracts a Value Object. In the test someone wrote because they were tired of debugging the same proration edge case for the fourth time. In the Friday afternoon when a developer decided to move one more entity into the domain library "just to see how it feels." One .csproj at a time. One record at a time. One aggregate at a time.

Some phases took weeks. Some took days. The Value Object extraction (Part VIII) was the fastest -- two weeks for eight Value Objects, because the pattern is mechanical. The Aggregate extraction (Part IX) was the slowest -- six weeks for the Subscription aggregate alone, because it required the Strangler Fig, the feature flags, the parallel execution, and the confidence to cut over. The time was not wasted. The time was risk management.

This is what replaced it.


The Architecture: Before

This is the project dependency graph from Part I. Seven projects, circular dependencies, one AppDbContext that every team reads from, a Common project that everything depends on, and no boundary the compiler enforces.

[Diagram: project dependency graph -- before the migration]

Every arrow is a hard dependency. Every dependency is a coupling vector. When SVC changes, everything downstream could break. When CMN changes, everything could break. When DATA changes, billing, notifications, analytics, and subscriptions all recompile and all risk regression.

The arrows have no consistent direction. They form a web, not a tree. The architecture has no opinion about what depends on what, because the architecture was never designed -- it grew.

Count the paths from WEB to DATA. There are multiple, because there is no single route through the system. A request can go through SVC, or through CMN, or through NOTIF. Each path carries different assumptions about transaction boundaries, error handling, and data freshness. When something goes wrong, you trace through all of them, because you do not know which one executed.

Count the teams that touch CMN. All four. A change to a shared helper in Common triggers a rebuild of every project, a retest of every integration, and a prayer that the shared class's behavior has not subtly changed for any of its 47 consumers. The Common project is the central nervous system of the mud -- everything feels everything, and severing a single nerve is impossible without understanding all the connections.

And look at the SVC --> NOTIF arrow. Services depends on Notifications. But Notifications also depends on Data, which Services also depends on. This is a dependency loop through shared state. When BillingService dispatches a notification, it calls NotificationService, which queries AppDbContext for the same entities that BillingService just modified in the same transaction. Change tracking conflicts. Stale reads. "Works on my machine" bugs that only reproduce under load.

This is the architecture we inherited. This is the architecture we migrated away from.


The Architecture: After

Four bounded contexts. Clean dependency arrows pointing inward. A SharedKernel with six types. An event bus connecting contexts without coupling them. An API composition root that wires everything together.

[Diagram: bounded context architecture -- after the migration]

Notice what disappeared:

  • No Common project. The 47-class coupling magnet is gone. Six types survive in SharedKernel: Entity<TId>, ValueObject, AggregateRoot<TId>, DomainEvent, Result<T>, and IRepository<T>. The rest were absorbed into the contexts that owned them -- StringExtensions went to the context that used it, DateTimeHelpers became methods on SubscriptionPeriod, and 23 classes turned out to be unused entirely. Nobody noticed when they were deleted because nobody was calling them.
  • No SubscriptionHub.Services. The God Service layer is gone. Behavior lives in domain aggregates. Orchestration lives in application handlers. The 2,400-line BillingService became three focused handlers averaging 60 lines each. The 963-line SubscriptionService became an 80-line aggregate with invariants that enforce themselves.
  • No SubscriptionHub.Data. The flat 31-entity AppDbContext was replaced by four context-specific DbContexts. SubscriptionsDbContext owns 8 entities. BillingDbContext owns 10. NotificationsDbContext owns 5. AnalyticsDbContext owns 8. Same database, same tables, separate compilation boundaries. Each team can add a migration without touching another team's DbContext.
  • No direct cross-context queries. Notifications does not query SubscriptionsDbContext. It subscribes to SubscriptionCancelledEvent and maintains its own read models. Analytics does not run raw SQL against the replica. It consumes events and projects them into warehouse tables.
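The SharedKernel that replaced Common is deliberately tiny. A sketch of two of its six types -- not the series' exact code, but the standard shape -- shows why they are safe to share:

```csharp
using System.Collections.Generic;
using System.Linq;

// SharedKernel/ValueObject.cs -- sketch. Value Objects compare by their
// component values, never by reference.
public abstract class ValueObject
{
    protected abstract IEnumerable<object?> GetEqualityComponents();

    public override bool Equals(object? obj) =>
        obj is ValueObject other &&
        GetType() == other.GetType() &&
        GetEqualityComponents().SequenceEqual(other.GetEqualityComponents());

    public override int GetHashCode() =>
        GetEqualityComponents()
            .Aggregate(17, (hash, c) => hash * 31 + (c?.GetHashCode() ?? 0));
}

// SharedKernel/Entity.cs -- sketch. Entities compare by identity alone.
public abstract class Entity<TId> where TId : notnull
{
    public TId Id { get; protected set; } = default!;

    public override bool Equals(object? obj) =>
        obj is Entity<TId> other && GetType() == other.GetType() && Id.Equals(other.Id);

    public override int GetHashCode() => Id.GetHashCode();
}
```

Nothing here knows about persistence, HTTP, or any context's domain -- which is exactly what makes a kernel shareable without becoming another Common.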

Notice what the arrows look like:

  • Infrastructure depends on Application and Domain -- never the reverse. You cannot accidentally import Microsoft.EntityFrameworkCore in a domain class because the domain project does not have the NuGet reference.
  • Domain depends only on SharedKernel -- zero NuGet packages, zero infrastructure types. You can dotnet test every domain class in under a second with nothing but xUnit and new.
  • Cross-context communication is via events -- the dashed lines. No context holds a reference to another context's domain or application layer. When Subscriptions needs to tell Billing that a plan changed, it publishes PlanChangedEvent. It does not call BillingService.RecalculateInvoice().
  • The API delegates, it does not orchestrate -- thin controllers call handlers. The controllers validate HTTP-level concerns (authentication, input format). The handlers validate application-level concerns (authorization, business preconditions). The aggregates enforce domain invariants. Each layer does its job and nothing more.
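The event-based decoupling in the arrows above can be sketched as a plain record plus a handler. The handler interface and handler name below are illustrative, not the series' actual bus API:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Published by Subscriptions. It carries facts, not references.
public sealed record PlanChangedEvent(
    Guid SubscriptionId, string OldPlanCode, string NewPlanCode, DateTime OccurredUtc);

// Implemented inside Billing.Application. Billing reacts in its own model;
// it never calls back into the Subscriptions context.
public interface IDomainEventHandler<in TEvent>
{
    Task Handle(TEvent domainEvent, CancellationToken ct);
}

public sealed class RecalculateInvoiceOnPlanChanged : IDomainEventHandler<PlanChangedEvent>
{
    public Task Handle(PlanChangedEvent e, CancellationToken ct)
    {
        // Placeholder for Billing's own recalculation logic.
        Console.WriteLine($"Plan change for {e.SubscriptionId}: {e.OldPlanCode} -> {e.NewPlanCode}");
        return Task.CompletedTask;
    }
}
```

The publisher compiles without any reference to the subscriber's project; the dependency arrow points at the event type, not at another context.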

Side by Side: Old vs New

How the concepts map from the original structure to the new one, with team ownership shown:

[Diagram: old-to-new concept mapping, with team ownership color-coded]

The transformation is not about writing better code inside the same structure. It is about changing the structure so that worse code cannot compile. When Subscriptions.Domain does not reference Billing.Domain, no developer -- no matter how tired, no matter how tight the deadline -- can accidentally import a billing concept into the subscription model. The compiler says no. The PR never reaches the reviewer.

The color-coding tells a story that the old architecture could not tell: team ownership is visible. In the "Before" diagram, every project is red because every team touches every project. In the "After" diagram, each context has a single color because each context has a single owner. Conway's Law is no longer being violated -- it is being leveraged. The communication structure of the organization (4 teams) matches the code structure (4 bounded contexts). Merge conflicts at the boundaries have dropped to near zero because the boundaries are team boundaries.


The Comparison Table

Here is the migration measured across twelve dimensions. Each row is a concrete metric, not an opinion.

Dimension | Before | After
--- | --- | ---
Proration logic | 3 implementations, 3 different results | 1 method: SubscriptionPeriod.ProrateFraction()
Subscription validation | SubscriptionService (963 lines), rules scattered across 4 methods | Subscription aggregate (80 lines), invariants in the entity
BillingService | 2,400 lines, 12 constructor dependencies | Gone -- decomposed into GenerateInvoiceHandler, ProcessPaymentHandler, RetryDunningHandler
Common project | 47 classes, every project depends on it | Gone -- replaced by SharedKernel (6 types)
Domain library NuGet packages | N/A (no domain library existed) | 0 packages -- pure C#
DbContext | 1 AppDbContext, 31 entities, 4 teams editing | 4 context-specific DbContexts, each owned by 1 team
Unit tests | 12 tests, 6 [Ignore]d | 500+ domain tests, all green, all fast
Test execution time | 45 seconds (DB + Stripe sandbox + SMTP) | 800ms (pure domain logic, no I/O)
Deployment risk | All-or-nothing -- one artifact, one pipeline | Context-level deployment -- Billing ships independently
New developer onboarding | "Read the whole codebase" (3 weeks) | "Read your bounded context" (3 days)
Weekly merge conflicts | 3+ conflicts/week in BillingService.cs, AppDbContext.cs | Near zero -- each team owns their context exclusively
Cross-context coupling | Navigation properties, direct DB queries, shared DTOs | Domain events + ACLs, no shared mutable state

A few rows deserve deeper attention.

Proration was the original motivating pain. Three implementations in three services, producing three different results for the same input. One rounded to two decimal places. One rounded to four. One did not round at all. The customer who paid $12.33 on one code path and $12.3267 on another filed a support ticket that took three developers two days to trace. Now there is SubscriptionPeriod.ProrateFraction() -- a single method, on a single Value Object, with 14 tests covering every edge case including leap years, month boundaries, and annual plans. If someone tries to add a second implementation, the code review question is simple: "Why not use ProrateFraction()?" The answer is always: use ProrateFraction().
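A sketch of what that single method can look like -- the real SubscriptionPeriod handles more (leap years, month boundaries), and the four-decimal rounding here is an assumption:

```csharp
using System;

// Illustrative Value Object: one period, one proration rule, one rounding policy.
public sealed record SubscriptionPeriod(DateOnly Start, DateOnly End)
{
    public int TotalDays => End.DayNumber - Start.DayNumber;

    // Fraction of the period consumed as of 'asOf', clamped to [0, 1].
    public decimal ProrateFraction(DateOnly asOf)
    {
        if (asOf <= Start) return 0m;
        if (asOf >= End) return 1m;
        var usedDays = asOf.DayNumber - Start.DayNumber;
        return Math.Round((decimal)usedDays / TotalDays, 4);
    }
}
```

Every caller gets the same rounding, because rounding happens in exactly one place.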

Test execution time dropped from 45 seconds to 800 milliseconds. That is not because we removed tests -- we added 488. It is because domain tests do not touch I/O. No database. No HTTP. No SMTP. No Stripe sandbox that times out on the CI runner. The old tests were slow because the old code was coupled to infrastructure. The new tests are fast because the new code is pure domain logic. A developer runs dotnet test after every change. It finishes before they switch windows. That feedback loop changes behavior -- developers test more because testing is free.

New developer onboarding shrank from three weeks to three days. That number is approximate, but the mechanism is precise. A new developer on the Billing team reads Billing.Domain (8 files), Billing.Application (12 files), and Billing.Infrastructure (6 files). They do not need to read the Subscriptions context, the Notifications context, or the Analytics context. They do not need to understand the entire system. They need to understand their bounded context and the events it publishes and subscribes to. The domain model is the documentation. The tests are the specification. The project graph is the architecture diagram.

The numbers are not the point. The direction is the point. Every metric moved toward isolation, speed, and confidence. And they moved together, because they are correlated. Fast tests come from isolated domains. Isolated domains come from bounded contexts. Bounded contexts come from explicit boundaries. Explicit boundaries come from Event Storming. Event Storming comes from putting four teams in a room with sticky notes and a wall.

The chain is:

Discovery (Event Storming) --> Structure (Bounded Context Libraries) --> Isolation (ACLs, Value Objects, Aggregates) --> Decoupling (Domain Events) --> Confidence (500 green tests in 800ms) --> Velocity (ship without fear).

Each link was one or two parts of this series. Each link delivered value independently. Each link made the next one easier.

What the Numbers Do Not Show

The table captures the quantifiable improvements. But some of the most important changes are qualitative, and they are the ones that matter most to the developers who live in the codebase every day.

The fear is gone. In the old system, deploying on Friday was a running joke -- except nobody laughed. A change to BillingService could affect proration, which could affect invoicing, which could affect notifications, which could affect analytics. The blast radius of any change was the entire system. Developers compensated by making smaller changes, reviewing more carefully, and deploying less frequently. The result was a slower feedback loop, longer PR queues, and features that sat in branches for weeks.

In the new system, a change to Billing.Domain affects Billing. Full stop. The Billing team can deploy on Friday because their deployment does not touch Subscriptions, Notifications, or Analytics. The domain tests run in under a second. The integration tests run in under 30 seconds. If something breaks, the blast radius is one bounded context, and the rollback is one artifact.

The vocabulary is shared. Before the migration, the Billing team and the Subscriptions team used the word "subscription" to mean different things. This caused bugs. A billing developer would fix a "subscription" issue in BillingService and introduce a regression for the Subscriptions team's "subscription" concept. After Event Storming, each context has its own model of "subscription." In the Billing context, it is BillingSubscription -- a billing schedule. In the Subscriptions context, it is Subscription -- a lifecycle state machine. Different types, different namespaces, different semantics. The compiler makes it impossible to confuse them.

This is the Ubiquitous Language in action. Not as a design principle that developers nod at and then ignore, but as a structural fact enforced by the type system. When the Product team writes a requirement that says "subscription upgrade," the developers know exactly which aggregate handles it (Subscription.Upgrade() in Subscriptions.Domain) and which events it publishes (PlanChangedEvent). The language in the requirement matches the language in the code. There is no translation layer. There is no "what they call X, we call Y." The model is the language.
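The two models can be sketched side by side. The property shapes below are illustrative; the point is structural -- these are different types in different namespaces, and the compiler will not let one stand in for the other:

```csharp
using System;

namespace Subscriptions.Domain
{
    // A lifecycle state machine: trials, upgrades, cancellation.
    public enum SubscriptionStatus { Trial, Active, PastDue, Cancelled }

    public sealed class Subscription
    {
        public Guid Id { get; init; }
        public SubscriptionStatus Status { get; private set; } = SubscriptionStatus.Trial;

        public void Activate() => Status = SubscriptionStatus.Active;
    }
}

namespace Billing.Domain
{
    // A billing schedule: what to charge and when. No lifecycle here.
    public sealed class BillingSubscription
    {
        public Guid Id { get; init; }
        public decimal PricePerCycle { get; init; }
        public DateOnly NextChargeDate { get; init; }
    }
}
```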

The onboarding story changed. We used to tell new developers: "You'll need about three weeks to understand the codebase." Now we tell them: "Read the domain layer in your bounded context. Run the tests. Understand the aggregate. You'll be shipping PRs by Wednesday." The bounded context is the cognitive boundary. A new developer does not need to understand the entire system to be productive. They need to understand their context, the events it publishes, and the events it subscribes to. That is a day of reading, not three weeks.

This has a second-order effect on hiring. When onboarding takes three weeks, you are reluctant to hire junior developers -- the ramp-up cost is too high. When onboarding takes three days, junior developers become viable. You can hire for potential instead of experience, because the bounded context limits the blast radius of mistakes. A junior developer who breaks something in Billing.Domain does not break Subscriptions, Notifications, or Analytics. The domain tests catch the mistake immediately. The architecture protects the organization from the learning curve.

The code review quality improved. When a PR touches one bounded context, the reviewer can reason about it locally. "Does this invariant hold? Does this Value Object enforce its rules? Does this event carry the right data?" These are answerable questions. In the old system, a PR that touched BillingService required the reviewer to consider ripple effects across notifications, analytics, and subscriptions. That review took an hour and caught half the issues. Now it takes fifteen minutes and catches all of them, because the scope is bounded.

The debugging story simplified. When a production issue occurs in the Billing context, the investigation starts and ends in Billing.*. The logs are scoped to the context. The call stack is within the context. The database queries are in BillingDbContext. There is no "wait, which service calls which service calls which service?" tracing. The event bus provides a clear timeline: this event was published by Subscriptions at T1, received by Billing at T2, processed at T3. The causal chain is explicit because the events are explicit.

Compare this with the old debugging story. A proration bug is reported. You open BillingService.ProcessMonthlyBilling(). You see it calls SubscriptionService.GetCurrentPeriod(). You trace into that method. It loads from AppDbContext and calculates using inline logic. But wait -- there is also BillingHelpers.CalculateProration() in Common. And InvoiceService.ProratePriceForPeriod(). Which one is the caller using? You git blame the line. It was changed by someone on the Platform team three months ago. You ask them why. They do not remember. This is what debugging in the mud feels like. In the new system, proration lives in SubscriptionPeriod.ProrateFraction(). One file. One method. Fourteen tests. The investigation takes five minutes instead of five hours.

The architecture became testable. We added architecture tests using NetArchTest that verify the dependency rules on every build:

[Fact]
public void Domain_Should_Not_Reference_Infrastructure()
{
    var result = Types.InAssembly(typeof(Subscription).Assembly)
        .ShouldNot()
        .HaveDependencyOn("Subscriptions.Infrastructure")
        .GetResult();

    result.IsSuccessful.Should().BeTrue();
}

[Fact]
public void Domain_Should_Not_Reference_EfCore()
{
    var result = Types.InAssembly(typeof(Subscription).Assembly)
        .ShouldNot()
        .HaveDependencyOn("Microsoft.EntityFrameworkCore")
        .GetResult();

    result.IsSuccessful.Should().BeTrue();
}

These tests run in milliseconds and catch dependency violations before they reach code review. They are the second line of defense -- after the project graph and before the code review. In the old system, there was no automated way to detect that BillingService had imported a Notifications type. In the new system, it fails the build.

We have 12 architecture tests covering the dependency rules for all four bounded contexts. They run in the first 200ms of the test suite. They have caught 9 violations in the four months since they were introduced -- each one a developer who added a project reference that violated the onion architecture rules. Nine potential regressions prevented. Nine code review round-trips avoided. Nine conversations about architecture replaced by a red test with a clear message.


What We Wish We'd Known

Eleven lessons, extracted from six months of migration. These are not theoretical. They are scars.

1. Write characterization tests before moving a single line

This is non-negotiable. Before you touch any code, before you create any library, before you rename a single namespace, write tests that capture the system's actual behavior -- including its bugs.

We covered this in Part V, but it bears repeating because it is the lesson most teams skip. They skip it because writing tests for code you did not write is tedious. It is. But the alternative is refactoring without a safety net. You move CalculateProration() from BillingService to SubscriptionPeriod, and now proration is off by a cent for annual plans. You do not find out until a customer complains. You do not find out for three weeks. You roll back the entire migration because you cannot isolate the regression.

The golden master pattern is your friend. Capture the output of the God Service for known inputs. Store it as a snapshot. Assert against it after every refactoring step. If the snapshot changes and you did not intend it to change, you broke something. If the snapshot changes and you did intend it, update the snapshot and commit the new baseline.
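A hand-rolled version of the golden-master mechanics looks roughly like this (snapshot libraries such as Verify package the same idea with less code; the file layout and names here are assumptions):

```csharp
using System;
using System.IO;
using System.Text.Json;

// Serialize the legacy output; compare against the committed snapshot.
// The first run records the baseline -- bugs and all.
public static class GoldenMaster
{
    public static void Verify(string snapshotPath, object actual)
    {
        var json = JsonSerializer.Serialize(
            actual, new JsonSerializerOptions { WriteIndented = true });

        if (!File.Exists(snapshotPath))
        {
            File.WriteAllText(snapshotPath, json); // record current behavior
            return;
        }

        var expected = File.ReadAllText(snapshotPath);
        if (expected != json)
            throw new InvalidOperationException(
                $"Behavior changed relative to {snapshotPath}.\n--- expected ---\n{expected}\n--- actual ---\n{json}");
    }
}
```

An intentional behavior change means updating the snapshot file and committing the new baseline; an unintentional one fails the build.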

A concrete tip: focus your characterization tests on the boundaries of the God Service -- its public methods. Do not try to test internal methods. Internal methods will change, move, or disappear during the migration. Public methods are the contract you are preserving. If every public method has a golden master snapshot, you can rearrange the internals with confidence.

Another tip: capture the bugs too. If the golden master shows that ProcessMonthlyBilling() produces an incorrect proration for annual plans with mid-cycle cancellation, capture that incorrect output as the snapshot. Do not fix the bug during the characterization test phase. Fix it later, in the Value Object extraction phase, when you have SubscriptionPeriod.ProrateFraction() with its own focused tests. The characterization test's job is to detect unintentional changes. Intentional changes come later, under TDD.

2. Create bounded context libraries before writing any domain code

The libraries must exist in the solution before you move a single entity. The boundaries must be compiler-enforced .csproj files with explicit <ProjectReference> elements. If you extract a beautiful Value Object but it lives in SubscriptionHub.Services, you have improved the code without improving the architecture. The coupling is identical. The next developer will add a dependency from the Value Object to an infrastructure type because nothing prevents it.

Structure first, code second. The structure is the constraint that prevents the code from drifting back into the mud. We learned this in Part VI, and it held true through every subsequent phase.

A practical corollary: create all the context libraries at once, even if you only plan to migrate one context first. The full project graph should be visible from day one. If the team can see Subscriptions.Domain, Billing.Domain, Notifications.Application, and Analytics.Application in the solution explorer, the target architecture is not an abstract diagram on a wiki page -- it is a concrete set of empty folders waiting to be filled. That visibility changes behavior. Developers start thinking "where does this belong?" before they write a line of code.

We created 12 projects in a single PR. The PR changed zero behavior. It added zero code. It just created .csproj files with the right <ProjectReference> elements. That PR was the most impactful change of the entire migration. Everything that followed was filling in a structure that already existed.
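A hypothetical version of one of those twelve .csproj files -- a domain project whose only outbound edge is SharedKernel:

```xml
<!-- Subscriptions.Domain/Subscriptions.Domain.csproj (illustrative sketch) -->
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <!-- The only reference the domain is allowed: the SharedKernel. -->
    <ProjectReference Include="..\SharedKernel\SharedKernel.csproj" />
    <!-- No PackageReference elements: no EF Core, no HTTP, no infrastructure. -->
  </ItemGroup>

</Project>
```

Adding a forbidden reference here is a visible, reviewable diff -- and one the architecture tests can veto automatically.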

3. Event Storming reveals boundaries and internals simultaneously

Even two hours on one God Service is worth it. Bring all four teams. Put sticky notes on a wall. Argue about what happens after PaymentFailed. The arguments are the learning.

In Part III, we discovered that SubscriptionHub's implicit boundaries mapped almost perfectly to team boundaries -- a validation of Conway's Law. But we also discovered that the Billing team's mental model of "subscription" was different from the Subscriptions team's mental model. In Billing, a subscription is a billing schedule with a price and a proration method. In Subscriptions, a subscription is a lifecycle state machine with trials, upgrades, and cancellations. Same word, different models, different bounded contexts. Event Storming made that visible in thirty minutes. Code archaeology would have taken weeks.

Do not underestimate the value of the disagreements that surface during Event Storming. When the Billing team lead said "obviously, cancellation creates a final invoice" and the Subscriptions team lead said "no, cancellation just sets a flag -- invoicing happens on the next billing cycle," that disagreement revealed a real bug. Both paths existed in the code. Which one ran depended on which service handled the cancellation request first. Event Storming did not just discover the boundaries. It discovered the bugs at the boundaries.

4. Anti-Corruption Layers are the constant companion

Every phase needs ACLs. When you create bounded context libraries, you need ACLs at the boundaries where old code talks to new code. When you extract Value Objects, you need mapping layers between the old primitive-based DTOs and the new typed models. When you introduce Domain Events, you need ACLs that translate between the event schema and the legacy event handlers.

ACLs are not a phase. They are a practice. Learn the three shapes -- Facade, Adapter, Translator -- and use them instinctively. Part VII covers the patterns, but the lesson is simpler: every time old meets new, put a wall between them. The wall is the ACL. It keeps the old world's assumptions from infecting the new world's model.

Here is the pattern we used most often during the migration -- the Translator ACL that converts between old DTOs and new domain types:

// Billing.Infrastructure/Acl/LegacySubscriptionTranslator.cs
internal sealed class LegacySubscriptionTranslator
    : ISubscriptionInfoProvider
{
    private readonly LegacySubscriptionClient _client;

    public LegacySubscriptionTranslator(LegacySubscriptionClient client)
        => _client = client;

    public async Task<BillingSubscriptionInfo> GetForBilling(
        SubscriptionId id)
    {
        // Old world: untyped DTO with string status and decimal price
        var dto = await _client.GetSubscription(id.Value);

        // New world: typed domain model with Value Objects
        return new BillingSubscriptionInfo(
            SubscriptionId.From(dto.Id),
            Money.Create(dto.PriceAmount, dto.PriceCurrency),
            SubscriptionPeriod.Create(dto.PeriodStart, dto.PeriodEnd),
            Enum.Parse<BillingInterval>(dto.BillingInterval));
    }
}

The Translator sits at the boundary. The Billing domain never sees the legacy DTO. The legacy system never knows about Money or SubscriptionPeriod. When the legacy system is eventually decommissioned, you delete the Translator and replace it with a direct call to the Subscriptions context's published interface. The domain model does not change. Only the infrastructure adapter changes. This is the power of the Onion Architecture enforced by the project graph: infrastructure changes are isolated. Domain changes are isolated. They do not bleed into each other because the dependency arrows point inward only.

5. Value Objects are the quickest domain win

Zero schema changes. Zero deployment risk. Immediate type safety. The compiler becomes your reviewer.

When we extracted Money in Part VIII, three proration implementations collapsed into one method on SubscriptionPeriod. Currency mismatch bugs became compile errors. Rounding inconsistencies disappeared because Money.Add() enforces the rounding rule in one place. We did this in a single PR. Nobody noticed. That is the best kind of refactoring -- invisible to the user, transformative for the codebase.

If you do nothing else from this series, extract your Value Objects. You can start tomorrow. You can start in your next PR. The extraction is a net-negative diff: more types, fewer lines of code, fewer branches, fewer bugs. The code shrinks because the duplication disappears.

The ROI is also measurable. Before extracting Money, we had 14 bugs in the issue tracker related to currency mismatch or rounding. After extracting Money, we had zero new bugs in that category for the remaining four months of the migration. Not because we were more careful -- because the type system made currency mismatch impossible. Money.EUR(10).Add(Money.USD(20)) throws CurrencyMismatchException. The bug category was eliminated, not suppressed.
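A sketch of a Money that turns the mismatch into an exception at the boundary -- the factory names match the example above; everything else is an assumption:

```csharp
using System;

public sealed class CurrencyMismatchException : Exception
{
    public CurrencyMismatchException(string left, string right)
        : base($"Cannot combine {left} with {right}.") { }
}

// Value Object: amount + currency, with the rounding rule in exactly one place.
public sealed record Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    private Money(decimal amount, string currency)
    {
        Amount = Math.Round(amount, 2, MidpointRounding.ToEven); // single rounding policy
        Currency = currency;
    }

    public static Money Create(decimal amount, string currency) => new(amount, currency);
    public static Money EUR(decimal amount) => new(amount, "EUR");
    public static Money USD(decimal amount) => new(amount, "USD");

    public Money Add(Money other) =>
        Currency == other.Currency
            ? new Money(Amount + other.Amount, Currency)
            : throw new CurrencyMismatchException(Currency, other.Currency);
}
```

The private constructor is the enforcement point: there is no way to build a Money that skips the rounding rule, and no way to add two of them without a currency check.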

6. TDD makes the migration safer, not slower

The objection we heard most: "We don't have time to write tests and refactor." This gets it backwards. You do not have time to refactor without tests. Every minute spent writing a test is a minute not spent debugging a regression in production at 2am.

TDD is not about testing. It is about specification. When you write ProrateFraction_MidMonth_ReturnsHalf() before implementing ProrateFraction(), you are stating what the method must do. The test is the specification. The implementation is the fulfillment. When the test goes green, you know the specification is met. When you refactor, you know the specification is still met. The tests are not overhead. They are the migration's immune system.
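Spelled out as code, the specification-first step looks like this. The stand-in period type is inlined so the sketch is self-contained; in the codebase the test compiles against the real SubscriptionPeriod:

```csharp
using System;
using Xunit;

// Stand-in for the Value Object under test (sketch only).
public sealed record SubscriptionPeriod(DateOnly Start, DateOnly End)
{
    public decimal ProrateFraction(DateOnly asOf)
    {
        var total = End.DayNumber - Start.DayNumber;
        var used = Math.Clamp(asOf.DayNumber - Start.DayNumber, 0, total);
        return (decimal)used / total;
    }
}

public class SubscriptionPeriodTests
{
    [Fact]
    public void ProrateFraction_MidMonth_ReturnsHalf()
    {
        // Arrange: a 30-day April billing period.
        var period = new SubscriptionPeriod(new DateOnly(2024, 4, 1), new DateOnly(2024, 5, 1));

        // Act + Assert: the specification -- written before the implementation.
        Assert.Equal(0.5m, period.ProrateFraction(new DateOnly(2024, 4, 16)));
    }
}
```

Red first (the method does not exist yet), green next (the simplest implementation that passes), then refactor with the test standing guard.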

There is also a subtle psychological benefit. TDD gives you momentum. Red, green, refactor. Each cycle takes minutes. Each cycle produces a passing test that did not exist before. After an hour, you have six new tests and a Value Object that encapsulates logic that was scattered across three services. That feeling of forward progress is what keeps the migration going through the weeks when the structural work feels invisible to stakeholders. The test count is the migration's heartbeat.

We tracked the test count on a dashboard. Week 1: 12 tests (6 [Ignore]d). Week 4: 85 tests. Week 8: 210 tests. Week 16: 380 tests. Week 24: 500+ tests. The curve was not linear -- it accelerated. As more Value Objects and Aggregates were extracted, each one was easier to test than the last because the patterns were established. The first Value Object test took thirty minutes to set up. The fiftieth took five minutes. Infrastructure tests in the new bounded contexts were even faster because the domain was pure: no mocks, no containers, just new and Assert.

7. The Strangler Fig with feature flags is the deployment strategy

Never deploy the new path alone. Deploy both paths behind a feature flag. Route 5% of traffic to the new path. Compare outputs. If they match, increase to 50%, then 100%. If they diverge, investigate. If they diverge catastrophically, flip the flag and you are back on the old path in seconds.

This is the Strangler Fig pattern from Part IV. The old vine and the new vine coexist. The old vine is not cut until the new vine can bear the load. The feature flag is the knife -- and you hold it, not the deployment pipeline.

We used this for the Subscription aggregate in Part IX. The old SubscriptionService.ChangePlan() and the new Subscription.ChangePlan() ran side by side for two weeks. We compared outputs on every request. When they matched for 100,000 consecutive requests, we cut over. Total downtime: zero. Total stress: manageable.
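The compare step of that rollout can be sketched as a small parallel-run harness -- a hand-rolled cousin of GitHub's Scientist library; all names here are illustrative:

```csharp
using System;
using System.Threading.Tasks;

// Run legacy and candidate on the same input. Always return the legacy
// (trusted) result; record divergences for investigation.
public sealed class ParallelRun
{
    private readonly Action<string> _recordDivergence;

    public ParallelRun(Action<string> recordDivergence) =>
        _recordDivergence = recordDivergence;

    public async Task<T> Execute<T>(
        string name, Func<Task<T>> legacy, Func<Task<T>> candidate) where T : notnull
    {
        var trusted = await legacy();
        try
        {
            var experimental = await candidate();
            if (!trusted.Equals(experimental))
                _recordDivergence($"{name}: legacy={trusted} candidate={experimental}");
        }
        catch (Exception ex)
        {
            // A crash in the new path must never affect the response.
            _recordDivergence($"{name}: candidate threw {ex.GetType().Name}");
        }
        return trusted;
    }
}
```

"100,000 consecutive matching requests" is simply this harness recording zero divergences for two weeks.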

The practical shape of the Strangler Fig in code is a router that sends each request down the old path or the new one, depending on the flag:

public async Task<Result<PlanChangeResult>> ChangePlan(
    SubscriptionId id, PlanId newPlan)
{
    if (_featureFlags.IsEnabled("use-new-subscription-aggregate"))
    {
        return await _newHandler.Handle(
            new ChangePlanCommand(id, newPlan));
    }

    // Legacy path — will be removed after cutover
    return await _legacyService.ChangePlan(id.Value, newPlan.Value);
}

The if statement is the entire migration risk surface. When you remove it, you remove the legacy path. When you remove the legacy path, you remove the legacy service. When you remove the legacy service, you remove the legacy project. One if statement, three months of gradual rollout, zero production incidents.

The Strangler Fig is not glamorous. It does not produce satisfying "delete the old code" PRs for weeks. But it is the only pattern that lets you migrate a system that generates revenue without risking that revenue. The alternative -- deploying the new path without the old one -- is a bet. The Strangler Fig is a hedge. In production systems, hedges beat bets.

8. Do not migrate everything

This is the lesson that saves you months. Not every part of the system benefits from DDD. Not every bounded context needs Aggregates and Value Objects and Domain Events.

Notifications in SubscriptionHub is a message dispatcher. It receives events, fills templates, and sends emails. There is no complex domain logic. There are no invariants. There are no business rules worth encapsulating in an Aggregate. Making Notifications a full DDD bounded context with a rich domain model would be over-engineering. It is an Application + Infrastructure context. It has handlers and adapters. It does not need a Domain layer. And that is fine.

Analytics is similar. It reads events, projects them into a warehouse, and serves dashboards. The "domain" is projection logic and SQL queries. A flat Application layer with DTOs and query handlers is appropriate. Forcing DDD patterns onto analytics would add complexity without adding clarity.

DDD is a tool. Use it where the domain is complex enough to justify it. Leave the simple stuff simple. The Cynefin framework is a useful heuristic here: DDD patterns shine in the complex domain (where cause and effect are only evident in retrospect, where you need to probe and respond). For the complicated domain (where good practices suffice), simpler patterns are fine. For the obvious domain (where best practices apply), CRUD is correct.

We spent approximately 80% of our migration effort on two bounded contexts: Subscriptions and Billing. Those are the contexts where the domain is genuinely complex -- state machines, proration, dunning, trial conversions, upgrade/downgrade paths. Notifications and Analytics got clean boundaries and proper ACLs, but no rich domain models. The effort was proportional to the complexity. This is a feature, not a bug. DDD is expensive in terms of structural overhead -- more projects, more interfaces, more abstractions. That expense is only justified when the domain complexity warrants it. If your domain is simple, the overhead costs more than the bugs it prevents. If your domain is complex, the overhead pays for itself in the first week when an invariant catches a bug that would have reached production.

9. The hardest part is organizational, not technical

"One database, multiple DbContexts" needs buy-in from the DBA, the infrastructure team, and all four development teams. "Each team owns a bounded context" needs buy-in from management, product, and the teams themselves. "We are going to add 12 new projects to the solution" needs buy-in from everyone who runs dotnet build daily.

The technical patterns are the easy part. They are well-documented, well-understood, and well-supported by the tooling. The hard part is getting four teams to agree that the Billing team should not query the Subscriptions database directly. The hard part is convincing the DBA that four DbContexts against one database is not insanity. The hard part is persuading a senior developer who has been productive in the mud for three years that the new structure is worth the learning curve.

This is why Event Storming matters beyond its technical value. It is a social tool. When all four teams discover the boundaries together, the boundaries have shared ownership. Nobody imposed them. The teams found them. That changes the conversation from "the architects want us to restructure" to "we discovered we have been stepping on each other's toes, and here is how we stop."

One practical tip: present the migration as reducing friction, not as improving architecture. Nobody outside the development team cares about bounded contexts. But everyone cares about fewer merge conflicts, faster onboarding, and independent deployments. Lead with the outcomes, not the patterns. The patterns are the how. The outcomes are the why. Management approves the why.

Another practical tip: involve the DBA early. In our migration, the biggest single point of friction was not technical -- it was the DBA's initial resistance to "four DbContexts on one database." Their concern was legitimate: who owns schema migrations? What happens if two contexts need to migrate the same table? How do you coordinate deploy order? We solved this by making the DBA part of the Event Storming session. When they saw the dependency graph and the merge conflict statistics, the conversation shifted from "why are you doing this" to "how do we make this work." The DBA became one of the migration's strongest advocates.

10. CQRS is the next step, not a prerequisite

We did not introduce Command Query Responsibility Segregation in this series. Deliberately. CQRS is powerful, but it is not necessary for a DDD migration. You can have Aggregates that handle commands and return DTOs for queries. You can have a single DbContext per bounded context that serves both writes and reads. You can add CQRS later, when the read model genuinely diverges from the write model -- when the dashboard needs a denormalized view that the aggregate does not provide.

If we had tried to introduce CQRS alongside bounded contexts, ACLs, Value Objects, Aggregates, and Domain Events, the migration would have collapsed under its own weight. Each phase adds exactly one concept. That discipline is what makes incremental migration possible.

The sign that you need CQRS will be obvious when it arrives: a dashboard query that requires joining five tables, denormalizing three levels of hierarchy, and computing running totals -- all from the same DbContext that serves your aggregate's write operations. At that point, a dedicated read model with its own query-optimized schema is the right move. But that point is after the DDD migration, not during it.
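When that point arrives, the read side is not more aggregates -- it is a thin query class over a denormalized view. A sketch of the shape, assuming Dapper on the query side and a hypothetical dashboard.revenue_by_plan view kept up to date by event projections:

```csharp
// Query-side sketch: a DTO and a query class that read a
// query-optimized view directly, bypassing the write model entirely.
public sealed record RevenueByPlanRow(string PlanName, int ActiveCount, decimal Mrr);

public sealed class RevenueDashboardQueries
{
    private readonly IDbConnection _db;
    public RevenueDashboardQueries(IDbConnection db) => _db = db;

    public Task<IEnumerable<RevenueByPlanRow>> RevenueByPlan() =>
        _db.QueryAsync<RevenueByPlanRow>(
            "SELECT plan_name, active_count, mrr FROM dashboard.revenue_by_plan");
}
```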

In SubscriptionHub's case, the Analytics context is the closest to needing CQRS. Its queries span aggregates and time ranges in ways that the write model's structure does not serve well. That will be the next migration phase -- but it was not part of this migration. One step at a time.

The broader lesson: resist the urge to add every pattern you know in a single migration. Each pattern has a cost -- structural complexity, learning curve, cognitive overhead. Stack too many patterns and the migration stalls under its own weight. Add one pattern per phase. Let it settle. Let the team internalize it. Then add the next one. Patience is a migration strategy.

11. The Big Ball of Mud returns without enforcement

This is the most important lesson, and it is the bridge to the next section.

You can spend six months migrating to DDD. You can have four bounded contexts, clean dependency arrows, 500 green tests, and a SharedKernel that fits on a napkin. And in month seven, a developer on the Billing team adds a using Subscriptions.Domain.Aggregates; to a billing handler because they need the subscription ID and the fastest path is a direct import.

One using statement. One project reference. One crack in the boundary. In month eight, another developer sees the precedent and adds a second cross-boundary import. "The other handler does it, so it must be okay." By month twelve, there are seven. By month eighteen, a new developer asks, "Why do we have separate projects if they all reference each other?" Six months after that, the boundary is decorative. The mud has returned, wearing a DDD costume.

The project reference graph is the first defense. If Billing.Application does not have a <ProjectReference> to Subscriptions.Domain, the using statement does not compile. But project references are easy to add. A developer adds one to their .csproj, the build passes, the PR gets approved because the reviewer did not check the diff on the project file, and the boundary is breached.

You need a second defense. You need the compiler to enforce the boundaries not just structurally (project references) but semantically (what types can cross which boundaries). You need the architecture to be encoded in the code, not just in the project graph.

Architecture tests (like the NetArchTest examples above) are one layer. They catch violations at test time. But they can be ignored -- a developer can delete the test, or mark it [Ignore], or simply not run the tests locally. The ultimate enforcement is a Roslyn analyzer that emits compiler errors. You cannot ignore a compiler error. You cannot merge a PR that does not compile. The boundary is absolute.

You need code generation.


From Manual DDD to Generated DDD

Everything in this series -- the bounded contexts, the Aggregates, the Value Objects, the Domain Events -- was done manually. We wrote the code by hand. We enforced the rules by convention and code review. We relied on project references to prevent coupling and on discipline to prevent drift.

Manual DDD works. It works well. It transforms legacy codebases into systems you can reason about. But it has a failure mode: it requires permanent discipline. The boundaries are only as strong as the team's commitment to maintaining them. New developers who do not understand the rules will violate them. Tight deadlines will tempt even experienced developers to cut across boundaries "just this once." Over time, entropy wins. This is not a criticism of developers. It is a recognition that conventions maintained by willpower alone have a half-life. That half-life is shorter than you think.

Consider what we had to maintain manually during the migration:

  • Every ValueObject must be a record (immutable, structural equality). Convention.
  • Every Aggregate must have a private constructor and a factory method or builder. Convention.
  • Every domain method must return Result<T> instead of throwing. Convention.
  • Every repository interface must live in the Domain layer, every implementation in Infrastructure. Convention.
  • No Domain class may reference EF Core, no Application class may reference a specific database provider. Convention.
  • Domain events must be published after persistence, not before. Convention.

Six conventions. All reasonable. All documented in the architecture decision records. All explained in the onboarding guide. All violated at some point during the migration, caught by code review, fixed, and violated again by a different developer two weeks later.
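Some of these conventions can be mechanized well before full code generation arrives. The EF Core ban, for instance, as an architecture test -- a sketch using NetArchTest, with namespace and type names assumed:

```csharp
// Architecture test: no Domain type may depend on EF Core.
[Fact]
public void Domain_Does_Not_Reference_EfCore()
{
    var result = Types.InAssembly(typeof(Subscription).Assembly)
        .That().ResideInNamespace("Subscriptions.Domain")
        .ShouldNot().HaveDependencyOn("Microsoft.EntityFrameworkCore")
        .GetResult();

    Assert.True(result.IsSuccessful,
        "Domain types referencing EF Core: " +
        string.Join(", ", result.FailingTypeNames ?? Enumerable.Empty<string>()));
}
```

Still weaker than a compiler error, but strictly better than a wiki page.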

We kept a tally. In the six months of migration, code reviews caught 34 convention violations across 4 teams. That is roughly one per week. Each one required a comment, a discussion, a fix, and a re-review. Some were contentious -- "why can't I just return a string here instead of wrapping it in Result?" -- and consumed team energy on enforcement instead of problem-solving.

The convention debt is invisible but real. Every convention is a rule that exists only in human memory and wiki pages. Every convention violation is a tax on the code review process. Every undetected convention violation is a crack that will widen.

What if these were not conventions at all? What if they were compilation rules?

Not through code reviews. Not through architecture decision records. Not through convention. Through attributes that make boundary violations a build error, and a Roslyn source generator that reads those attributes and produces both the infrastructure code and the analyzers that guard the boundaries.

This is not theoretical. It is the approach behind the [DDD DSL](#content/blog/ddd.md::DDD as a Modeling Discipline) in the Content Management Framework -- a set of C# attributes that express DDD concepts (Aggregate, Entity, Value Object, Domain Event) and a source generator that produces builders, EF configurations, validators, and boundary analyzers from those attributes.

Here is the Subscription aggregate from Part IX, decorated with DDD DSL attributes:

[Aggregate("Subscription")]
public partial class Subscription
{
    [AggregateRoot]
    [Property("Id", "SubscriptionId", IsKey = true)]
    public partial SubscriptionId Id { get; }

    [Property("Status", "SubscriptionStatus")]
    public partial SubscriptionStatus Status { get; }

    [ValueObject]
    [Property("Period", "SubscriptionPeriod")]
    public partial SubscriptionPeriod CurrentPeriod { get; }

    [ValueObject]
    [Property("Price", "Money")]
    public partial Money Price { get; }

    [Entity]
    [Property("UsageRecords", "IReadOnlyList<UsageRecord>")]
    public partial IReadOnlyList<UsageRecord> UsageRecords { get; }

    [DomainEvent]
    public partial PlanChangedEvent WhenPlanChanged(PlanId newPlan, Money newPrice);

    [DomainEvent]
    public partial SubscriptionCancelledEvent WhenCancelled(string reason);

    [Invariant("Active subscriptions must have a current period")]
    private bool HasCurrentPeriod() => Status != SubscriptionStatus.Active
        || CurrentPeriod is not null;

    [Invariant("Price must be non-negative")]
    private bool PriceIsNonNegative() => Price.Amount >= 0;
}

That is the entire domain model definition. From these attributes, the source generator produces:

  • Builder pattern for constructing aggregates in tests (Subscription.Builder().WithPrice(Money.EUR(29.99m)).Build())
  • EF Core configurations (IEntityTypeConfiguration<Subscription>) with owned entity mappings for Value Objects, private setter access, and typed ID conversions
  • Repository interface (ISubscriptionRepository) with standard CRUD plus domain-specific query methods
  • Validation logic that runs every [Invariant] on state transitions and returns Result<T> instead of throwing
  • Event dispatch wiring that publishes domain events after successful persistence
  • Compile-time boundary checks -- if a class in Billing.Domain references Subscription without going through an ACL, the analyzer emits a diagnostic error
  • Architecture documentation -- the generator can emit a Mermaid diagram of the aggregate structure, always in sync with the code, because it is the code

The developer writes 30 lines of attributed C#. The source generator produces 400 lines of infrastructure code. The infrastructure is correct by construction -- it cannot drift from the model because it is derived from the model. When the model changes -- a new property, a new invariant, a new event -- the generated code changes automatically on the next build. No manual synchronization. No "update the EF config to match the entity" step that someone forgets.

But the real value is not productivity. It is enforcement.

If a developer adds a public setter to a Value Object property, the analyzer emits DDD001: Value Objects must be immutable. Build fails.

If a developer adds a navigation property from Invoice (in Billing.Domain) to Subscription (in Subscriptions.Domain), the analyzer emits DDD002: Cross-aggregate navigation properties are not allowed. Build fails.

If a developer creates an entity without designating an aggregate root, the analyzer emits DDD003: Every aggregate must have exactly one root. Build fails.

If a developer calls subscription.Status = SubscriptionStatus.Cancelled instead of subscription.Cancel(), the analyzer emits DDD004: State changes must go through domain methods. Build fails.

Four rules. Four error codes. Each one prevents a category of bugs that code reviews catch inconsistently and that manual conventions do not prevent at all. The rules are not punitive -- they are instructive. Each error message tells the developer what to do instead. The compiler is not just saying "no." It is saying "not that way -- this way."
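What DDD004 steers the developer toward is the Part IX shape: a domain method that owns the transition. Roughly (the Result factory names are assumed):

```csharp
// The compliant form: state changes go through a domain method that
// guards the transition, raises the event, and returns Result.
public Result Cancel(string reason)
{
    if (Status == SubscriptionStatus.Cancelled)
        return Result.Failure("Subscription is already cancelled.");

    Status = SubscriptionStatus.Cancelled; // private setter, internal use only
    RaiseDomainEvent(new SubscriptionCancelledEvent(Id, reason));
    return Result.Success();
}
```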

Here is what that looks like in practice. A new developer joins the Billing team. They need to check whether a subscription is on a trial. The fastest path is to add a project reference to Subscriptions.Domain and query the aggregate directly:

// Billing.Application/Handlers/GenerateInvoiceHandler.cs
using Subscriptions.Domain.Aggregates; // <-- the breach

public class GenerateInvoiceHandler
{
    private readonly ISubscriptionRepository _subscriptions; // <-- wrong context

    public async Task<Result<Invoice>> Handle(GenerateInvoiceCommand cmd)
    {
        var sub = await _subscriptions.GetById(cmd.SubscriptionId);
        if (sub.IsOnTrial) // <-- cross-boundary query
            return Result.Skip("Trial subscriptions are not billed");
        // ...
    }
}

Without code generation, this compiles. The code review might catch it, might not. With the DDD analyzer, dotnet build produces:

error DDD002: Type 'Subscription' from aggregate 'Subscriptions' cannot be
              referenced from bounded context 'Billing'. Use an ACL or
              domain event instead.
  --> Billing.Application/Handlers/GenerateInvoiceHandler.cs:2

The build fails. The developer learns the pattern. They subscribe to TrialStartedEvent instead, or they query through a Billing-owned read model that the Subscriptions context populates via events. The boundary is preserved not by discipline but by the toolchain. The developer learns the correct pattern in seconds, not after a code review round-trip that takes hours. The feedback loop is immediate, precise, and educational.
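The read-model variant of that fix looks roughly like this: a Billing-owned projection that tracks trial state from Subscriptions events. TrialEndedEvent and the store interface are assumed names:

```csharp
// Billing-owned projection: the Billing context keeps its own trial
// flag, updated by events, and never touches the Subscriptions model.
public sealed class TrialStatusProjection :
    IEventHandler<TrialStartedEvent>,
    IEventHandler<TrialEndedEvent>
{
    private readonly IBillingReadModelStore _store;
    public TrialStatusProjection(IBillingReadModelStore store) => _store = store;

    public Task Handle(TrialStartedEvent e) =>
        _store.SetTrialFlag(e.SubscriptionId, isOnTrial: true);

    public Task Handle(TrialEndedEvent e) =>
        _store.SetTrialFlag(e.SubscriptionId, isOnTrial: false);
}
```

GenerateInvoiceHandler then asks the Billing read model whether the subscription is on trial, and DDD002 never fires.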

The boundaries that we spent six months establishing in this series become permanent. Not permanent because the team agreed to maintain them. Permanent because the compiler rejects code that violates them. The Big Ball of Mud cannot return, because the build system will not let it compile. The entropy that created the original mud ball -- reasonable local decisions compounding into unreasonable global architecture -- is blocked at the compiler level. Each local decision is checked against the global rules. The compiler is the architect that never goes on vacation.

[Diagram: Manual DDD requires discipline. Generated DDD requires only attributes. The compiler does the rest.]

The transition from manual to generated DDD is itself incremental. You do not rewrite your aggregates overnight. You add [Aggregate] to one class. You see the builder and EF config appear in the generated output. You verify they match your hand-written versions. You delete the hand-written versions. You move to the next aggregate. One class at a time, the codebase shifts from hand-maintained infrastructure to generated infrastructure. The domain model stays the same. The attributes are additive. The generated code is a superset of what you wrote by hand.

This is the bridge from this series to the broader work on model-driven engineering and the Content Management Framework. The DDD patterns we applied manually in this migration -- Aggregates, Value Objects, Entities, Domain Events, Repositories -- are formalized as M2 metamodel concepts in the CMF's [DDD DSL](#content/blog/ddd.md::DDD as a Modeling Discipline). The source generator is the M2-to-M0 compiler: it reads the attributed model (M1) and produces running code (M0). The quality gates ensure that the generated code meets the same standards as hand-written code.

There is a philosophical point here worth making explicit. DDD is often presented as a design methodology -- a way to think about software before you write it. And it is. But when DDD concepts become formal metamodel elements backed by source generation, DDD becomes something more: it becomes a constraint system. The constraints are not just social ("we agree that Aggregates should enforce invariants") or architectural ("the project graph prevents cross-context references"). They are computational. The compiler understands the domain model. The compiler enforces the domain rules. The compiler is the DDD police, and it never sleeps, never gets tired, and never approves a PR "just this once."

This is the natural endpoint of the journey from mud to model. First you discover the model (Event Storming). Then you implement the model (manual DDD). Then you encode the model (source generation). Each step makes the model more real, more enforceable, more durable.

The journey from Big Ball of Mud to DDD is a journey from implicit to explicit. From scattered rules to encapsulated invariants. From shared mutable state to isolated bounded contexts. From "I think this is how it works" to "the test proves this is how it works."

Code generation is the final step of that journey: from "we agree to maintain these boundaries" to "the compiler maintains these boundaries for us."

But let me be clear: you do not need code generation to succeed with DDD. This series did not use it. SubscriptionHub was migrated entirely with manual code, manual conventions, and manual code reviews. The migration succeeded. The system is better. The teams are happier. Code generation is the next step -- the step that makes the success permanent without requiring permanent vigilance.

If you are starting a DDD migration today, start manual. Learn the patterns by hand. Feel the friction of maintaining conventions. Understand why a Value Object must be immutable before you let the compiler enforce it for you. Then, when the domain model is stable and the team has internalized the rules, add the attributes. Let the generator take over the enforcement. Your hands are free. The compiler is watching.

The migration from manual DDD to generated DDD is itself a migration from mud to model -- just at a higher level of abstraction. First you had implicit conventions maintained by willpower. Then you have explicit rules maintained by the compiler. The pattern repeats because the principle is universal: make the implicit explicit, then make the explicit enforced.

And that principle is the thread that runs through the entire series. Event Storming makes implicit boundaries explicit. Bounded context libraries make explicit boundaries enforced. Value Objects make implicit domain concepts explicit. Aggregates make implicit invariants enforced. Domain Events make implicit communication explicit. Code generation makes explicit conventions enforced. Every step, the same motion: find what is hidden, name it, encode it, enforce it. That is the entire methodology, distilled to one sentence.


What Is Left

Before the call to action, an honest accounting of what is not done.

SubscriptionHub is not a textbook DDD codebase. There are still areas of technical debt. The Analytics context has projection logic that would benefit from a proper read model. The Notifications context has email templates embedded in C# string literals that should be externalized. The SharedKernel's Result<T> type does not yet support monadic composition (.Bind(), .Map()) -- we use if checks, which is verbose but readable. The Transactional Outbox from Part X works but uses a polling mechanism that adds latency; a push-based approach with database notifications would be better.
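One of those debts is small enough to sketch here. The monadic composition the Result<T> type lacks is two extension methods (assuming IsSuccess, Value, and Error members, plus Success/Failure factories):

```csharp
// Monadic extensions for Result<T>: chain steps without nested if checks.
public static class ResultExtensions
{
    public static Result<TOut> Bind<TIn, TOut>(
        this Result<TIn> result, Func<TIn, Result<TOut>> next) =>
        result.IsSuccess ? next(result.Value) : Result<TOut>.Failure(result.Error);

    public static Result<TOut> Map<TIn, TOut>(
        this Result<TIn> result, Func<TIn, TOut> transform) =>
        result.IsSuccess
            ? Result<TOut>.Success(transform(result.Value))
            : Result<TOut>.Failure(result.Error);
}
```

A handler then reads as Validate(cmd).Bind(LoadSubscription).Map(ToDto) instead of three nested if checks.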

These are known issues. They are documented. They are prioritized. And critically, they are addressable -- because each one is scoped to a single bounded context, a single layer, a single concern. In the old system, every improvement was entangled with everything else. In the new system, each improvement is isolated. The Analytics team can add a read model without affecting Billing. The Platform team can externalize templates without affecting Subscriptions. The migration gave us the ability to improve incrementally. It did not give us perfection. Perfection is not the goal. Evolvability is.


Start Tomorrow

Let me circle back to where we started.

In Part I, I wrote: "Nobody designs a Big Ball of Mud. It grows." That is still true. And it is not a moral failing. The Big Ball of Mud is what happens when you ship software successfully for years without formalizing the boundaries that emerged during those years. SubscriptionHub made money. It served customers. It survived six years and four teams. That is a success -- a success that created a maintenance problem, but a success nonetheless.

The question is not whether you have legacy code. Every team that has shipped software for more than two years has legacy code. The question is whether you have a strategy for evolving it. Whether the next feature makes the codebase slightly better or slightly worse. Whether entropy is accelerating or decelerating. Whether you are adding to the mud or extracting from it.

There is a principle from the Boy Scout Rule: "Leave the code better than you found it." DDD gives that principle teeth. Every PR is an opportunity to extract a Value Object, formalize an ACL, or move an entity into its bounded context. You do not need a migration project. You need a migration habit.

DDD provides the tools:

  • Event Storming for discovering the boundaries that already exist in your team's mental models.
  • Bounded Context Libraries for making those boundaries compile-time constraints that the project graph enforces.
  • Anti-Corruption Layers for keeping old and new code coexisting without contaminating each other.
  • Value Objects for replacing primitive obsession with a typed vocabulary that the compiler checks.
  • Aggregates for encapsulating business rules in domain objects instead of scattering them across service classes.
  • Domain Events for decoupling contexts so they communicate without coupling.
  • Tests for the confidence to refactor without fear.
  • Code Generation (eventually) for making the boundaries permanent without requiring permanent vigilance.

The migration is incremental. Each phase delivers value. Each phase makes the next one easier. You do not need a rewrite budget. You do not need management approval for a six-month project. You need a two-hour Event Storming session and a new .csproj.

And here is the thing nobody tells you about DDD migrations: the hardest step is the first one. Not technically -- technically, creating an empty .csproj is trivial. Emotionally. Admitting that the current architecture is not serving the team. Admitting that the system you built and maintained for years needs structural work. Admitting that "we've always done it this way" is not a reason to keep doing it this way.

The Big Ball of Mud is not a moral failing. It is the natural state of software that has been shipping successfully for years. But choosing to leave it as-is, when you have the tools and the knowledge to evolve it, is a choice. You now have the tools. You have the knowledge. This series gave you both -- the strategic patterns for discovery and planning, and the tactical patterns for execution.

Make the other choice.

Start tomorrow.

Not next sprint. Not after the next release. Not when things calm down. Tomorrow. Because things will not calm down. There will always be a feature to ship, a bug to fix, a deadline to meet. The Big Ball of Mud grows in the gaps between those deadlines. If you wait for a gap large enough to "do it properly," you will wait forever. The migration must happen alongside feature delivery, not instead of it.

Here is a concrete five-step plan for your first week:

Day 1: Book a 2-hour Event Storming session on your worst God Service. The one with the most lines, the most dependencies, the most merge conflicts. Put sticky notes on a wall. Invite everyone who touches that code. Argue about what happens after PaymentFailed. Discover the bounded contexts hiding inside the 2,400-line class. You will not finish in two hours. You will have enough to start.

Day 2: Create the first bounded context library. Billing.Domain. Zero NuGet packages. One folder. One namespace. Do not move any code yet. Just create the .csproj and add it to the solution. The boundary now exists in the project graph, even if it is empty. That empty project is the most important artifact of the migration.

Day 3: Write one characterization test. Pick the God Service's most important public method. Capture its output for one representative input. Store the output as a snapshot. Assert against it. You now have a safety net for one scenario. It is better than the zero scenarios you had before.

Day 4: Move one entity. Pick the simplest entity in the God Service's DbContext. Move it to the new Domain library. Update the project references. Run the characterization test. It should still pass. If it does not, you learned something important about hidden coupling.

Day 5: Extract one Value Object. Money. Two fields -- Amount and Currency -- become one record. Write the test first: Money.Add(Money.EUR(10), Money.EUR(20)) should equal Money.EUR(30). Implement. Map the entity properties. Run the characterization test. Commit.
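The Day 5 record, in its minimal form -- enough to make that test pass. The real Money grows rounding and allocation rules later, and the migration eventually converts the throw into a Result:

```csharp
// Minimal Money Value Object: immutable record, structural equality,
// currency-safe addition. A first cut, not the final version.
public sealed record Money(decimal Amount, string Currency)
{
    public static Money EUR(decimal amount) => new(amount, "EUR");

    public static Money Add(Money a, Money b) =>
        a.Currency == b.Currency
            ? a with { Amount = a.Amount + b.Amount }
            : throw new InvalidOperationException(
                $"Cannot add {a.Currency} to {b.Currency}.");
}
```

Because records use structural equality, Money.Add(Money.EUR(10), Money.EUR(20)) equals Money.EUR(30) with no extra code.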

One week. Five small changes. No production risk. No rewrite budget. No permission needed.

By the end of that first week, you will have something the Big Ball of Mud never had: a direction. An empty project structure that says "this is where we are going." A characterization test that says "this is what we must not break." An entity in a domain library that says "this is how the future looks." A Value Object that says "this is what the domain really means."

That direction, once established, is self-reinforcing. Every new feature becomes a choice: add it to the old God Service, or add it to the new bounded context. The new bounded context is easier to test, easier to reason about, and easier to review. Developers choose it naturally. The Strangler Fig grows not because you mandate it, but because the new structure is better.

In our migration, three months in, something shifted. New features started being designed for the bounded context structure. Developers were not just migrating old code -- they were writing new code that belonged in bounded contexts from the start. A billing developer proposed a new dunning strategy and, without being asked, opened the PR with a test in Billing.Domain.Tests, a domain method on the DunningSchedule Value Object, and a handler in Billing.Application. No one told them to structure it that way. The structure suggested it. The tests rewarded it. The code review approved it in ten minutes.

That is the moment the migration wins. Not when the last legacy service is deleted. When the first new feature is designed for the new structure without anyone having to ask.

The journey of a thousand refactorings begins with a single .csproj.


The Full Journey

Looking back, the series tells a story in three acts. Act I (Parts I-IV) is diagnosis and planning: understand the disease, name the pathologies, discover the boundaries, plan the strategy. No code changes. Only understanding. Act II (Parts V-VIII) is the structural foundation: write the safety net, create the projects, fix the boundaries, extract the vocabulary. The system is better, but the behavior has not changed. The same code runs; it just lives in better places. Act III (Parts IX-XI) is the transformation: build the aggregates, introduce the events, assess the final state. Behavior changes. The domain model is alive. The old services are dead.

Each act builds on the one before. You cannot do Act II without the understanding from Act I. You cannot do Act III without the structure from Act II. The ordering is not arbitrary. It is the ordering that minimizes risk at every step.

If you read only one act, read Act I. Understanding the disease is more valuable than knowing the cure. If you understand why the mud exists, you can prevent the next generation of mud from growing. If you only know the tactical patterns without understanding the forces that create legacy code, you will apply the patterns to the wrong problems -- or worse, apply them correctly and then watch the gains erode because the organizational forces that created the mud are still operating.

This is the path we walked, from diagnosis to the final state:

| Part | Title | What We Did |
|------|-------|-------------|
| Part I | The Disease | Met SubscriptionHub. Named the Big Ball of Mud. Understood why it grew. |
| Part II | The Six Pathologies | Diagnosed God Services, anemic models, scattered rules, infrastructure coupling, leaky boundaries, no aggregate boundaries. |
| Part III | Discovering Boundaries | Applied Big Picture and Design-Level Event Storming. Found 4 bounded contexts, named the aggregates, discovered the events. |
| Part IV | The Migration Strategy | Chose the Strangler Fig. Defined 6 phases. Established the principle: each phase delivers independently. |
| Part V | Tests First | Wrote characterization tests. Built the golden master. Established the safety net for everything that follows. |
| Part VI | Bounded Context Libraries | Created 12 projects (4 contexts x 3 layers). Killed Common. Split the DbContext. Made the compiler enforce boundaries. |
| Part VII | Fix and Formalize ACLs | Wrapped Stripe behind a domain interface. Gave Notifications its own read models. Stopped the bleeding at every boundary crossing. |
| Part VIII | Extract Value Objects | Extracted Money, SubscriptionPeriod, EmailAddress, TaxRate under TDD. Three proration bugs became one correct method. |
| Part IX | Build Aggregates | Built the Subscription aggregate with invariants, typed IDs, Result returns, and domain events. Strangler Fig with feature flags. |
| Part X | Domain Events | Broke circular dependencies with events. PlanChangedEvent, PaymentFailedEvent, InvoiceGeneratedEvent. Transactional Outbox for reliable cross-context communication. |
| Part XI | The Final State | Before and after across 12 dimensions. 11 hard-won lessons. The bridge to code generation and permanent enforcement. |
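
The Part VIII payoff in that table -- three proration bugs becoming one correct method -- is easiest to see in code. Here is a minimal sketch of the idea, assuming a `Money` Value Object along these lines (the member names and rounding rule are illustrative, not SubscriptionHub's actual code): once the arithmetic lives on the type, there is exactly one place proration can happen.

```csharp
using System;

// Illustrative sketch only: an assumption in the spirit of Part VIII,
// not the series' actual implementation.
var prorated = Money.Usd(29.99m).Prorate(daysUsed: 10, daysInPeriod: 30);
Console.WriteLine(prorated.Amount); // 10.00 -- one rounding rule, applied in one place

// A Value Object: equality by value (via record struct), rounding owned
// by the type, so proration arithmetic cannot be re-implemented ad hoc.
public readonly record struct Money(decimal Amount, string Currency)
{
    public static Money Usd(decimal amount) => new(decimal.Round(amount, 2), "USD");

    public Money Prorate(int daysUsed, int daysInPeriod)
    {
        if (daysInPeriod <= 0 || daysUsed < 0 || daysUsed > daysInPeriod)
            throw new ArgumentOutOfRangeException(nameof(daysUsed));
        return new Money(decimal.Round(Amount * daysUsed / daysInPeriod, 2), Currency);
    }
}
```

Because `Money` is a `readonly record struct`, two instances with the same amount and currency compare equal by value -- the defining property of a Value Object.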

Eleven parts. Six phases. One system transformed.

SubscriptionHub still makes money. It still serves customers. It still ships features. But the next feature takes days instead of weeks. The next bug is found in one file instead of three. The next deploy affects one bounded context instead of the entire system. The next new hire reads one context instead of the whole codebase.

SubscriptionHub started as 80,000 lines across 7 projects with no boundaries. It ended as approximately 60,000 lines across 15 projects with strict boundaries. Twenty thousand lines disappeared. The line count went down because duplication was eliminated -- three proration implementations became one, four validation methods became one aggregate, and auditing Common's 47 classes revealed 23 that nobody used. Twelve hundred lines of mapping code evaporated when Value Objects made the mapping unnecessary. Eight hundred lines of null checks vanished when typed IDs and Result returns made null states unrepresentable. The project count went up because structure has weight. That is the right trade: fewer lines, more projects, each project doing exactly one thing.
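
The claim about null checks deserves a concrete sketch. Assuming a typed ID and a minimal `Result` type in the spirit of Part IX (all names here are illustrative, not SubscriptionHub's actual API), a bare `Guid` can no longer be passed where a `SubscriptionId` is expected, and failure comes back as a value the caller pattern-matches rather than a null to forget:

```csharp
using System;

// Illustrative sketch only: the names below are assumptions in the
// spirit of Part IX, not SubscriptionHub's actual types.
var message = ParseSubscriptionId("not-a-guid") switch
{
    Result<SubscriptionId>.Ok ok => $"loaded {ok.Value.Value}",
    Result<SubscriptionId>.Error err => err.Reason, // failure is a value, not a null
    _ => throw new InvalidOperationException(),
};
Console.WriteLine(message);

static Result<SubscriptionId> ParseSubscriptionId(string raw) =>
    Guid.TryParse(raw, out var guid)
        ? new Result<SubscriptionId>.Ok(new SubscriptionId(guid))
        : new Result<SubscriptionId>.Error($"'{raw}' is not a subscription id");

// A typed ID: the compiler rejects a CustomerId (or a raw Guid) where a
// SubscriptionId is expected.
public readonly record struct SubscriptionId(Guid Value);

// A minimal closed Result hierarchy: success and failure are both
// ordinary values, so there is no null state to check for.
public abstract record Result<T>
{
    private Result() { }
    public sealed record Ok(T Value) : Result<T>;
    public sealed record Error(string Reason) : Result<T>;
}
```

The private constructor closes the hierarchy: `Ok` and `Error` are the only possible cases, which is what lets callers match exhaustively instead of defensively null-checking.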

The Big Ball of Mud grew because the system succeeded. The domain model emerged because the team decided that success and structure are not opposites.

They are not. They never were.

The Big Ball of Mud is dead. Long live the domain model.