Run a phased migration
Move from old to new in controlled slices. Each slice has an explicit owner, cutover criteria, a rollback path, and a plan to retire the old path.
- Situation
- You need to replace or move a live system without stopping delivery.
- Goal
- Reduce migration risk by replacing behavior incrementally instead of betting everything on one cutover.
- Do not use when
- the target system is still conceptually undefined
- Primary owner
- tech lead
- Roles involved
- tech lead, architect, delivery lead, service owner, QA or quality lead, operations or platform owner, product owner (if user-facing impact exists)
Context
The situation
Deciding whether to reach for this playbook: when it fits, and when it doesn't.
Use when
Conditions where this playbook is the right tool.
- A legacy system must be replaced or decomposed
- A monolith capability is moving into a new service or platform
- A schema, provider, or runtime migration is unavoidable
- The existing system is painful but still business-critical
Do not use when
Contexts where this playbook will waste effort or make things worse.
- The target system is still conceptually undefined
- No one can name the first migration slice
- The team is secretly trying to bundle a rewrite, redesign, and platform change together
- There is no realistic way to run old and new behavior side by side, even temporarily
Stakes
Why this matters
What this playbook protects against, and why skipping or half-running it tends to be expensive.
Migrations commonly fail when ambition outruns displacement: the new system keeps growing while little legacy behavior is actually retired. A phased migration keeps the team focused on moving real behavior, not just producing new code.
Quality bar
What good looks like
The observable qualities of a team or system that is actually doing this well, not just going through the motions.
Signs of the playbook done well
- The migration is divided into business-meaningful slices
- Old and new paths are both observable during transition
- Every slice has explicit entry and exit criteria
- Teams can say what was displaced, not just what was built
- The legacy surface shrinks over time in visible ways
Preparation
Before you start
What you need available and true before running the procedure. Skipping this is a common reason playbooks fail.
Inputs
Material you'll want to gather first.
- Current system map
- Dependency inventory
- Business-critical workflows
- Cutover constraints
- Rollback options
- Operational readiness expectations
Prerequisites
Conditions that should be true for this to work.
- Shared understanding of why migration is needed
- Minimum observability on the current path
- Named owners for both source and target behavior
- Agreement on what counts as displacement
Procedure
The procedure
Each step carries its purpose (why it exists), its actions (what you do), and its outputs (what you produce). Read the purpose. It's what keeps the step from degenerating into checklist theatre.
Define the migration unit
Turn the migration into slices that move real behavior instead of abstract layers.
Actions
- Map the current system by business capability, not only by code structure
- Identify the smallest slice that can be moved and verified independently
- Write down what remains in the old system after that slice moves
Outputs
- Migration slice map
- First slice definition
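One way to keep a slice definition honest is to write it down as structured data and check it for gaps. The sketch below is illustrative only: the `MigrationSlice` class, its field names, and the invoice example are assumptions made for this example, not a prescribed format.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationSlice:
    """One business-meaningful unit of migration (all field names are hypothetical)."""
    name: str
    capability: str                                   # business capability being moved
    owner: str                                        # named owner for the slice
    entry_criteria: list[str] = field(default_factory=list)
    exit_criteria: list[str] = field(default_factory=list)
    remains_in_legacy: list[str] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """Return the reasons this slice definition is still incomplete."""
        problems = []
        if not self.owner:
            problems.append("no named owner")
        if not self.entry_criteria:
            problems.append("no entry criteria")
        if not self.exit_criteria:
            problems.append("no exit criteria")
        if not self.remains_in_legacy:
            problems.append("legacy remainder not written down")
        return problems


# Hypothetical first slice for an invoicing system.
first_slice = MigrationSlice(
    name="invoice-pdf-rendering",
    capability="Customers can download invoices as PDF",
    owner="billing-team",
    entry_criteria=["legacy renderer behavior inventoried"],
    exit_criteria=["all PDF downloads served by the new path"],
    remains_in_legacy=["invoice calculation", "tax rules"],
)
print(first_slice.gaps())
```

An empty list from `gaps()` only means the definition is complete on paper; whether the slice is small enough to move and verify independently is still a judgment call.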
Make legacy behavior explicit
Avoid losing hidden behavior that nobody remembered until production proves it mattered.
Actions
- Capture current inputs, outputs, side effects, edge cases, and operational expectations
- Review incident history for hidden assumptions
- Identify consumers that depend on undocumented behavior
Outputs
- Behavior inventory
- Known edge-case list
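A behavior inventory can be seeded by running the legacy code over known inputs and recording what it actually does, including errors. This is a minimal sketch of that idea; `capture_behavior`, `legacy_discount`, and the discount rules are hypothetical stand-ins for whatever the real legacy path is.

```python
import json
from typing import Any, Callable


def capture_behavior(legacy_fn: Callable[..., Any], cases: list[dict]) -> list[dict]:
    """Run the legacy function over known inputs and record the observed behavior."""
    inventory = []
    for case in cases:
        try:
            observed = {"output": legacy_fn(**case["input"])}
        except Exception as exc:  # an error is behavior that consumers may depend on
            observed = {"raises": type(exc).__name__}
        inventory.append({"note": case["note"], "input": case["input"], **observed})
    return inventory


# Hypothetical legacy function with an undocumented rule:
# a blank code silently means "no discount" instead of failing.
def legacy_discount(total: float, code: str) -> float:
    if code.strip() == "":
        return total
    if code == "VIP":
        return round(total * 0.9, 2)
    raise ValueError(f"unknown code {code!r}")


cases = [
    {"note": "documented VIP discount", "input": {"total": 100.0, "code": "VIP"}},
    {"note": "blank code, undocumented", "input": {"total": 100.0, "code": "  "}},
    {"note": "unknown code", "input": {"total": 100.0, "code": "SPRING"}},
]
inventory = capture_behavior(legacy_discount, cases)
print(json.dumps(inventory, indent=2))
```

The recorded inventory can later serve as the expected results when validating the new path, which is why edge cases from incident reviews are worth adding as extra cases.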
Design cutover and coexistence
Ensure the team can compare, phase, and reverse safely.
Actions
- Define how old and new paths will coexist
- Decide what traffic, data, or users move first
- Define rollback conditions and rollback mechanics
Outputs
- Cutover strategy
- Rollback approach
- Coexistence model
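Coexistence, phased movement, and rollback can be expressed as a small routing decision. The sketch below assumes a percentage-based rollout keyed on a user ID plus a rollback switch; `CutoverConfig`, `bucket`, and `choose_path` are hypothetical names, and a real system would likely read this configuration from a feature-flag or config service.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CutoverConfig:
    new_path_percent: int = 0   # 0..100, share of users routed to the new path
    rollback: bool = False      # rollback switch: send everyone back to legacy


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in 0..99 so routing does not flip between requests."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_path(user_id: str, config: CutoverConfig) -> str:
    """Decide which path serves this user under the current cutover settings."""
    if config.rollback:
        return "legacy"
    return "new" if bucket(user_id) < config.new_path_percent else "legacy"


# Hypothetical usage: start with 10% of users, keep rollback one setting away.
config = CutoverConfig(new_path_percent=10)
print(choose_path("customer-42", config))
```

Hashing the user ID rather than picking randomly per request keeps each user on one path, which makes parity comparisons and support conversations easier. The rollback switch covers the mechanics only; the conditions for flipping it still need to be agreed in advance.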
Move one slice and retire one slice
Prevent endless parallel ownership.
Actions
- Implement the target slice
- Validate parity and operational behavior
- Retire or isolate the old slice explicitly once confidence is sufficient
Outputs
- Migrated slice
- Legacy retirement note
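Validating parity usually means running old and new paths on the same inputs and listing where they disagree. The following is a rough sketch under that assumption; `observe`, `parity_mismatches`, and the two currency-conversion functions are invented for illustration.

```python
from typing import Any, Callable, Iterable


def observe(fn: Callable[..., Any], kwargs: dict) -> tuple:
    """Return a comparable record of what a path did, treating exceptions as behavior."""
    try:
        return ("ok", fn(**kwargs))
    except Exception as exc:
        return ("error", type(exc).__name__)


def parity_mismatches(
    legacy_fn: Callable[..., Any],
    new_fn: Callable[..., Any],
    inputs: Iterable[dict],
) -> list[dict]:
    """Run both paths on the same inputs and collect every disagreement."""
    mismatches = []
    for kwargs in inputs:
        legacy_result = observe(legacy_fn, kwargs)
        new_result = observe(new_fn, kwargs)
        if legacy_result != new_result:
            mismatches.append({"input": kwargs, "legacy": legacy_result, "new": new_result})
    return mismatches


# Hypothetical pair of implementations: the legacy path truncates, the new one rounds.
def legacy_to_cents(amount: float) -> int:
    return int(amount * 100)


def new_to_cents(amount: float) -> int:
    return round(amount * 100)


gaps = parity_mismatches(
    legacy_to_cents,
    new_to_cents,
    [{"amount": 2.0}, {"amount": 0.5}, {"amount": 1.999}],
)
print(gaps)
```

A mismatch is not automatically a bug in the new path. Each one is a decision: mirror the legacy behavior, or change it deliberately and tell the consumers who depend on it.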
Measure displacement, not activity
Keep the program honest.
Actions
- Track legacy dependency reduction
- Track which users, workflows, or traffic moved
- Review whether old operational load actually fell
Outputs
- Displacement dashboard
- Migration progress review
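A displacement dashboard can start as a very small report that separates "moved" from "retired". The sketch below is one possible shape, assuming each slice reports its traffic share on the new path and whether the legacy side has been retired; `SliceStatus`, `displacement_report`, and the sample slices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SliceStatus:
    name: str
    traffic_on_new_percent: float   # share of traffic or workflows on the new path
    legacy_retired: bool            # True only once the old path is actually gone


def displacement_report(slices: list[SliceStatus]) -> dict:
    """Summarize what has been displaced, not what has been built."""
    total = len(slices)
    retired = sum(1 for s in slices if s.legacy_retired)
    # Fully moved but not retired: the team is paying for two paths.
    duplicated = [
        s.name for s in slices
        if s.traffic_on_new_percent >= 100 and not s.legacy_retired
    ]
    return {
        "slices_total": total,
        "slices_retired": retired,
        "retired_percent": round(100 * retired / total, 1) if total else 0.0,
        "moved_but_not_retired": duplicated,
    }


# Hypothetical program state.
slices = [
    SliceStatus("invoice-pdf", 100.0, True),
    SliceStatus("payment-reminders", 100.0, False),
    SliceStatus("tax-export", 20.0, False),
]
report = displacement_report(slices)
print(report)
```

The `moved_but_not_retired` list is the part worth reviewing regularly, since it names the slices where behavior is duplicated rather than displaced.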
Judgment
Judgment calls and pitfalls
The places where execution actually diverges: decisions that need thought, questions worth asking, and mistakes that recur regardless of good intent.
Decision points
Moments where judgment and trade-offs matter more than procedure.
- What is the first slice: technical seam or business capability?
- Do we mirror behavior first or improve behavior during the move?
- Can old and new paths run in parallel, or do we need progressive cutover?
- What level of parity is required before retirement?
Questions worth asking
Prompts to use on yourself, the team, or an AI assistant while running the procedure.
- What is the smallest migration slice that removes real legacy responsibility?
- What hidden behaviors does the current system provide that nobody wrote down?
- How will we prove this slice is displaced rather than duplicated?
Common mistakes
Patterns that surface across teams running this playbook.
- Making the slice too technical and not user- or behavior-meaningful
- Adding net-new product ambition before parity is reached
- Measuring generated code instead of displaced legacy behavior
- Keeping the old path alive indefinitely out of vague caution
- Assuming undocumented legacy behavior is accidental
Warning signs you are doing it wrong
Signals that the playbook is being executed but not landing.
- Nobody can name what has been retired so far
- The new system keeps expanding in scope while the old one stays fully alive
- Migration status sounds impressive but no user or workload has actually moved
- Teams describe the migration using architecture language only, not business behavior
Outcomes
Outcomes and signals
What should exist after the playbook runs, how you'll know it worked, and what to watch for over time.
Artifacts to produce
Durable outputs the playbook should leave behind.
- Migration slice map
- Behavior parity checklist
- Cutover plan
- Rollback plan
- Legacy retirement record
Success signals
Observable changes that mean the playbook landed.
- Specific legacy paths are retired on a steady cadence
- Support and operational burden drops on the source system
- Cutover decisions are made with evidence, not optimism
- Stakeholders can see what has actually moved
Follow-up actions
Moves that keep the playbook's effects compounding after it finishes.
- Clean up coexistence code quickly after each slice
- Review whether the target design still fits reality after each major slice
- Update ownership and runbooks as the center of gravity shifts
Metrics or signals to watch
Longer-horizon indicators that the underlying problem is receding.
- Legacy dependency count
- Percentage of traffic or workflows on new path
- Rollback frequency
- Incidents caused by parity gaps
- Time spent maintaining dual paths
AI impact
AI effects on this playbook
How AI-assisted and AI-driven workflows help execution, and the ways they can make it worse.
AI can help with
Where AI tooling genuinely reduces the cost of running this playbook well.
- Mapping legacy code paths and dependencies
- Summarizing incident history around the old system
- Generating migration checklists and parity matrices
- Finding likely consumers of hidden behaviors
AI can make it worse by
Distortions AI introduces that make the underlying problem harder to see.
- Accelerating speculative replacement code before behavior is understood
- Making parallel systems grow faster than the team can validate
- Creating false confidence through polished migration documentation
AI synthesis
Use AI heavily for discovery, inventory, and comparison. Use it cautiously for replacement logic unless behavior is already well bounded.
Relationships
Connected playbooks
Failure modes this playbook tends to address, decisions behind the situation, red flags that motivate running it, and neighboring playbooks.