The Hard Parts.dev
EP-18 · Delivery · Engineering Playbook

Start a rewrite safely

Before approving a rewrite, force clarity on the actual problem, what must be preserved, what will be displaced first, and how success will be measured beyond cleaner code.

Difficulty: high
Time horizon: 2 to 6 weeks to frame safely before broad commitment
Primary owner: architect
Confidence: high

At a glance (EP-18)

Situation: The current system is painful enough that a rewrite is being seriously considered.
Goal: Prevent a justified frustration from becoming an uncontrolled second system.
Do not use when: The rewrite is mostly a morale move.
Primary owner: architect
Roles involved

  • Tech lead
  • Architect
  • Engineering manager
  • Product owner or sponsor
  • Senior maintainers of legacy system

Context

The situation

Deciding whether to reach for this playbook: when it fits, and when it doesn't.

Use when

Conditions where this playbook is the right tool.

  • The team keeps proposing a fresh start
  • Maintenance pain is real and recurring
  • Architectural debt is affecting delivery materially
  • The current system resists safe change

Stakes

Why this matters

What this playbook protects against, and why skipping or half-running it tends to be expensive.

Rewrites are seductive because the pain is real. They usually fail because of vague success criteria, hidden parity obligations, and optimism about forgotten legacy behavior.

Quality bar

What good looks like

The observable qualities of a team or system that is actually doing this well, not just going through the motions.

Signs of the playbook done well

  • The rewrite has a sharply defined problem statement
  • The first displaced slice is known before broad implementation begins
  • The team can explain what will not be rebuilt
  • The old system is treated as a behavior inventory, not only a code smell
  • Leadership understands cost, overlap, and coexistence risk

Preparation

Before you start

What you need available and true before running the procedure. Skipping this is the most common reason playbooks fail.

Inputs

Material you'll want to gather first.

  • Top maintenance pain points
  • Incident history
  • Dependency map
  • Delivery friction analysis
  • Stakeholder expectations
  • Legacy behavior inventory

Prerequisites

Conditions that should be true for this to work.

  • Honest diagnosis of current pain
  • Explicit rewrite sponsor
  • Access to people who know the old system well
  • Willingness to reject the rewrite if the case is weak

Procedure

The procedure

Each step carries its purpose (why it exists), its actions (what you do), and its outputs (what you produce). Read the purpose. It's what keeps the step from degenerating into checklist theatre.

  1. Name the actual reasons

    Separate real causes from emotional shorthand.

    Actions

    • List the concrete failures of the current system
    • Group them into structure, delivery, operations, and ownership issues
    • Test whether each issue truly requires a rewrite or could be addressed incrementally

    Outputs

    • Rewrite problem statement
    • Pain-to-cause map
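
One way to keep the pain-to-cause map honest is to record it as data rather than prose. The sketch below is illustrative only: the `Pain` record, the `needs_rewrite` flag, and both helper functions are hypothetical names introduced here. Only the four categories come from the grouping in this step.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The four groups named in this step.
    STRUCTURE = "structure"
    DELIVERY = "delivery"
    OPERATIONS = "operations"
    OWNERSHIP = "ownership"


@dataclass(frozen=True)
class Pain:
    description: str     # a concrete failure, not emotional shorthand
    category: Category
    needs_rewrite: bool  # False if it could be addressed incrementally


def pain_to_cause_map(pains):
    """Group pains by category and count how many truly require a rewrite."""
    grouped = defaultdict(list)
    for pain in pains:
        grouped[pain.category].append(pain)
    return {
        category.value: {
            "pains": [p.description for p in items],
            "rewrite_required": sum(p.needs_rewrite for p in items),
        }
        for category, items in grouped.items()
    }


def rewrite_case_strength(pains):
    """Fraction of listed pains that cannot be fixed incrementally."""
    if not pains:
        return 0.0
    return sum(p.needs_rewrite for p in pains) / len(pains)
```

A low `rewrite_case_strength` suggests that most of the pain could be handled incrementally, which is worth knowing before a problem statement is written.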

  2. Inventory what must survive

    Expose the hidden parity burden early.

    Actions

    • Identify critical workflows, edge cases, contracts, and operational dependencies
    • Review legacy incidents for behaviors that matter more than code elegance
    • Capture business behaviors that users assume even if engineers dislike them

    Outputs

    • Must-preserve inventory
    • Legacy behavior map
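
A must-preserve inventory becomes more useful when each entry can be replayed against both systems. The following is a minimal sketch of that idea, assuming the behavior in question can be called as a plain function. `PreservedBehavior`, `parity_gaps`, and the two `*_total` functions are hypothetical and exist only to illustrate the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservedBehavior:
    name: str      # entry in the must-preserve inventory
    inputs: tuple  # recorded arguments, for example from a past incident
    source: str    # where the behavior was discovered


def parity_gaps(inventory, legacy, candidate):
    """Return inventory entries where the candidate disagrees with the legacy system."""
    gaps = []
    for behavior in inventory:
        expected = legacy(*behavior.inputs)
        actual = candidate(*behavior.inputs)
        if expected != actual:
            gaps.append((behavior.name, expected, actual))
    return gaps


# Hypothetical example: the legacy code silently clamps negative quantities to zero.
def legacy_total(quantity, unit_price):
    return max(quantity, 0) * unit_price


def candidate_total(quantity, unit_price):
    return quantity * unit_price  # "cleaner", but drops the legacy clamp


inventory = [
    PreservedBehavior("normal order", (3, 10), "critical workflow"),
    PreservedBehavior("negative quantity clamps to zero", (-2, 10), "legacy incident review"),
]

gaps = parity_gaps(inventory, legacy_total, candidate_total)
```

Here `gaps` contains only the negative-quantity entry: exactly the kind of behavior users assume even if engineers dislike it.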

  3. Define first displacement before full build

    Prevent open-ended replacement efforts.

    Actions

    • Name the first slice that will move and how it will be proven live
    • Define the minimum architecture needed for that slice
    • Write down what is intentionally not in scope

    Outputs

    • First displacement slice
    • Excluded scope list
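
To make "the first slice that will move" concrete, it can help to picture the smallest routing layer that would put that slice live while everything else stays on the old system. The sketch below uses a simple routing facade, a common incremental-migration technique rather than something this playbook prescribes. `SliceRouter` and the slice names are hypothetical.

```python
class SliceRouter:
    """Send explicitly displaced slices to the new system; everything else stays on legacy."""

    def __init__(self, legacy_handler, new_handler):
        self._legacy = legacy_handler
        self._new = new_handler
        self._displaced = set()

    def displace(self, slice_name):
        """Mark a slice as served by the new system."""
        self._displaced.add(slice_name)

    def roll_back(self, slice_name):
        """Return a slice to the legacy system."""
        self._displaced.discard(slice_name)

    def handle(self, slice_name, payload):
        handler = self._new if slice_name in self._displaced else self._legacy
        return handler(slice_name, payload)


# Hypothetical handlers that only report which system served the request.
router = SliceRouter(
    legacy_handler=lambda name, payload: ("legacy", name),
    new_handler=lambda name, payload: ("new", name),
)
router.displace("invoice-export")
```

Anything not passed to `displace` is, by construction, out of scope for the new system, which mirrors the excluded scope list.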

  4. Constrain ambition

    Stop the rewrite becoming a wishlist.

    Actions

    • Separate parity work from improvement work
    • Ban unrelated modernization goals unless they are required by the slice
    • Set explicit criteria for when net-new scope may be added

    Outputs

    • Rewrite guardrails
    • Scope constraints

  5. Approve only with a migration model

    Tie the rewrite to displacement reality.

    Actions

    • Show how the old system shrinks over time
    • Define coexistence and rollback assumptions
    • Agree on success metrics tied to retiring the old system, not to the volume of new code produced

    Outputs

    • Rewrite approval brief
    • Migration model
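
A migration model can be checked for the property this step asks for, namely that the old system visibly shrinks. The sketch below assumes the legacy system can be listed as discrete capabilities, each with a planned retirement milestone. `Capability`, `legacy_surface_by_milestone`, and `unplanned` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capability:
    name: str
    retire_at_milestone: Optional[int]  # None means no retirement plan yet


def legacy_surface_by_milestone(capabilities, milestones):
    """Count legacy capabilities still live after each milestone."""
    return {
        m: sum(
            1
            for c in capabilities
            if c.retire_at_milestone is None or c.retire_at_milestone > m
        )
        for m in range(1, milestones + 1)
    }


def unplanned(capabilities):
    """Capabilities with no retirement milestone; these never leave the old system."""
    return [c.name for c in capabilities if c.retire_at_milestone is None]
```

If the counts stay flat across milestones, or `unplanned` is long, the brief describes construction rather than migration.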

Judgment

Judgment calls and pitfalls

The places where execution actually diverges: decisions that need thought, questions worth asking, and mistakes that recur regardless of good intent.

Decision points

Moments where judgment and trade-offs matter more than procedure.

  • Is this truly a rewrite candidate or a refactor candidate?
  • What are the non-negotiable legacy behaviors?
  • What is the first slice that proves the rewrite is real?
  • What scope is explicitly banned until parity is proven?

Questions worth asking

Prompts to use on yourself, the team, or an AI assistant while running the procedure.

  • What exact pain are we rewriting to solve?
  • What legacy behaviors would break customers if we forgot them?
  • What is the first thing we will actually retire?

Common mistakes

Patterns that surface across teams running this playbook.

  • Starting architecture work before defining the first displacement slice
  • Using the rewrite to also fix every adjacent problem
  • Treating ugly legacy behavior as unimportant because it is hard to defend aesthetically
  • Telling leadership the rewrite will simplify everything quickly

Warning signs you are doing it wrong

Signals that the playbook is being executed but not landing.

  • The rewrite is described as cleanup, modernization, platform reset, and product acceleration all at once
  • The team cannot state what will be turned off first
  • The architecture is getting clearer faster than the migration path
  • Legacy behavior is dismissed with phrases like 'we probably do not need that'

Outcomes

Outcomes and signals

What should exist after the playbook runs, how you'll know it worked, and what to watch for over time.

Artifacts to produce

Durable outputs the playbook should leave behind.

  • Rewrite problem statement
  • Must-preserve inventory
  • First displacement slice plan
  • Rewrite guardrails and scope constraints
  • Rewrite approval brief with migration model

Success signals

Observable changes that mean the playbook landed.

  • The rewrite has a sharply limited early scope
  • Stakeholders understand what will not be rebuilt yet
  • The team can point to explicit displacement milestones
  • The rewrite is framed as migration, not just construction

Follow-up actions

Moves that keep the playbook's effects compounding after it finishes.

  • Review the rewrite case after the first slice rather than locking all assumptions up front
  • Kill or reduce the rewrite if the first displacement proves weaker than expected
  • Refresh the must-preserve inventory after every major incident or discovery

Metrics or signals to watch

Longer-horizon indicators that the underlying problem is receding.

  • Time to first displaced slice
  • Ratio of parity work to net-new ambition
  • Number of legacy behaviors discovered late
  • Dual-run duration
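
These four signals are simple enough to compute from a handful of dates and counts. The sketch below shows one possible calculation. The `RewriteTimeline` fields are hypothetical, "late" is assumed to mean discovered after the first slice went live, and the parity figure is expressed as a share of all work items rather than a raw ratio.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RewriteTimeline:
    approved_on: date
    first_slice_live_on: Optional[date]  # None until a slice is actually displaced
    dual_run_started_on: Optional[date]
    legacy_retired_on: Optional[date]    # for the slice, not the whole system


def days_between(start, end):
    """Whole days from start to end, or None if either date is missing."""
    if start is None or end is None:
        return None
    return (end - start).days


def rewrite_signals(timeline, parity_items, net_new_items, discovery_dates, today):
    total = parity_items + net_new_items
    # Assumption: a behavior counts as "late" if found after the first slice went live.
    late = [
        d
        for d in discovery_dates
        if timeline.first_slice_live_on is not None and d > timeline.first_slice_live_on
    ]
    return {
        "days_to_first_displaced_slice": days_between(
            timeline.approved_on, timeline.first_slice_live_on
        ),
        "parity_share_of_work": parity_items / total if total else None,
        "late_discovered_behaviors": len(late),
        # While the legacy slice is still running, measure the dual run up to today.
        "dual_run_days": days_between(
            timeline.dual_run_started_on, timeline.legacy_retired_on or today
        ),
    }
```

A falling parity share before parity is proven, or a dual run that keeps growing, are the trends to watch rather than any single value.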

AI impact

AI effects on this playbook

How AI-assisted and AI-driven workflows help execution, and the ways they can make it worse.

AI can help with

Where AI tooling genuinely reduces the cost of running this playbook well.

  • Summarizing maintenance pain from historical tickets and incidents
  • Mapping legacy dependencies and hidden contracts
  • Highlighting code hotspots and behavior clusters
  • Drafting parity and cutover checklists

AI can make it worse by

Distortions AI introduces that make the underlying problem harder to see.

  • Making it too easy to generate a large shiny replacement quickly
  • Giving teams false confidence that understanding has caught up with output
  • Hiding ambiguity behind confident rewrite documents

Relationships

Connected playbooks

Failure modes this playbook tends to address, decisions behind the situation, red flags that motivate running it, and neighboring playbooks.