The Hard Parts.dev
Engineering Playbook · EP-09 · Architecture

Review service boundaries

Review service boundaries by looking at change patterns, ownership reality, dependency shape, and runtime behavior, not just diagrams or intended architecture.

Difficulty
high
Time horizon
days to weeks depending on system size
Primary owner
architect
Confidence
high
At a glance (EP-09)
Situation
You need to evaluate whether current service boundaries are helping or hurting.
Goal
Determine whether current boundaries are reducing complexity, clarifying ownership, and localizing change, or whether they are creating coupling, coordination cost, and fuzzy responsibility.
Do not use when
the system is too new to have meaningful change and incident history
Primary owner
architect
Roles involved

  • architect
  • tech lead
  • service owners
  • staff engineers
  • delivery lead when change cost matters
  • platform or operations partner when runtime behavior matters

Context

The situation

Deciding whether to reach for this playbook: when it fits, and when it doesn't.

Use when

Conditions where this playbook is the right tool.

  • Simple changes require coordination across too many services or teams
  • Ownership is unclear at service edges
  • Integration incidents are common
  • Teams debate whether a boundary is right but lack evidence
  • A monolith split or service consolidation is being considered

Stakes

Why this matters

What this playbook protects against, and why skipping or half-running it tends to be expensive.

Bad service boundaries create invisible tax: broader changes, slower delivery, more coordination, hidden duplication, and weaker accountability. Good boundaries reduce cognitive load and make failure and change more local.

Quality bar

What good looks like

The observable qualities of a team or system that is actually doing this well. Not just going through the motions.

Signs of the playbook done well

  • The team can explain what each service owns in business terms
  • Common changes stay local more often than not
  • Consumers depend on explicit contracts rather than accidental behavior
  • Ownership and operational accountability align with the service boundary
  • Boundary changes are justified by evidence, not fashion

Preparation

Before you start

What you need available and true before running the procedure. Skipping this is the most common reason playbooks fail.

Inputs

Material you'll want to gather first.

  • Service inventory
  • Change history across services
  • Incident and dependency history
  • Team ownership map
  • API or event contracts
  • Architecture diagrams if available
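The inputs above are easier to compare across teams when the service inventory is machine-readable. A minimal sketch of one inventory record follows; the field names and shape are illustrative, not a required schema.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One row of the service inventory. Fields are illustrative."""

    name: str
    purpose: str  # one sentence of business behavior, not tech stack
    owner_team: str
    owns: list[str] = field(default_factory=list)          # explicit responsibilities
    must_not_own: list[str] = field(default_factory=list)  # responsibilities it should refuse
    consumers: list[str] = field(default_factory=list)     # downstream services or teams
    contracts: list[str] = field(default_factory=list)     # API or event contracts it publishes
```

Keeping records like this in the repo alongside the service makes step 1 of the procedure a diff review rather than an interview exercise.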

Prerequisites

Conditions that should be true for this to work.

  • You can identify current service owners and consumers
  • You have access to recent change and incident history
  • The review is allowed to challenge the current boundary model honestly

Procedure

The procedure

Each step carries its purpose (why it exists), its actions (what you do), and its outputs (what you produce). Read the purpose. It's what keeps the step from degenerating into checklist theatre.

  1. State what each service is supposed to own

    Compare intended boundaries with real ones.

    Actions

    • Describe each service in one sentence using business behavior, not tech stack
    • List what it explicitly owns and what it should not own
    • Identify where different teams describe the same service differently

    Outputs

    • Service purpose inventory
  2. Look at change fan-out and coordination cost

    Use actual change behavior as evidence.

    Actions

    • Review common changes from the last few months
    • Measure how many services and teams a typical change crosses
    • Identify recurring cross-boundary edits that look structural, not incidental

    Outputs

    • Change fan-out map
  3. Inspect runtime and contract coupling

    Find where boundaries look clean in diagrams but not in operation.

    Actions

    • Review dependencies, shared schemas, hidden data assumptions, and retry or failure patterns
    • Identify contracts that are implicit, brittle, or operationally expensive
    • Check whether services depend on implementation detail rather than declared interface

    Outputs

    • Coupling and contract assessment
  4. Check ownership fit

    Make sure service boundaries support real accountability.

    Actions

    • Ask who owns roadmap, incidents, operability, and compatibility for each service
    • Identify boundaries where responsibility and authority do not match
    • Note where a service looks shared but is really owned through heroics or hidden escalation

    Outputs

    • Ownership fit review
  5. Recommend targeted boundary moves

    Avoid vague conclusions like "re-architect more."

    Actions

    • Name which boundaries should stay, shift, merge, split, or be made more explicit
    • Tie each recommendation to change locality, contract health, or ownership clarity
    • Sequence the changes in small, evidence-based steps

    Outputs

    • Boundary recommendation set
    • Next-step architecture plan
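The fan-out measurement in step 2 can be sketched as a small script over version-control history. The sketch below assumes a monorepo where each service lives under `services/<name>/`; the path layout and function names are illustrative, and a multi-repo setup would need a different mapping from files to services.

```python
from collections import Counter
from statistics import median


def services_touched(paths, prefix="services/"):
    """Map one change's modified file paths to the set of services it touched.

    Assumes a monorepo layout of services/<name>/... (illustrative).
    """
    return {
        p.split("/")[1]
        for p in paths
        if p.startswith(prefix) and p.count("/") >= 2
    }


def fan_out_map(changes):
    """changes: one list of modified file paths per change (commit or PR).

    Returns the per-change fan-out widths and their median, the raw
    material for step 2's change fan-out map.
    """
    widths = [len(services_touched(paths)) for paths in changes]
    return widths, (median(widths) if widths else 0)
```

One way to feed it is the per-commit file list from `git show --name-only --pretty=format: <sha>`; a histogram of the widths (e.g. `Counter(widths)`) shows how often a "simple" change actually crosses boundaries.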

Judgment

Judgment calls and pitfalls

The places where execution actually diverges: decisions that need thought, questions worth asking, and mistakes that recur regardless of good intent.

Decision points

Moments where judgment and trade-offs matter more than procedure.

  • Is the problem really the service boundary, or the contract and ownership around it?
  • Should two services merge, or should their interface become cleaner?
  • Would a modular monolith boundary serve this responsibility better?
  • Which changes are worth making now versus watching longer?

Questions worth asking

Prompts to use on yourself, the team, or an AI assistant while running the procedure.

  • What common changes cross these boundaries today?
  • Which service owns this business behavior end to end?
  • Are we paying more in coordination than we gain in separation?

Common mistakes

Patterns that surface across teams running this playbook.

  • Reviewing boundaries from diagrams only
  • Treating all cross-service traffic as proof boundaries are wrong
  • Optimizing for theoretical purity over real ownership and change cost
  • Using service count as a maturity proxy
  • Deciding to split or merge without looking at change history

Warning signs you are doing it wrong

Signals that the playbook is being executed but not landing.

  • The review produces generic statements like "reduce coupling" without naming where
  • Teams still cannot explain who owns what after the review
  • The answer is "microservices are bad" or "monoliths are bad" instead of context-specific
  • Common changes still cross the same boundaries but the review calls them edge cases

Outcomes

Outcomes and signals

What should exist after the playbook runs, how you'll know it worked, and what to watch for over time.

Artifacts to produce

Durable outputs the playbook should leave behind.

  • Service purpose inventory
  • Change fan-out map
  • Coupling and contract assessment
  • Ownership fit review
  • Boundary recommendation set

Success signals

Observable changes that mean the playbook landed.

  • Future changes become more local
  • Service purpose descriptions become sharper and more consistent
  • Ownership and on-call routing become clearer
  • Teams stop rediscovering the same contract and boundary problems

Follow-up actions

Moves that keep the playbook's effects compounding after it finishes.

  • Turn repeated cross-boundary friction into explicit migration or contract work
  • Review high-friction boundaries again after a few release cycles
  • Update service catalogs, onboarding docs, and dependency maps with the clearer boundary model

Metrics or signals to watch

Longer-horizon indicators that the underlying problem is receding.

  • Median number of services touched per common change
  • Cross-team coordination count per feature
  • Contract-related incident rate
  • Service ownership ambiguity events
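The first two indicators above can be computed from the same change records gathered during the procedure, given a service-to-team map. A minimal sketch, assuming each change is recorded as the set of services it touched; the metric names and shapes are illustrative.

```python
from statistics import median


def coordination_count(services, team_of):
    """Number of distinct teams a change pulls in, via a service -> team map.

    Services missing from the map are skipped rather than guessed.
    """
    return len({team_of[s] for s in services if s in team_of})


def boundary_metrics(changes, team_of):
    """changes: one set of touched services per change (must be non-empty).

    Returns the headline boundary-health indicators: median services
    touched per change, and the worst-case team coordination a single
    change required.
    """
    return {
        "median_services_per_change": median(len(c) for c in changes),
        "max_teams_per_change": max(coordination_count(c, team_of) for c in changes),
    }
```

Tracking these per release cycle, rather than as a one-off, is what shows whether a boundary move actually made change more local.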

AI impact

AI effects on this playbook

How AI-assisted and AI-driven workflows help execution, and the ways they can make it worse.

AI can help with

Where AI tooling genuinely reduces the cost of running this playbook well.

  • Summarizing change histories across repos or services
  • Mapping dependency and call patterns from code and telemetry
  • Drafting first-pass service inventories and interface summaries
  • Finding likely hidden couplings in configs, schemas, and logs

AI can make worse by

Distortions AI introduces that make the underlying problem harder to see.

  • Making weak boundaries look coherent through elegant summaries
  • Encouraging large speculative service redesigns before evidence is solid
  • Confusing generated architecture rationales with actual system understanding

Relationships

Connected playbooks

Failure modes this playbook tends to address, decisions behind the situation, red flags that motivate running it, and neighboring playbooks.