Monolith vs Microservices · thehardparts.dev

Severity if wrong: high
Frequency: very common
Audiences: architects · staff engineers · engineering managers · platform teams
Reversibility: hard
Confidence: high

At a glance · TD-01

Really about: Coordination cost, ownership boundaries, deployment independence, and operational burden.
Not actually about: Codebase size or whether the team is modern.
Why it feels hard: Both sides have strong narratives, and teams often copy admired companies instead of matching their own stage and constraints.

The decision

Should this system remain a monolith or be split into microservices?

Usually a team-shape and operational-maturity decision disguised as an architecture preference.

Default stance

Where to start before any evidence arrives.

Prefer monolith or modular monolith unless clear pressure justifies service distribution.
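One way to make the modular-monolith default concrete is to check module boundaries inside a single deployable. The following is a minimal sketch, not a prescribed method: the module names (`orders`, `billing`, `catalog`), the `ALLOWED_DEPENDENCIES` allow-list, and the `boundary_violations` helper are all hypothetical, and the import pairs are assumed to come from a separate static-analysis step.

```python
# Hypothetical allow-list: which top-level modules may depend on which others.
ALLOWED_DEPENDENCIES = {
    "orders": {"catalog", "billing"},
    "billing": {"catalog"},
    "catalog": set(),
}


def boundary_violations(imports):
    """Return sorted (importer, imported) pairs that are not on the allow-list.

    `imports` is an iterable of (importer_module, imported_module) pairs.
    How those pairs are collected is an assumption left out of this sketch.
    """
    violations = set()
    for importer, imported in imports:
        if importer == imported:
            continue  # imports within one module never cross a boundary
        if imported not in ALLOWED_DEPENDENCIES.get(importer, set()):
            violations.add((importer, imported))
    return sorted(violations)


observed = [
    ("orders", "catalog"),   # allowed
    ("catalog", "orders"),   # not allowed: catalog should not know about orders
    ("billing", "billing"),  # same module, ignored
    ("billing", "orders"),   # not allowed
]
print(boundary_violations(observed))
```

A check like this, run in CI, keeps boundaries visible inside one deployment without taking on the costs of distribution.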

Options on the table

Two poles of the trade-off

Neither is the right answer by default. Each option's conditions, strengths, costs, hidden costs, and failure modes when misused are laid out in parallel so you can read across facets.

Option A

Monolith

Best when

Conditions where this option is a natural fit.

team is small or medium-sized
domain boundaries are still evolving
speed of delivery matters more than deployment independence
operational maturity is limited

Real-world fits

Concrete environments where this option has worked.

an early-stage SaaS product with one main team
an internal business application with tightly connected workflows
a product where most changes still cut across several domains together

Strengths

What this option does well on its own terms.

lower operational overhead
simpler local development
easier refactoring across boundaries
faster early-stage delivery

Costs

What you accept up front to get those strengths.

tighter deployment coupling
harder to scale some parts independently
ownership boundaries can blur

Hidden costs

Costs that surface later than expected — the main thing novices miss.

can become a coordination bottleneck (socially monolithic) before it becomes technically unmaintainable
teams may defer modularity too long

Failure modes when misused

How this option breaks when applied to the wrong context.

Turns into a dense system with weak boundaries and rising coordination pain.

Option B

Microservices

Best when

Conditions where this option is a natural fit.

domain boundaries are real and stable
multiple teams need independent release cadence
platform and observability maturity already exist
operational burden is affordable

Real-world fits

Concrete environments where this option has worked.

a mature platform with several teams owning distinct business capabilities
an organization with clear bounded contexts and separate deployment cadence needs
systems where independent scaling and fault isolation are routine concerns

Strengths

What this option does well on its own terms.

independent deployment
clearer service ownership
better fault isolation in some cases

Costs

What you accept up front to get those strengths.

higher operational complexity
distributed debugging overhead
network and consistency concerns

Hidden costs

Costs that surface later than expected — the main thing novices miss.

team communication often becomes the real bottleneck
service boundaries can calcify prematurely

Failure modes when misused

How this option breaks when applied to the wrong context.

Creates distributed complexity without real organizational or domain benefit.

Cost, time, and reversibility

Who pays, how it ages, and what undoing it costs

Trade-offs are rarely zero-sum and rarely static. Someone pays, the payoff curve shifts with the horizon, and the decision has an undo cost.

Cost bearer

Option A · Monolith

Who absorbs the cost

Application team
Future maintainers

Option B · Microservices

Who absorbs the cost

Platform team
Service-owning teams
Operations

Time horizon

Option A · Monolith

Usually wins early through speed and lower overhead.

Option B · Microservices

Wins later only if boundaries, teams, and platform maturity are genuinely ready.

Reversibility

What undoing costs

Hard

What should force a re-look

Trigger conditions that mean the answer may have changed.

Team count increases substantially
Deployment contention becomes routine
Bounded contexts become clearer
Operational maturity improves

How to decide

The work you still have to do

The reference can frame the trade-off; only you can weight the factors against your context.

Questions to ask

Open these in the room. Answering them is most of the decision.

Are our domain boundaries real, or are we hoping they become real later?
Is release contention actually hurting us today?
Who will own service operations, tracing, and reliability work?
What would become easier if we split, and what would become harder immediately?

Key factors

The variables that actually move the answer.

Team count
Domain stability
Release independence
Platform maturity
Observability maturity
Complexity tolerance

Evidence needed

What to gather before committing. Not after.

Deployment contention data
Dependency/coupling map
Incident and observability maturity assessment
Team topology and ownership model
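Deployment contention data can be summarized from a release log. The sketch below assumes a hypothetical `Release` record in which each release lists the teams whose changes it bundled and whether it was held up by another team's change. Both fields and the two ratios are illustrative proxies, not a standard metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    release_id: str
    teams: frozenset  # teams with changes bundled into this release
    blocked: bool     # True if another team's change held this release up


def contention_summary(releases):
    """Summarize how often releases are shared across teams or blocked."""
    total = len(releases)
    if total == 0:
        return {"releases": 0, "multi_team_share": 0.0, "blocked_share": 0.0}
    multi_team = sum(1 for r in releases if len(r.teams) > 1)
    blocked = sum(1 for r in releases if r.blocked)
    return {
        "releases": total,
        "multi_team_share": multi_team / total,
        "blocked_share": blocked / total,
    }


# Made-up sample data for illustration only.
log = [
    Release("r1", frozenset({"orders"}), blocked=False),
    Release("r2", frozenset({"orders", "billing"}), blocked=True),
    Release("r3", frozenset({"billing"}), blocked=False),
    Release("r4", frozenset({"orders", "catalog"}), blocked=False),
]
print(contention_summary(log))
```

A high multi-team share with a low blocked share suggests that shared releases are not yet hurting; the blocked share is closer to the pain the questions above ask about.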

Signals from the ground

What's usually pushing the call, and what should

First come the pressures to recognize and discount. After them come the signals that genuinely point toward one option or the other.

What's usually pushing the call

Pressures to recognize and discount.

Common bad reasons

Reasoning that feels convincing in the moment but doesn't hold up.

Big companies use microservices
Monolith sounds old-fashioned
Microservices will automatically improve scalability

Anti-patterns

Shapes of reasoning to recognize and set aside.

Splitting services before ownership boundaries exist
Keeping one shared database while pretending services are independent
Using service count as a status signal
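The shared-database anti-pattern can be checked mechanically once you know which services write to which tables. This is a minimal sketch under that assumption: the service and table names are invented, and `shared_write_tables` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict


def shared_write_tables(writes_by_service):
    """Return tables written by more than one service, with their writers.

    `writes_by_service` maps a service name to the set of tables it writes.
    """
    writers = defaultdict(set)
    for service, tables in writes_by_service.items():
        for table in tables:
            writers[table].add(service)
    return {
        table: sorted(services)
        for table, services in writers.items()
        if len(services) > 1
    }


# Invented inventory for illustration only.
inventory = {
    "orders": {"orders", "customers"},
    "billing": {"invoices", "customers"},
    "catalog": {"products"},
}
print(shared_write_tables(inventory))
```

Any table that appears in the result is a place where two nominally independent services still have to coordinate schema changes and deployments.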

What should push the call

Concrete signals that genuinely point to one pole.

For · Monolith

Observations that genuinely point to Option A.

Frequent cross-boundary refactors
Shared delivery focus
Low ops appetite

For · Microservices

Observations that genuinely point to Option B.

Stable domain boundaries
High release contention
Teams already operate services well

AI impact

How AI bends this decision

Where AI accelerates the call, where it introduces new distortions, and anything else worth knowing.

AI can help with

Where AI genuinely reduces the cost of making the call.

AI can help map coupling, dependency hotspots, and candidate boundaries.
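A coupling map of this kind can start from a simple baseline: counting which modules change together in commit history. The sketch below assumes each commit is available as a list of changed file paths and that the first path segment identifies the module. Both are assumptions, and `module_of` and `co_change_counts` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def module_of(path):
    """Treat the first path segment as the module name (an assumption)."""
    return path.split("/", 1)[0]


def co_change_counts(commits):
    """Count how often each pair of modules changes in the same commit."""
    pairs = Counter()
    for changed_paths in commits:
        modules = sorted({module_of(p) for p in changed_paths})
        for a, b in combinations(modules, 2):
            pairs[(a, b)] += 1
    return pairs


# Invented commit history for illustration only.
history = [
    ["orders/api.py", "billing/invoice.py"],
    ["orders/models.py", "billing/tax.py", "catalog/sku.py"],
    ["catalog/sku.py"],
]
counts = co_change_counts(history)
print(counts.most_common())
```

Module pairs with high co-change counts are poor candidates for a service boundary, because splitting them turns every such commit into a coordinated multi-service change.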

AI can make worse

Distortions AI introduces that didn't exist before.

AI can make service extraction look cheaper than it is by accelerating code movement without resolving boundary quality.

Relationships

Connected decisions

Nearby decisions this is sometimes confused with, adjacent decisions that are often entangled with this one, related failure modes, red flags, and playbooks to reach for.

Easy to confuse with

Nearby decisions and how this one differs.

TD-02 Modular Monolith vs Distributed Services

That decision is about how to structure code inside one deployment. This one is about whether to split deployments at all.

TD-06 Shared Database vs Service-Owned Data

That decision is about data ownership between services once they exist. This one is about whether they should exist as independent deployments in the first place.

Adjacent concept · Regular refactoring across boundaries

Refactoring preserves deployment topology. This decision changes it, and makes cross-boundary refactoring fundamentally more expensive afterward.