SQL vs NoSQL · thehardparts.dev

Severity if wrong: high
Frequency: very common
Audiences: architects · backend engineers · data-heavy product teams
Reversibility: hard
Confidence: high

At a glance · TD-04

Really about: Access patterns, relational complexity, consistency needs, and operational familiarity.
Not actually about: Which datastore category is more modern or web-scale by default.
Why it feels hard: SQL feels structured; NoSQL feels flexible until querying, consistency, and governance mature into pain.

The decision

Should this data model live in a relational or non-relational store?

Usually a decision about query shape and consistency, not one to settle with scalability slogans.

Default stance

Where to start before any evidence arrives.

Prefer SQL unless access patterns and domain shape clearly favor non-relational storage.

Options on the table

Two poles of the trade-off

Neither is the right answer by default. Each option's conditions, strengths, costs, hidden costs, and failure modes when misused are laid out in parallel so you can read across facets.

Option A

SQL

Best when

Conditions where this option is a natural fit.

relationships matter
querying is important
consistency matters strongly
schema discipline is valuable

Real-world fits

Concrete environments where this option has worked.

financial systems
business applications with reporting needs
domains with rich joins and transactional integrity requirements

Strengths

What this option does well on its own terms.

rich querying
transactional integrity
mature tooling
clear structure
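
A minimal sketch of what "rich querying" and "transactional integrity" look like in practice, using Python's built-in sqlite3 module. The accounts and transfers tables and the transfer helper are hypothetical, invented for illustration; a production system would use its own schema and database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts
    );
    CREATE TABLE transfers (
        id      INTEGER PRIMARY KEY,
        from_id INTEGER NOT NULL REFERENCES accounts(id),
        to_id   INTEGER NOT NULL REFERENCES accounts(id),
        amount  INTEGER NOT NULL
    );
    INSERT INTO accounts (id, owner, balance) VALUES (1, 'alice', 100), (2, 'bob', 50);
""")


def transfer(conn, from_id, to_id, amount):
    """Hypothetical helper: move funds atomically. True on commit, False on rollback."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            # If this debit violates the CHECK constraint, the credit above is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            conn.execute(
                "INSERT INTO transfers (from_id, to_id, amount) VALUES (?, ?, ?)",
                (from_id, to_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = transfer(conn, 1, 2, 30)    # fits within alice's balance
second = transfer(conn, 1, 2, 500)  # would overdraw alice, so nothing is applied

balances = conn.execute(
    "SELECT owner, balance FROM accounts ORDER BY id"
).fetchall()

# An ad hoc question nobody planned for: how much has each owner sent?
sent = conn.execute("""
    SELECT a.owner, COALESCE(SUM(t.amount), 0) AS total_sent
    FROM accounts a
    LEFT JOIN transfers t ON t.from_id = a.id
    GROUP BY a.id, a.owner
    ORDER BY a.id
""").fetchall()
```

The failed transfer leaves both balances untouched, and the reporting query needed no change to the stored data. Those two properties are what this option is buying.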

Costs

What you accept up front to get those strengths.

schema evolution may feel slower
horizontal scaling can be more complex in some cases

Hidden costs

Costs that surface later than expected — the main thing novices miss.

bad relational modeling can create performance pain
teams may misuse SQL as an excuse to delay domain design
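
One common form of that performance pain is a foreign-key lookup with no supporting index. The sketch below uses sqlite3 and a hypothetical customers/orders schema to compare the query plan before and after adding an index. The exact plan wording varies by SQLite version, so treat the output as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       INTEGER NOT NULL
    );
""")

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(conn, sql, params):
    """Return SQLite's query plan as one string (the detail column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an index on customer_id, SQLite has to scan the whole orders table.
plan_before = plan(conn, QUERY, (1,))

# Index name is an illustrative choice.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index, the plan switches to an index search.
plan_after = plan(conn, QUERY, (1,))

print(plan_before)
print(plan_after)
```

The schema itself did not change between the two plans. The cost was hidden in a physical design detail that nobody revisited, which is typical of how this class of problem shows up.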

Failure modes when misused

How this option breaks when applied to the wrong context.

Creates dense relational coupling and poor performance through weak modeling.

Option B

NoSQL

Best when

Conditions where this option is a natural fit.

access patterns are simple and well-known
document or key-value shape matches domain reality
scalability patterns favor denormalized models
query flexibility is less important

Real-world fits

Concrete environments where this option has worked.

document-centric content systems
high-scale key-value workloads
event/session-like data with simple primary access paths
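
A sketch of what a "simple primary access path" means for session-like data: every read and write goes through one key, and nothing ever asks a question across records. SessionStore is a hypothetical in-memory stand-in, not the API of any real key-value product; the now parameter exists only to make the example deterministic.

```python
import time


class SessionStore:
    """Toy key-value store: every operation is addressed by session_id."""

    def __init__(self):
        self._items = {}  # session_id -> (expires_at, data)

    def put(self, session_id, data, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._items[session_id] = (now + ttl_seconds, data)

    def get(self, session_id, now=None):
        now = time.time() if now is None else now
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if now >= expires_at:
            # Expired sessions are dropped lazily on read.
            del self._items[session_id]
            return None
        return data


store = SessionStore()
store.put("s1", {"user": "alice"}, ttl_seconds=60, now=1000.0)

hit = store.get("s1", now=1030.0)        # within the TTL
expired = store.get("s1", now=1060.0)    # TTL elapsed
missing = store.get("nope", now=1000.0)  # never stored
```

If the whole interface for a dataset can honestly be written as put and get by one key, the workload fits this option. The moment someone needs "all sessions for users in region X", the access path is no longer simple.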

Strengths

What this option does well on its own terms.

schema flexibility
simple scaling models in some workloads
good fit for certain document-centric domains

Costs

What you accept up front to get those strengths.

weaker ad hoc querying
denormalization overhead
consistency and duplication trade-offs

Hidden costs

Costs that surface later than expected — the main thing novices miss.

teams often recreate relational needs later
read model convenience can become write model complexity
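
The second hidden cost can be sketched with plain dictionaries standing in for a document store. The orders and customers collections and both helper functions are hypothetical. Embedding the customer name in each order makes reads trivial, but a rename must then find and rewrite every copy.

```python
# Denormalized "documents": each order carries its own copy of the customer name.
orders = {
    "o1": {"customer_id": "c1", "customer_name": "Acme Ltd", "total": 120},
    "o2": {"customer_id": "c1", "customer_name": "Acme Ltd", "total": 80},
    "o3": {"customer_id": "c2", "customer_name": "Globex", "total": 45},
}
customers = {"c1": {"name": "Acme Ltd"}, "c2": {"name": "Globex"}}


def read_order(order_id):
    """Read side: one lookup, no join, everything needed is already embedded."""
    return orders[order_id]


def rename_customer(customer_id, new_name):
    """Write side: the source record plus every duplicated copy must change."""
    customers[customer_id]["name"] = new_name
    touched = 0
    for order in orders.values():
        if order["customer_id"] == customer_id:
            order["customer_name"] = new_name
            touched += 1
    return touched


touched = rename_customer("c1", "Acme Corp")
after = read_order("o1")["customer_name"]
untouched = read_order("o3")["customer_name"]
```

In this toy version the loop runs in one process and cannot be interrupted. In a real store without multi-document transactions, a failure partway through would leave some orders with the old name and some with the new one, which is the consistency and duplication trade-off listed under Costs.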

Failure modes when misused

How this option breaks when applied to the wrong context.

Creates flexible storage with rigid downstream reporting and consistency pain.

Cost, time, and reversibility

Who pays, how it ages, and what undoing it costs

Trade-offs are rarely zero-sum and rarely static. Someone pays, the payoff curve shifts with the horizon, and the decision has an undo cost.

Cost bearer

Option A · SQL

Who absorbs the cost

Backend team
DBA/data engineers

Option B · NoSQL

Who absorbs the cost

Application teams
Future reporting/analytics teams

Time horizon

Option A · SQL

Often wins long-term when query flexibility and reporting become unavoidable.

Option B · NoSQL

Wins when the access model stays narrow and the workload truly fits it.

Reversibility

What undoing costs

Hard

What should force a re-look

Trigger conditions that mean the answer may have changed.

Query patterns change materially
Scale characteristics shift
Reporting requirements grow

How to decide

The work you still have to do

The reference can frame the trade-off; only you can weight the factors against your context.

Questions to ask

Open these in the room. Answering them is most of the decision.

What are the real access patterns, not the imagined ones?
Will reporting and ad hoc queries matter later?
What inconsistency can the domain tolerate?
Are we denormalizing because it helps, or because we are avoiding schema thinking?

Key factors

The variables that actually move the answer.

Query complexity
Relationship richness
Consistency needs
Scale profile
Team familiarity

Evidence needed

What to gather before committing. Not after.

Access pattern inventory
Query/reporting needs analysis
Consistency requirement mapping
Growth and scale projections
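
An access pattern inventory can be as simple as a typed list that the team fills in together. The sketch below is one possible shape, not a prescribed method: the AccessPattern fields, the summarize function, and all pattern names and call volumes are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPattern:
    name: str
    lookup_by_single_key: bool
    needs_join_or_aggregate: bool
    needs_multi_record_transaction: bool
    calls_per_day: int  # rough estimate, replace with measured numbers when available


def summarize(patterns):
    """Split the inventory into patterns that lean relational and the rest."""
    total = sum(p.calls_per_day for p in patterns)
    relational = [
        p for p in patterns
        if p.needs_join_or_aggregate or p.needs_multi_record_transaction
    ]
    relational_calls = sum(p.calls_per_day for p in relational)
    return {
        "total_calls_per_day": total,
        "relational_patterns": [p.name for p in relational],
        "relational_share_of_calls": relational_calls / total if total else 0.0,
    }


# Placeholder inventory for illustration only.
inventory = [
    AccessPattern("get_session", True, False, False, 900_000),
    AccessPattern("place_order", False, False, True, 50_000),
    AccessPattern("monthly_revenue_report", False, True, False, 50),
]

summary = summarize(inventory)
print(summary)
```

Call volume is only one weight. A report that runs fifty times a day can still decide the outcome if the business cannot operate without it, so read the list of relational patterns alongside the share of calls rather than relying on the ratio alone.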

Signals from the ground

What's usually pushing the call, and what should

First, pressures to recognize and discount. Then, signals that genuinely point toward one option or the other.

What's usually pushing the call

Pressures to recognize and discount.

Common bad reasons

Reasoning that feels convincing in the moment but doesn't hold up.

NoSQL scales infinitely
SQL is old-school
We do not know the schema yet, so there should be no schema

Anti-patterns

Shapes of reasoning to recognize and set aside.

Choosing NoSQL because of anticipated scale without query analysis
Using SQL for strongly document-shaped access patterns while rebuilding denormalized views everywhere

What should push the call

Concrete signals that genuinely point to one pole.

For · SQL

Observations that genuinely point to Option A.

Rich queries
Joins matter
Reporting matters

For · NoSQL

Observations that genuinely point to Option B.

Document-shaped domain
Simple known access paths
Denormalization is acceptable

AI impact

How AI bends this decision

Where AI accelerates the call, where it introduces new distortions, and anything else worth knowing.

AI can help with

Where AI genuinely reduces the cost of making the call.

AI can analyze access patterns and generate candidate schema/query comparisons.

AI can make worse

Distortions AI introduces that didn't exist before.

AI can suggest trendy datastore choices detached from real query and consistency needs.

Relationships

Connected decisions

Nearby decisions this is sometimes confused with, adjacent decisions that are often entangled with this one, related failure modes, red flags, and playbooks to reach for.

Easy to confuse with

Nearby decisions and how this one differs.

TD-06 Shared Database vs Service-Owned Data

That decision is about who owns the data. This one is about how the data is modeled and queried regardless of ownership.

TD-07 Strong Consistency vs Eventual Consistency

That decision is about guarantees on reads. This one is about the store underneath, which may or may not constrain those guarantees.

Adjacent concept · A hosting-provider decision

Choosing RDS vs DynamoDB vs MongoDB Atlas is about operations. This decision is about the data model, which constrains the set of viable providers but is not the same choice.