The Hard Parts.dev
RF-06 Code · Delivery RF Red Flags

Tests are hard to write for normal changes

Ordinary work feels harder to test than it should, even when the change itself is not unusual.

Severity
high
Frequency
very common
First noticed by
developers · tech leads · QA
Detectability
visible-if-you-look
Confidence
high
At a glance · RF-06
Where you see this

legacy monoliths · controller-heavy service code · systems with hidden external dependencies

Not necessarily a problem when
you are testing an unusually integration-heavy legacy surface during an explicitly temporary transition
Often mistaken for
testing is just always expensive in serious systems
Time horizon
near-term
Best placed to act

tech lead · module owner

The signal

What you would actually notice

Routine changes demand test effort out of proportion to their size: the setup dwarfs the change, or there is no clean way to isolate the code under test.

Field observation

Developers avoid adding tests, require huge setup, or rely on manual verification for routine changes.

Also observed

  • "This is a small fix, but I cannot write a sane test for it."
  • "We will validate this manually in staging."
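The "huge setup for a small fix" complaint can be sketched concretely. A minimal hypothetical sketch, in which the billing URL, the REGION variable, and every function name are invented for illustration: the change under test is one line, but reaching it requires patching two process-wide globals.

```python
# Hypothetical sketch of the symptom: the change is one line, but the test
# must fake both the network layer and the process environment to reach it.
import io
import os
import urllib.request


def apply_discount(order_id: str) -> float:
    # Hidden external dependency: a network call baked into the rule.
    raw = urllib.request.urlopen(f"https://billing.example/orders/{order_id}").read()
    total = float(raw.decode())
    # Environment-dependent behavior: the branch the "small fix" changed.
    rate = 0.10 if os.environ.get("REGION") == "EU" else 0.05
    return total * (1 - rate)


def test_eu_discount():
    # Setup is larger than the change, and it mutates two globals.
    os.environ["REGION"] = "EU"
    original = urllib.request.urlopen
    urllib.request.urlopen = lambda url: io.BytesIO(b"100.0")
    try:
        assert round(apply_discount("42"), 2) == 90.0
    finally:
        urllib.request.urlopen = original
        del os.environ["REGION"]
```

Nothing here is exotic; the pain comes entirely from the function's structure, not from the test framework.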

Primary reading

What it usually indicates

Most likely underlying patterns when this signal shows up. Not a diagnosis, a starting hypothesis.


  • tight coupling
  • side-effect-heavy design
  • missing seams
  • environment-dependent behavior
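What a missing seam looks like once it is added can be sketched in a few lines. A minimal hypothetical sketch (discount rule, region parameter, and rates all invented): the rule becomes a pure function and the environment is an explicit argument, so the test is proportional to the change.

```python
# A minimal sketch of adding the missing seam: environment-dependent
# behavior becomes an explicit parameter, and the rule becomes pure.
def discount_rate(region: str) -> float:
    # The branch under change, now driven by an argument, not os.environ.
    return 0.10 if region == "EU" else 0.05


def apply_discount(total: float, region: str) -> float:
    # Pure: no globals, no I/O, no hidden runtime assumptions.
    return total * (1 - discount_rate(region))


# The test for a routine rule change is now a one-liner per case.
assert round(apply_discount(100.0, "EU"), 2) == 90.0
assert round(apply_discount(100.0, "US"), 2) == 95.0
```

The seam is the boundary where the caller supplies the total and the region; everything outside it (fetching, configuration) can be tested, or faked, separately.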

Stakes

Why it matters

Poor testability is usually a proxy for poor design, excessive coupling, or hidden runtime assumptions.

Inspection

What to check next

Deliberate steps to confirm or disconfirm the primary reading above. Not a checklist. An order of inspection.

  1. test setup complexity
  2. dependency injection or seam quality
  3. external dependency map
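Step 3 can start mechanically. One rough, hypothetical way to build a first-pass external dependency map is to scan a module's imports for packages that imply I/O or environment access; the `IO_HINTS` set below is an illustrative sample, not a complete list.

```python
# Hypothetical aid for the "external dependency map" step: scan a module's
# source for imports that imply I/O, then eyeball where they leak into logic.
import ast

IO_HINTS = {"os", "socket", "requests", "urllib", "subprocess", "sqlite3"}


def external_dependencies(source: str) -> set[str]:
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & IO_HINTS


# Example: "business logic" that quietly imports three I/O layers.
src = "import os\nimport urllib.request\nfrom sqlite3 import connect\nimport math\n"
assert external_dependencies(src) == {"os", "urllib", "sqlite3"}
```

A long list from a module that is supposed to hold pure domain logic is exactly the disconfirming evidence the inspection order is looking for.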

Diagnostic questions

Questions to ask the team, or yourself, before concluding anything.

  1. What makes this change hard to test?
  2. Is the pain from tooling, or from structure?
  3. Where are the missing seams?

Progression

Under the signal

Where this pattern tends to come from, what's holding it up, and where it goes if nothing changes.

Leading indicators

What tends to show up first.

  • test setup is larger than the change
  • developers skip tests for speed
  • the team depends heavily on QA or staging confidence

Common root causes

What is usually sitting under the signal.

  • bad modularity
  • global state
  • heavy side effects
  • unclear boundaries

Likely consequences

What happens if nothing changes.

  • low confidence
  • manual regression burden
  • slower delivery

Look-alikes

Not what it looks like

Patterns that can be mistaken for this signal, and 'fix' attempts that make it worse.

False friends

Things the signal is often confused with, but isn't.
  • testing is just always expensive in serious systems

Anti-patterns when responding

Responses that feel sensible and usually make the underlying pattern worse.

  • accepting low testability as normal for the codebase
  • adding only high-level tests to compensate for poor design

Context

Context and ownership

Where this signal surfaces, who sees it first, who can actually act, and how much runway there usually is before escalation.

Common contexts

Where it shows up

  • legacy monoliths
  • controller-heavy service code
  • systems with hidden external dependencies

Most likely to notice

Who sees it first

Before it escalates.

  • developers
  • tech leads
  • QA

Best placed to act

Who can move on it

Not always the same as who notices it.

  • tech lead
  • module owner

Time horizon

near-term

How much runway there usually is before the signal hardens into the underlying pattern.

AI impact

AI effects on this signal

How AI-assisted and AI-driven workflows tend to amplify or hide this signal.

AI amplifies

Ways AI tooling tends to make this signal louder or more common.

  • AI can generate tests that compile but do not reduce the real pain of a poorly testable structure.

AI masks

Ways AI tooling tends to hide this signal, so it keeps growing under the surface.

  • Test count rises while true confidence does not.
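The gap between test count and true confidence has a recognizable shape. A hypothetical sketch (the gateway and `charge` function are invented): a generated test that mocks every collaborator ends up asserting the mock's own setup rather than any behavior of the system.

```python
# Hypothetical sketch of a test that raises the count but not confidence:
# every collaborator is mocked, so the assertion restates the mock's setup.
from unittest.mock import MagicMock


def charge(gateway, amount):
    return gateway.charge(amount)


def test_charge():
    gateway = MagicMock()
    gateway.charge.return_value = "ok"
    # Passes forever, regardless of what any real gateway does.
    assert charge(gateway, 100) == "ok"
```

Suites padded with tests like this are how the signal keeps growing under a green build.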

Relationships

Connected signals

Related failure modes, decisions behind the signal, response playbooks, and neighboring red flags.