Red Flags
Early-warning signals across code, teams, process, leadership, and AI-enabled work: what you notice, what it usually means, and what to check next.
Entries: 42
Classes: 5 (code, team, process, leadership, AI)
Every entry is a named signal, not a diagnosis. Each one opens with what you would actually notice, then walks through what it usually indicates and what to check next. Scanning for a pattern you've half-seen in your own work should land you on the right entry in under a minute.
| Layer ↓ Signal type → | Structural | Behavioral | Delivery | Communication | Architectural | Operational | AI quality | Total |
|---|---|---|---|---|---|---|---|---|
| Code | 3 | | 2 | | 4 | | 1 | 10 |
| Team | | 5 | | 2 | | | 1 | 8 |
| Process | | 1 | 6 | 1 | | 2 | | 10 |
| Leadership | | 5 | 1 | 1 | | | | 7 |
| AI | | | | | | | 7 | 7 |
| Total | 3 | 11 | 9 | 4 | 4 | 2 | 9 | 42 |
Severity key
- low
- medium
- medium-high
- high
- critical
Frequency
How often this red flag actually shows up across teams, from occasional cameos to chronic patterns.
- rare
- occasional
- common
- very common
- universal
- increasing (not a point on the scale but a trend: flags signals whose prevalence is rising, often AI-era)
Detectability
How easy the signal is to miss: from obvious if you look once, to quietly normalized by the organization.
- obvious
- visible-if-you-look
- subtle
- easy-to-normalize
Confidence
How sure we are this reading of the signal holds up across contexts: provisional vs. repeatedly observed.
- low
- medium
- medium-high
- high
Code (10 signals)
Nobody can explain this module simply
A module performs important work, but nobody can describe its purpose in plain language without hand-waving.
One file does too much
A single file accumulates unrelated responsibilities and becomes a local gravity well.
Changes always touch too many places
Even ordinary changes require edits across many files, layers, or services.
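One way to check this signal is to measure co-change coupling: how often pairs of files land in the same commit. A minimal sketch, assuming you have already extracted per-commit file lists (the commit data below is illustrative; in practice you would feed it from `git log --name-only`):

```python
from collections import Counter
from itertools import combinations

def cochange_pairs(commits):
    """Count how often each pair of files changes in the same commit."""
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Illustrative history: each entry is the set of files one commit touched.
commits = [
    {"api/orders.py", "db/orders.sql", "ui/orders.tsx"},
    {"api/orders.py", "db/orders.sql"},
    {"api/orders.py", "db/orders.sql", "ui/orders.tsx", "jobs/sync.py"},
    {"ui/orders.tsx"},
]

for (a, b), n in cochange_pairs(commits).most_common(3):
    print(f"{n}x  {a} <-> {b}")
```

Pairs that co-change far more often than they change alone are candidates for a missing abstraction or a misplaced boundary.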
Naming is generic where understanding is weak
As conceptual clarity drops, code and design names become broader and more vague.
Business logic leaks across layers
Rules and decisions show up in controllers, mappers, jobs, UI, SQL, and integrations instead of one clear home.
Tests are hard to write for normal changes
Ordinary work feels harder to test than it should, even when the change itself is not unusual.
End-to-end tests carry all the confidence
The team relies mainly on slow, broad tests because lower-level confidence is weak or absent.
Shared utility layer grows faster than products
The common or shared layer expands aggressively while the product and domain code it supposedly serves stays comparatively unstable or unclear.
Integration contracts are implicit
Systems depend on each other through assumptions that are not clearly documented, versioned, or enforced.
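A useful next step is to make one contract explicit and enforce it at the boundary. A minimal stdlib-only sketch (the field names and the `check_order` helper are hypothetical; a real system would use JSON Schema, protobuf, or similar, with versioning):

```python
# A tiny explicit contract: required fields and their types, versioned.
ORDER_CONTRACT_V1 = {
    "order_id": str,
    "amount_cents": int,
    "currency": str,
}

def check_order(payload: dict) -> list[str]:
    """Return a list of contract violations (empty list = payload conforms)."""
    errors = []
    for field, expected in ORDER_CONTRACT_V1.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"{field}: expected {expected.__name__}, "
                          f"got {type(payload[field]).__name__}")
    return errors

print(check_order({"order_id": "A-1", "amount_cents": "199", "currency": "EUR"}))
# → ['amount_cents: expected int, got str']
```

Even a check this crude turns a silent assumption into a named, versioned artifact that both sides can see break.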
Generated code is merged without deep review
Code enters production mainly because it looks plausible, not because reviewers truly understand it.
Team (8 signals)
Everyone asks the same person
One person becomes the default source of truth, escalation path, or decision gateway for too many important areas.
Critical knowledge lives in chat and memory
Important operational or architectural knowledge exists mainly in people's heads or scattered chat history.
People avoid touching certain areas
Parts of the system become socially dangerous, so engineers avoid them unless forced.
PRs are approved faster than they are understood
Review speed outruns review depth, so approvals become a workflow ritual rather than a quality mechanism.
Ownership is claimed but not visible
People say an area is owned, but ownership is not observable in decisions, maintenance, documentation, or response patterns.
No one disagrees in meetings, everyone complains later
Visible consensus in formal settings hides real disagreement that emerges only afterward in side channels.
The loudest person wins architecture discussions
Architecture outcomes follow assertiveness, seniority, or force of personality more than evidence and context.
AI use is widespread but norms are unclear
People use AI heavily, but the team lacks shared rules about where it is safe, expected, or dangerous.
Process (10 signals)
Work enters faster than it leaves
Incoming work volume consistently outpaces completion, so queues, context switching, and churn grow silently.
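The arithmetic behind this signal is unforgiving. A sketch with made-up weekly numbers: even a modest imbalance between arrivals and completions grows the backlog without bound, and by Little's law the average time an item waits is roughly backlog divided by throughput.

```python
def backlog_over_time(start, arrivals_per_week, done_per_week, weeks):
    """Project backlog size week by week; it never goes below zero."""
    backlog, history = start, []
    for _ in range(weeks):
        backlog = max(0, backlog + arrivals_per_week - done_per_week)
        history.append(backlog)
    return history

# A 10% imbalance: 11 items arrive, 10 get done, every week.
history = backlog_over_time(start=20, arrivals_per_week=11, done_per_week=10, weeks=12)
print(history)  # backlog climbs from 21 to 32 in a quarter
print(f"avg wait ≈ {history[-1] / 10:.1f} weeks")  # Little's law: backlog / throughput
```

The point of the sketch is that the growth is structural, not a matter of effort: nothing short of reducing intake or raising throughput changes the trend.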
Everything is urgent
Priority loses meaning because too many items are treated as immediate and exceptional at the same time.
Scope changes without decision records
Meaningful scope shifts happen, but nobody captures who decided, why, or what trade-off was accepted.
Dates are fixed but trade-offs are implicit
A delivery date is treated as immovable, but the corresponding trade-offs in scope, quality, or risk are not stated openly.
Tickets substitute for thinking
The organization mistakes ticket flow for real design clarity, prioritization, and problem understanding.
Metrics are visible but not trusted
Dashboards, KPIs, and delivery metrics exist, but people do not believe they reflect reality well enough to act on them confidently.
Retrospectives repeat the same outputs
Teams keep naming the same problems in retros, but little structural change follows.
Release confidence depends on luck and timing
Teams ship when the stars align rather than because the release process gives genuine confidence.
Dependencies are discovered late every cycle
Important cross-team or cross-system dependencies are found after work is already underway, not during planning or framing.
Teams cannot explain what done means
Completion criteria are vague enough that teams, stakeholders, and reviewers mean different things by 'done.'
Leadership (7 signals)
Reporting looks healthier than delivery feels
Dashboards, status updates, and leadership narratives stay calm and positive while the teams doing the work experience far more fragility and risk.
Teams are measured on output, not outcome
Success is tracked through activity volume, ticket throughput, or artifact production more than through user, business, or system outcomes.
Ownership and authority do not match
Teams or individuals are held accountable for outcomes they do not have enough control to influence.
Platform work has no users, only sponsors
Internal platform or foundation work is funded and praised politically, but lacks real, visible, engaged adopters.
Risks are acknowledged but never priced into plans
People discuss risks intelligently, but schedules, scope, staffing, and commitments do not change to reflect them.
Governance exists mainly as ceremony
Governance structures consume time and artifacts but have weak effect on actual risk, quality, or decision quality.
Nobody can say what the company will stop doing
Priorities accumulate, but explicit stopping decisions are rare or absent.
AI (7 signals)
More output, less certainty
The volume of produced artifacts increases, but confidence in correctness, design quality, or shared understanding does not keep pace.
AI-generated artifacts are trusted more than source material
Summaries, synthesized docs, or generated analyses start becoming the operational truth instead of pointers back to real sources.
Prompt changes replace system thinking
Teams keep tuning prompts when the real problem is workflow design, source quality, evaluation, or tool structure.
Benchmarks are discussed more than real user outcomes
Teams spend more time on benchmark scores and synthetic eval wins than on whether the system helps real users in real tasks.
Humans in the loop are rubber stamps
Human review exists in the workflow, but the review step does not preserve meaningful judgment.
RAG uses sources nobody actually trusts
Retrieval-based systems cite or use sources that are stale, low quality, politically loaded, or not actually treated as authoritative by humans.
Model drift is noticed informally, not measured
People sense that model or system behavior changed, but the organization lacks reliable measurement, alerts, or structured comparison.
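The first structured step is cheap: freeze a small regression suite of inputs and expected outputs, score every model or prompt change against it, and alert when the pass rate drops. A sketch (the toy cases, dict-backed "models", and 5% tolerance are all made up):

```python
def pass_rate(model, cases):
    """Fraction of frozen eval cases the model answers as expected."""
    return sum(model(inp) == expected for inp, expected in cases) / len(cases)

def drift_alert(baseline_rate, current_rate, tolerance=0.05):
    """Flag a drop in pass rate larger than the agreed tolerance."""
    return (baseline_rate - current_rate) > tolerance

# Frozen regression suite (toy stand-in for real prompts and expected outputs).
CASES = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9"), ("5-2", "3")]

old_model = {"2+2": "4", "capital of France": "Paris", "3*3": "9", "5-2": "3"}.get
new_model = {"2+2": "4", "capital of France": "Paris", "3*3": "6", "5-2": "3"}.get

baseline = pass_rate(old_model, CASES)  # 1.00
current = pass_rate(new_model, CASES)   # 0.75
print(drift_alert(baseline, current))   # True: the drop exceeds tolerance
```

Even this crude harness converts "the model feels worse lately" into a number with a history, which is what makes the drift discussable and actionable.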