The Hard Parts.dev
EP-02 · AI Engineering Playbook

Define safe AI development zones

Create explicit zones of safe, cautious, and restricted AI use so the team can move fast where the cost of error is low and stay deliberate where the risk is structural, legal, operational, or hard to detect.

Difficulty: medium
Time horizon: days to define, weeks to socialize and reinforce
Primary owner: tech lead
Confidence: high
At a glance · EP-02
Situation: A team wants to use AI productively without applying the same risk model to every task.
Goal: Match AI usage freedom to task risk instead of letting usage patterns evolve through informal habit.
Do not use when: the team has not yet identified any meaningful AI use cases.
Primary owner: tech lead
Roles involved

  • tech lead
  • engineering manager
  • security or compliance partner (when needed)
  • senior engineers
  • AI-using contributors

Context

The situation

Deciding whether to reach for this playbook: when it fits, and when it doesn't.

Use when

Conditions where this playbook is the right tool.

  • AI use is spreading unevenly across the team
  • Different engineers are making different assumptions about where AI is acceptable
  • The team wants enablement without hidden quality variance
  • Leaders want a practical operating model instead of blanket approval or blanket fear

Stakes

Why this matters

What this playbook protects against, and why skipping or half-running it tends to be expensive.

Without clear zones, teams drift into inconsistent norms. Some over-trust AI in high-risk areas, others avoid it even where it would be useful. Both create waste. Safe zones help the team move fast with less confusion and less silent risk.

Quality bar

What good looks like

The observable qualities of a team or system that is actually doing this well. Not just going through the motions.

Signs of the playbook done well

  • The team can describe where AI use is encouraged, cautious, or restricted
  • High-risk areas have stronger review or approval expectations
  • Low-risk, high-friction work gets faster without guilt or ambiguity
  • Developers know which tasks need stronger human authorship and why
  • The policy is operational enough to affect real behavior

Preparation

Before you start

What you need available and true before running the procedure. Skipping this is the most common reason playbooks fail.

Inputs

Material you'll want to gather first.

  • Team workflows
  • Code and system risk areas
  • Data sensitivity constraints
  • Review and release expectations
  • Known AI usage patterns in the team

Prerequisites

Conditions that should be true for this to work.

  • The team has enough self-awareness to describe current AI use honestly
  • Owners can identify higher-risk domains or flows
  • Leadership supports differentiated policy rather than vague encouragement

Procedure

The procedure

Each step carries its purpose (why it exists), its actions (what you do), and its outputs (what you produce). Read the purpose. It's what keeps the step from degenerating into checklist theatre.

  1. Map work by risk and detectability

    Ground the zones in engineering reality rather than opinion.

    Actions

    • List common tasks where AI is being or could be used
    • Rank them by blast radius, detectability of mistakes, and reversibility
    • Distinguish low-risk scaffolding from hard-to-detect, high-consequence logic

    Outputs

    • AI task risk map
  2. Define usage zones

    Create a practical operating model.

    Actions

    • Mark tasks as encouraged, allowed with caution, or restricted
    • Tie each zone to review expectations and ownership assumptions
    • State why each zone exists in engineering terms

    Outputs

    • AI development zones model
  3. Attach workflow rules to each zone

    Make the zones affect real work.

    Actions

    • Define what disclosure, review depth, or approval is required in each zone
    • Clarify what counts as unacceptable delegation to AI
    • Set special expectations for security-, money-, data-, or availability-critical code

    Outputs

    • Zone operating rules
  4. Teach the team with examples

    Prevent the model from remaining abstract.

    Actions

    • Show concrete examples of safe, borderline, and unsafe uses
    • Review near-miss scenarios and what zone they belong to
    • Create a lightweight cheat sheet for daily decisions

    Outputs

    • AI zone examples pack
  5. Review real usage drift

    Keep the zones aligned with evolving practice.

    Actions

    • Sample actual work across zones
    • Update rules when new tools or patterns change risk
    • Watch for shadow norms that diverge from the model

    Outputs

    • AI zone review
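
As a concrete illustration, the first three steps can be sketched in code. Everything here is a hypothetical assumption made to make the model tangible, not part of the playbook: the 1-to-5 scales, the zone thresholds, the example tasks, and the per-zone rules should all be replaced with your team's own.

```python
from dataclasses import dataclass

# Hypothetical risk map (step 1): score each task on the three axes
# named in the procedure. The 1-5 scales and example tasks are assumptions.
@dataclass
class TaskRisk:
    name: str
    blast_radius: int    # 1 = local scaffolding, 5 = shared infrastructure
    detectability: int   # 1 = mistakes surface fast, 5 = silent failures
    reversibility: int   # 1 = trivial rollback, 5 = hard to undo

    @property
    def score(self) -> int:
        return self.blast_radius + self.detectability + self.reversibility

# Hypothetical zone model (step 2): the thresholds are illustrative.
def zone_for(score: int) -> str:
    if score <= 5:
        return "encouraged"
    if score <= 9:
        return "allowed with caution"
    return "restricted"

# Hypothetical operating rules (step 3): tie each zone to disclosure,
# review, and approval expectations. The values are assumptions.
RULES = {
    "encouraged": "standard review, no disclosure required",
    "allowed with caution": "note AI assistance in the PR; reviewer re-derives core logic",
    "restricted": "explicit disclosure, line-by-line review, tech lead sign-off",
}

tasks = [
    TaskRisk("generate unit-test scaffolding", 1, 1, 1),
    TaskRisk("draft internal docs", 1, 2, 1),
    TaskRisk("refactor billing calculation", 4, 4, 3),
]

for t in tasks:
    zone = zone_for(t.score)
    print(f"{t.name}: {zone} ({RULES[zone]})")
```

The point of the sketch is the shape, not the numbers: low-risk scaffolding lands in the encouraged zone while hard-to-detect, high-consequence logic lands in the restricted zone with stronger review attached.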

Judgment

Judgment calls and pitfalls

The places where execution actually diverges: decisions that need thought, questions worth asking, and mistakes that recur regardless of good intent.

Decision points

Moments where judgment and trade-offs matter more than procedure.

  • Which tasks are low-risk enough for broad AI use?
  • Which areas require caution because mistakes are subtle or high-impact?
  • Which tasks should remain strongly human-authored?
  • What usage must be disclosed or subjected to stronger review?

Questions worth asking

Prompts to use on yourself, the team, or an AI assistant while running the procedure.

  • Which tasks in this team are low-risk enough for broad AI use?
  • Where are AI mistakes hardest to notice before damage happens?
  • What work should remain strongly human-authored no matter how good the tooling looks?

Common mistakes

Patterns that surface across teams running this playbook.

  • Making zones too vague to guide real decisions
  • Treating all code generation as equally risky or equally safe
  • Ignoring detectability and focusing only on importance
  • Writing policy without examples

Warning signs you are doing it wrong

Signals that the playbook is being executed but not landing.

  • Engineers still answer 'it depends' with no shared guidance
  • High-risk areas are receiving the same AI treatment as boilerplate work
  • The team cites the policy but behaves inconsistently
  • AI use becomes more hidden after the rollout

Outcomes

Outcomes and signals

What should exist after the playbook runs, how you'll know it worked, and what to watch for over time.

Artifacts to produce

Durable outputs the playbook should leave behind.

  • AI task risk map
  • AI development zones model
  • Zone operating rules
  • AI zone examples pack
  • AI zone review

Success signals

Observable changes that mean the playbook landed.

  • The team uses AI more confidently in safe zones
  • Review and disclosure are stronger in risky zones
  • Arguments about AI use become more specific and less ideological
  • Quality surprises caused by context-inappropriate AI use decline

Follow-up actions

Moves that keep the playbook's effects compounding after it finishes.

  • Update zones when new tools alter risk and capability
  • Fold zone guidance into onboarding and review expectations
  • Promote repeated zone failures into stronger engineering controls

Metrics or signals to watch

Longer-horizon indicators that the underlying problem is receding.

  • AI usage by task class
  • Review escalation rate in high-risk zones
  • Quality incidents tied to zone misuse
  • Team understanding of zone model
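
A minimal sketch of how the first three signals might be tracked, assuming the team keeps a simple usage log; the record shape, field names, and sample entries are all hypothetical.

```python
from collections import Counter

# Hypothetical usage log; the record shape and sample data are assumptions.
usage_log = [
    {"task_class": "test scaffolding", "zone": "encouraged", "escalated": False, "incident": False},
    {"task_class": "billing logic", "zone": "restricted", "escalated": True, "incident": False},
    {"task_class": "billing logic", "zone": "restricted", "escalated": False, "incident": True},
]

# AI usage by task class
usage_by_class = Counter(entry["task_class"] for entry in usage_log)

# Review escalation rate in high-risk zones
high_risk = [e for e in usage_log if e["zone"] == "restricted"]
escalation_rate = sum(e["escalated"] for e in high_risk) / len(high_risk)

# Quality incidents tied to zone misuse
incidents = sum(e["incident"] for e in usage_log)

print(dict(usage_by_class))
print(f"escalation rate (restricted): {escalation_rate:.0%}")
print(f"incidents: {incidents}")
```

Even a log this crude makes drift visible: a falling escalation rate in restricted zones, or incidents clustering in one task class, is a prompt to revisit the zone model rather than a verdict on individuals.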

AI impact

AI effects on this playbook

How AI-assisted and AI-driven workflows help execution, and the ways they can make it worse.

AI can help with

Where AI tooling genuinely reduces the cost of running this playbook well.

  • Drafting task maps and example sets
  • Clustering current AI uses into likely zone categories
  • Turning raw team practice into clearer policy structure

AI can make worse by

Distortions AI introduces that make the underlying problem harder to see.

  • Making poor policy language sound comprehensive
  • Encouraging overconfidence in borderline zones
  • Creating the illusion that documentation alone solved the norm problem

Relationships

Connected playbooks

Failure modes this playbook tends to address, decisions behind the situation, red flags that motivate running it, and neighboring playbooks.