Make an integration contract explicit

Difficulty: medium-high
Time horizon: days to weeks
Primary owner: producer owner
Confidence: high

At a glanceEP-11

Situation: A producer and consumer depend on behavior that is currently implicit or informal.
Goal: Reduce integration fragility by moving from assumption-based compatibility to clear, owned contract behavior.
Do not use when: the integration is genuinely throwaway and low-risk
Primary owner: producer owner
Roles involved: producer ownerconsumer ownertech lead or architectQA or contract test owner where applicableplatform owner if the interface is shared broadly

Context

The situation

Deciding whether to reach for this playbook: when it fits, and when it doesn't.

Use when

Conditions where this playbook is the right tool.

Consumers rely on undocumented behavior
Breaking changes are discovered late downstream
API or event semantics are argued from examples rather than rules
The producer says we never promised that and the consumer says we rely on it

Stakes

Why this matters

What this playbook protects against, and why skipping or half-running it tends to be expensive.

Integrations become dangerous when teams act as if only the syntax is shared. Real contracts include field meaning, timing, ordering, failure semantics, compatibility promises, and operational expectations.

Quality bar

What good looks like

The observable qualities of a team or system that is actually doing this well. Not just going through the motions.

Signs of the playbook done well

Both producer and consumer can describe the same contract similarly
Compatibility expectations are visible and owned
Undocumented assumptions are reduced
Contract change becomes easier to discuss and safer to execute
Consumers learn about breaking changes before production

Preparation

Before you start

What you need available and true before running the procedure. Skipping this is the most common reason playbooks fail.

Inputs

Material you'll want to gather first.

Current API, event, or data interface
Consumer usage patterns
Recent breakage or ambiguity examples
Existing docs and schemas
Ownership model

Prerequisites

Conditions that should be true for this to work.

Producer and consumer representatives are present
The interface can be inspected in real usage, not only in theory
Someone owns compatibility decisions

Procedure

The procedure

Each step carries its purpose (why it exists), its actions (what you do), and its outputs (what you produce). Read the purpose. It's what keeps the step from degenerating into checklist theatre.

01
Expose actual consumer dependence
Find the contract that already exists in practice.
Actions
- Review how consumers use the interface in real workflows
- Identify assumptions about fields, timing, ordering, idempotency, and failure behavior
- Separate guaranteed behavior from accidental producer behavior
Outputs
- Actual contract map
02
Define explicit guarantees
Turn ambiguity into shared rules.
Actions
- Document what is stable, what may change, and what is out of contract
- Clarify backward compatibility policy
- Decide how deprecated or optional behaviors are handled
Outputs
- Contract definition
- Compatibility policy
03
Add verification paths
Make the contract enforceable, not merely narrated.
Actions
- Introduce or tighten contract tests, schema validation, and compatibility checks
- Ensure test ownership is clear
- Cover consumer-critical behaviors, not just type correctness
Outputs
- Contract verification plan
04
Define change communication and versioning
Reduce downstream surprise.
Actions
- Name how consumers are informed about changes
- Set versioning or change announcement expectations
- Clarify timelines for deprecation and removal
Outputs
- Change management model
05
Review drift periodically
Prevent the contract from becoming implicit again.
Actions
- Review new consumer assumptions regularly
- Check whether docs still match runtime behavior
- Capture breakage or near-miss lessons
Outputs
- Contract drift review

Judgment

Judgment calls and pitfalls

The places where execution actually diverges: decisions that need thought, questions worth asking, and mistakes that recur regardless of good intent.

Decision points

Moments where judgment and trade-offs matter more than procedure.

What behavior do we truly guarantee?
What existing consumer dependence must be honored for now?
What is the right compatibility policy for this interface?
How much verification is needed given the impact and blast radius?

Questions worth asking

Prompts to use on yourself, the team, or an AI assistant while running the procedure.

What behavior are consumers depending on that we never formally declared?
What exactly is guaranteed versus incidental?
How will downstream teams know a change is safe or breaking?

Common mistakes

Patterns that surface across teams running this playbook.

Documenting only syntax and calling the contract done
Declaring undocumented behavior out of scope after consumers already depend on it
Treating internal APIs as exempt from contract discipline
Adding versioning theater without compatibility ownership

Warning signs you are doing it wrong

Signals that the playbook is being executed but not landing.

Producer and consumer still describe the interface differently after the exercise
Breaking changes are still found downstream by surprise
Contract tests pass but operational expectations still break
Nobody can say who decides whether a change is compatible

Outcomes

Outcomes and signals

What should exist after the playbook runs, how you'll know it worked, and what to watch for over time.

Artifacts to produce

Durable outputs the playbook should leave behind.

Actual contract map
Contract definition
Compatibility policy
Contract verification plan
Change management model

Success signals

Observable changes that mean the playbook landed.

Fewer surprises during integration changes
Producer and consumer discussions get shorter and more precise
Compatibility decisions become explicit
Incident patterns shift away from accidental contract breakage

Follow-up actions

Moves that keep the playbook's effects compounding after it finishes.

Review frequently violated contracts for boundary redesign
Update consumer onboarding and producer docs with the explicit model
Audit where other interfaces are still running on accidental behavior

Metrics or signals to watch

Longer-horizon indicators that the underlying problem is receding.

Contract-related incident count
Consumer surprise change events
Time to resolve integration ambiguity
Rate of compatibility disputes

AI impact

AI effects on this playbook

How AI-assisted and AI-driven workflows help execution, and the ways they can make it worse.

AI can help with

Where AI tooling genuinely reduces the cost of running this playbook well.

Comparing docs, code, and real interface usage for mismatches
Drafting first-pass contract descriptions from code and examples
Highlighting likely undocumented semantics in consumer usage

AI can make worse by

Distortions AI introduces that make the underlying problem harder to see.

Producing polished contract docs that still miss actual runtime semantics
Encouraging reliance on inferred examples over explicit guarantees
Masking disagreement with overly tidy summaries

Relationships

Connected playbooks

Failure modes this playbook tends to address, decisions behind the situation, red flags that motivate running it, and neighboring playbooks.