
APIs don’t fail because they’re slow — they fail because they change. This month is about designing contracts you can evolve, enforcing compatibility automatically, and scaling teams without “everyone upgrade on Tuesday.”
Axel Domingues
APIs are the part of your system you don’t get to rewrite.
Not because it’s impossible.
Because the moment you have:
“Just update everyone” becomes a fairy tale.
This month is about API evolution as architecture:
The bigger your org gets, the more API design becomes a governance problem, not a coding problem.
The hard part isn’t building APIs; it’s evolving them.
The real goal
Ship changes without requiring synchronized upgrades across teams.
The core mechanism
Turn “what we meant” into an explicit contract (and test it continuously).
The maturity marker
Compatibility stops being a best-effort promise and becomes an enforced invariant.
The payoff
Teams can move independently, and rollbacks stay possible because clients don’t explode.
In early-stage systems, an API is “a function call over the network”.
In mature systems, an API is:
If the API breaks, the system doesn’t “degrade.”
It fractures:
So the architectural move is to treat your API like a product:
Most API work is about being backward compatible, because servers roll out continuously while clients lag.
Wire compatibility is necessary.
Semantic compatibility is where incidents are born.
Most “breaking changes” are disguised as “small refactors.”
When systems are small, both sides can “agree”.
When systems scale, agreement becomes expensive — so your API must be resilient to drift.
A practical pattern:
clients will be older than servers
and servers will see shapes they didn’t predict.
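One way to make a client resilient to that drift is a "tolerant reader": require only the fields you actually use, default the optional ones, and ignore everything else. The sketch below is an illustration under assumptions, not a prescribed implementation; the `Quote` shape, its field names, and the `"USD"` default are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Quote:
    # Hypothetical payload shape, used only for illustration.
    id: str
    amount_cents: int
    currency: str = "USD"        # optional: defaulted when absent or null
    note: Optional[str] = None   # optional: absent and null both become None


def parse_quote(payload: Mapping[str, Any]) -> Quote:
    """Tolerant reader: require only what we use, ignore unknown fields."""
    missing = [k for k in ("id", "amount_cents") if payload.get(k) is None]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return Quote(
        id=str(payload["id"]),
        amount_cents=int(payload["amount_cents"]),
        currency=payload.get("currency") or "USD",
        note=payload.get("note"),
    )


# An older server response and a newer one with extra, unknown fields.
old_shape = {"id": "q-1", "amount_cents": 1250}
new_shape = {"id": "q-1", "amount_cents": 1250, "currency": "EUR",
             "note": None, "discount": {"pct": 5}}

print(parse_quote(old_shape))
print(parse_quote(new_shape))
```

Both payloads parse. The unknown `discount` field is dropped rather than rejected, which is what lets the server add fields without a coordinated client release.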
This is the checklist I want teams to internalize.
Never “reuse” a field
If a field’s meaning changes, create a new field.
Reusing names creates invisible breakage.
Prefer optional + additive
Add fields, don’t mutate fields.
Old clients ignore what they don’t understand.
Treat validation as a breaking change
Making a previously-accepted input invalid is a real breaking change.
Roll it out like one.
Errors are part of the contract
Status codes, error codes, and retry semantics must be stable.
Changing them breaks clients just as hard as schema changes.
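To make "errors are part of the contract" concrete, here is a minimal sketch of an error envelope where clients branch on stable, machine-readable fields and never on message text. The error codes, the envelope layout, and the `should_retry` policy are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCode(str, Enum):
    # Hypothetical codes. Once published, a value is never renamed or reused.
    QUOTE_NOT_FOUND = "quote_not_found"
    RATE_LIMITED = "rate_limited"
    VALIDATION_FAILED = "validation_failed"


@dataclass(frozen=True)
class ApiError:
    status: int       # HTTP status code: part of the contract
    code: ErrorCode   # stable machine-readable identifier
    retryable: bool   # retry semantics: also part of the contract
    message: str      # human-readable text: free to change

    def to_body(self) -> dict:
        return {
            "code": self.code.value,
            "retryable": self.retryable,
            "message": self.message,
        }


def should_retry(status: int, body: dict) -> bool:
    """Client-side decision that reads only contract fields."""
    return bool(body.get("retryable", False)) and status in (429, 503)


err = ApiError(429, ErrorCode.RATE_LIMITED, True, "Slow down, please.")
print(should_retry(err.status, err.to_body()))  # True
```

Rewording `message` is safe here. Flipping `retryable` or changing the status for an existing code is not, because deployed clients already branch on those values.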
Here are classic “small improvements” that caused large incidents:
X-Customer-Id” (clients don’t send it everywhere)
null instead of missing fields” (deserializers differ)
Branching logic comes from:
Versioning is useful — but it’s also an easy escape hatch:
“We’ll just bump v2.”
The problem is that versions accumulate.
Old versions don’t disappear; they become a permanent support tax.
A more scalable stance:
Examples:
A version should exist because it unlocks a better long-term contract — not because it’s convenient for the server team.
Most teams have “API docs”.
Few teams have an API contract.
A contract is something you can:
Examples:
The architectural goal is simple:
Turn expectations into artifacts that your pipeline can enforce.
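As a rough idea of what "an artifact the pipeline can enforce" looks like, the sketch below diffs two versions of a schema and lists changes that could break existing clients. The schema model (field name mapped to `type` and `required`) is a deliberately tiny assumption, far simpler than a real OpenAPI document, and the rules are conservative rather than complete.

```python
from typing import Dict, List

# Simplified, hypothetical schema model: field name -> {"type", "required"}.
Schema = Dict[str, Dict[str, object]]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """Return reasons why `new` could break clients written against `old`."""
    problems: List[str] = []
    for name, spec in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name]["type"] != spec["type"]:
            problems.append(
                f"type changed: {name} {spec['type']} -> {new[name]['type']}"
            )
        elif new[name].get("required") and not spec.get("required"):
            problems.append(f"optional field became required: {name}")
    for name, spec in new.items():
        if name not in old and spec.get("required"):
            problems.append(f"new required field: {name}")
    return problems


old = {
    "id": {"type": "string", "required": True},
    "amount": {"type": "integer", "required": True},
}
# Additive and optional: compatible.
new_ok = {**old, "note": {"type": "string", "required": False}}
# Type change plus a new required field: both flagged.
new_bad = {
    "id": {"type": "string", "required": True},
    "amount": {"type": "string", "required": True},
    "customer_id": {"type": "string", "required": True},
}

print(breaking_changes(old, new_ok))   # []
print(breaking_changes(old, new_bad))
```

In CI, a non-empty result would fail the build. Note what this check cannot see: a field whose meaning changed while its name and type stayed the same.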
When you only have provider-side tests, you test what you think clients do.
Consumer-driven contract testing flips it:
This isn’t a testing trick. It’s organizational architecture.
Producer’s blind spot
Provider tests can’t predict what clients depend on “by accident”.
Consumer’s reality
Consumers encode the exact requests/responses they rely on.
No guessing. No meetings.
Automated enforcement
Providers verify against consumer contracts in CI.
Compatibility becomes a gate.
Independent deployment
Teams ship independently because the pipeline catches contract drift early.
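The provider-side verification step can be sketched in plain Python. This is a toy, assuming a hypothetical contract format and an in-process `provider_handler` standing in for the real service; in practice teams usually reach for a dedicated tool such as Pact rather than hand-rolling this.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical contract: what one consumer sends and what it relies on.
CONTRACT = {
    "consumer": "checkout-web",
    "request": {"method": "GET", "path": "/quotes/q-1"},
    "response": {"status": 200, "fields": {"id": str, "amount_cents": int}},
}

Handler = Callable[[str, str], Tuple[int, Dict[str, Any]]]


def provider_handler(method: str, path: str) -> Tuple[int, Dict[str, Any]]:
    """Stand-in for the real provider; returns (status, JSON body)."""
    if method == "GET" and path.startswith("/quotes/"):
        quote_id = path.rsplit("/", 1)[-1]
        return 200, {"id": quote_id, "amount_cents": 1250, "currency": "EUR"}
    return 404, {"code": "not_found"}


def verify(contract: Dict[str, Any], handler: Handler) -> List[str]:
    """Replay the consumer's request; check only what the consumer relies on."""
    req, expected = contract["request"], contract["response"]
    status, body = handler(req["method"], req["path"])
    failures: List[str] = []
    if status != expected["status"]:
        failures.append(f"status {status} != {expected['status']}")
    for field, typ in expected["fields"].items():
        if field not in body:
            failures.append(f"missing field: {field}")
        elif not isinstance(body[field], typ):
            failures.append(f"wrong type for {field}")
    return failures


print(verify(CONTRACT, provider_handler))  # [] means the contract still holds
```

The extra `currency` field in the provider's response does not fail verification. The contract pins what this consumer depends on, not the full response shape, so the provider stays free to add fields.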
A minimal consumer-driven contract (CDC) loop looks like this:
GET /quotes/{id}, I expect status 200 and fields X/Y/Z.”
Event-driven systems often “move faster”, until an event schema changes and downstream consumers explode silently.
Event evolution has all the same problems as HTTP evolution, plus:
So treat event payloads as a first-class contract:
Replays are where truth is tested.
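One common way to keep replays working is to "upcast" old events to the latest shape at read time, so handlers only ever deal with one version. The sketch below assumes a `schema_version` field on each event and invents two schema changes purely for illustration; none of these names come from a real system.

```python
from typing import Any, Callable, Dict

Event = Dict[str, Any]


# Hypothetical upcasters: each lifts a payload from version N to N+1.
def _v1_to_v2(e: Event) -> Event:
    # v2 added `currency`; older events predate it, so use the historical default.
    return {**e, "schema_version": 2, "currency": e.get("currency", "USD")}


def _v2_to_v3(e: Event) -> Event:
    # v3 added `amount_minor` next to `amount` instead of redefining `amount`.
    return {**e, "schema_version": 3, "amount_minor": int(round(e["amount"] * 100))}


UPCASTERS: Dict[int, Callable[[Event], Event]] = {1: _v1_to_v2, 2: _v2_to_v3}
LATEST = 3


def upcast(event: Event) -> Event:
    """Bring an event of any known older version up to LATEST."""
    version = event.get("schema_version", 1)  # events without a version are v1
    while version < LATEST:
        event = UPCASTERS[version](event)
        version = event["schema_version"]
    return event  # events newer than LATEST pass through untouched


# A log holding events written years apart, replayed through one code path.
log = [
    {"type": "QuoteCreated", "amount": 12.5},
    {"type": "QuoteCreated", "schema_version": 2, "amount": 3.0, "currency": "EUR"},
]
replayed = [upcast(e) for e in log]
print(replayed)
```

Running the full historical log through `upcast` in CI is a cheap way to catch the case where a schema change works for new events but breaks on old ones.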
“Deprecated” is meaningless if nobody knows and nobody cares.
Deprecation is a system:
Here’s the minimal viable playbook:
If you’re reviewing an API change, ask these questions:
If any answer is “yes”, treat it like a breaking change rollout.
No artifact = no enforcement.
Surprises belong in CI, not in incidents.
If you can’t answer “who uses this”, you can’t deprecate safely.
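Answering "who uses this" starts with counting. Here is a minimal sketch that records calls to deprecated routes per client, assuming each request carries some client identifier (an API key, a header, a service name). The route, the client IDs, and the in-memory counter are hypothetical; a real system would emit metrics instead.

```python
from collections import Counter
from typing import Dict

# Hypothetical registry of deprecated (method, route) pairs.
DEPRECATED = {("GET", "/v1/quotes")}

# In-memory counter for illustration only.
usage: Counter = Counter()


def record_call(client_id: str, method: str, route: str) -> None:
    """Call from request middleware; counts only deprecated routes."""
    if (method, route) in DEPRECATED:
        usage[(client_id, method, route)] += 1


def remaining_consumers(method: str, route: str) -> Dict[str, int]:
    """Who still calls this route, and how often."""
    return {
        client: count
        for (client, m, r), count in usage.items()
        if (m, r) == (method, route)
    }


record_call("mobile-ios", "GET", "/v1/quotes")
record_call("mobile-ios", "GET", "/v1/quotes")
record_call("partner-acme", "GET", "/v1/quotes")
record_call("checkout-web", "GET", "/v2/quotes")  # not deprecated, not counted

print(remaining_consumers("GET", "/v1/quotes"))
```

When that dictionary is empty for long enough, removal is a decision backed by data instead of a guess.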
OpenAPI Specification
A contract format for HTTP APIs — the foundation for diffing, validation, and generated clients.
Sometimes — but don’t treat URL versioning as the default.
If every change becomes a new version, versions never die and you end up supporting a museum.
Prefer additive evolution within a version, and reserve major versions for fundamental semantic redesigns.
No. It’s for independent deployments.
If consumers and providers deploy separately (mobile apps, partner integrations, internal services), CDC can turn coordination into automation.
Changing meaning without changing shape.
Renaming is obvious and gets flagged.
Semantic drift is quiet and causes production incidents because clients still deserialize successfully — they just behave wrong.
Now that we can evolve contracts safely, the next pressure point is performance.
Because once you have multiple services and multiple teams, the “average latency” stops mattering — and tail latency becomes your true user experience.
Next month: Performance Engineering End-to-End.