Feb 28, 2021 - 16 MIN READ

HTTP as a Distributed Systems API (Without the Buzzwords)

HTTP isn’t just “how browsers talk to servers.” It’s a mature distributed-systems contract with semantics for caching, retries, concurrency, intermediaries, and evolution. If you design APIs without those semantics, production will teach you them anyway.

Axel Domingues

Last month we talked about the web’s “compression algorithm”: how the industry keeps moving work between server → client → server again (SSR) → edge to satisfy new constraints.

This month is the substrate that makes every one of those eras possible:

HTTP.

Not as “requests and responses.”

As a distributed systems protocol that:

runs through multiple intermediaries you don’t control
can be cached, replayed, delayed, reordered, and partially observed
will fail in ways that look like your bug, even when they aren’t
and has a set of semantics that—when respected—make systems boringly reliable

If you ignore those semantics, you don’t get to avoid them.

You just get to learn them during an incident.

Goal of this post

Give you a practical mental model and a set of checklists so you can design HTTP APIs that survive:

real networks
real clients
real caches and proxies
real rollouts
real security constraints

HTTP, in one sentence

HTTP is a message protocol with semantics designed for an ecosystem of intermediaries.

That ecosystem matters.

Because your “client → server” diagram is almost never true in production.

Figure: HTTP request path with intermediaries (browser, local cache, CDN, reverse proxy, load balancer, API gateway, service, database).

The mental model: the HTTP request graph

When you send a request, you don’t hit “the server.”

You traverse a graph.

browser cache
service worker (maybe)
corporate proxy (maybe)
CDN
WAF / bot protection
load balancer
reverse proxy
API gateway
your service
upstream services
caches
databases

Each hop can:

change latency and timeouts
cache responses
block or rewrite headers
retry or terminate connections
apply rate limits
record logs (or not)

So the right question isn’t:

“What does my server do when it receives a request?”

It’s:

“What semantics does my system promise when the request graph behaves like the real internet?”

Mini-glossary (the only words we’ll use)

If you’re thinking “this is distributed systems vocabulary,” yes.

That’s the point.

The four semantics HTTP gives you “for free” (if you respect them)

HTTP isn’t just bytes over TCP.

It gives you semantics that you can build systems on top of—if you stop fighting them.

Method semantics

Safety + idempotency determine what can be retried, cached, and replayed.

Caching semantics

Cache-Control, validators (ETag), and Vary let you scale reads without lying.

Status semantics

Status codes tell clients and intermediaries what happened and what to do next.

Content semantics

Representation, negotiation, and headers let you evolve contracts without breaking everyone.

In practice, most teams learn these semantics only after they have shipped an API that violates them.

Let’s not do that.

1) Method semantics: the fastest way to avoid duplicate money

Here’s the rule architects keep in their pocket:

The retry rule

If an operation can be retried, it must be safe or idempotent (or have an idempotency key).

Why? Because timeouts aren’t “didn’t happen.”

Timeout means:

the client didn’t get a response in time
the server may still be processing
the response may have been lost
an intermediary may retry

So if you design an endpoint like this:

POST /charge-card

…and it’s not idempotent, you just wrote a payment duplication bug.
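To make the idempotency-key idea concrete, here is a minimal sketch. It assumes the client sends a unique key with each logical charge (often in a header such as Idempotency-Key, which is a convention rather than part of core HTTP). ChargeService and its fields are hypothetical names, and the in-memory dict stands in for the durable store with an atomic check-and-set that a real payment system would need.

```python
from dataclasses import dataclass, field


@dataclass
class ChargeService:
    """Toy in-memory charge handler keyed by a client-supplied idempotency key."""

    # Maps idempotency key -> the response already returned for that key.
    _seen: dict = field(default_factory=dict)
    charges_made: int = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A replayed key returns the stored response instead of charging again.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        self.charges_made += 1
        response = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._seen[idempotency_key] = response
        return response


svc = ChargeService()
first = svc.charge("key-123", 500)
retry = svc.charge("key-123", 500)  # e.g. the client timed out and resent
```

With this shape, a retry after a timeout returns the original result and the card is charged once, no matter which hop resent the request.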

A practical method guide (that survives meetings)

If your team argues about REST purity, reframe the discussion:

We’re not choosing verbs for style.
We’re choosing semantics that define retry safety, caching, and operability.

2) Caching semantics: how you scale reads without inventing chaos

Caching is not an optimization.

Caching is a feature that:

changes correctness
changes cost
changes failure modes

HTTP gives you a caching language. Use it.

The caching layers you actually have

Browser cache

Fastest path. But on shared devices it can retain authenticated responses if cache headers are mishandled.

CDN / edge cache

Your biggest scalability lever for public content and stable resources.

Reverse proxy cache

Great for shielding origins, but needs strong cache key discipline.

Application cache

Redis/memory caches. Powerful, but now you own invalidation and coherence.

A safe default for caching posture

If you’re unsure, start with:

Public content: cacheable, explicit TTL, validators, safe Vary
Personalized/auth content: Cache-Control: private or no-store until proven safe

The most expensive caching bug is not “stale data.”

It’s serving the wrong user’s data. That’s how performance incidents become security incidents.
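The default posture above can be sketched as a small header-selection function. The function name, the 300-second TTL, and the choice of Vary: Accept-Encoding are illustrative assumptions, not recommendations for any specific API.

```python
def cache_headers(is_personalized: bool, max_age_seconds: int = 300) -> dict:
    """Pick conservative response headers for a resource."""
    if is_personalized:
        # Nothing along the request path may store this response.
        return {"Cache-Control": "no-store"}
    return {
        # Shared caches (CDN, reverse proxy) may store it for an explicit TTL.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Only vary on headers that really change the representation.
        "Vary": "Accept-Encoding",
    }
```

Starting from no-store for anything tied to a user keeps the wrong-user failure mode off the table while you work out whether private caching is safe.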

Conditional requests (ETag) are the underrated superpower

Conditional requests let clients and CDNs ask:

“Has this changed?”

Without refetching the whole payload.

That’s how you reduce bandwidth and improve perceived speed while keeping correctness.

The practical win

ETags + 304 responses are a bandwidth and latency reducer that also improves stability under load.
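A rough sketch of the server side of a conditional GET follows. It assumes the ETag is derived from a hash of the response body; real systems often use a version number or modification stamp instead. make_etag and respond are hypothetical helper names.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    # Validator derived from the payload bytes, quoted as the ETag syntax requires.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple:
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        stripped = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in stripped:
            return 304, headers, b""  # client copy is current: send no body
    return 200, headers, body
```

The client stores the ETag from the first 200 response and sends it back in If-None-Match; an unchanged resource then costs a few header bytes instead of the full payload.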

3) Status semantics: a contract between you and the ecosystem

Status codes aren’t for humans.

They’re signals to:

clients
SDKs
proxies
caches
retriers
observability tools

A few status patterns separate “clean systems” from “painful systems.”

2xx — success with meaning

Return 201 for creation, 202 for accepted async work, 204 for empty success.

4xx — client action required

Use 400/401/403/404/409/422 intentionally to communicate what to fix.

5xx — server fault (and retriable signals)

Differentiate “try again later” vs “we’re broken” with your error policy.

429 — throttling

Rate limits need contracts (headers, retries, budgets), not surprise failures.

The two status codes that unlock production systems

202 Accepted: “We accepted the request, work continues asynchronously.”
This is how you avoid holding connections open for long workflows.
409 Conflict: “Your request is valid, but conflicts with current state.”
This is how you avoid silent overwrites and race-condition chaos.
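One way to picture the 409 case is a versioned write. This is a sketch under assumptions: Document, update, and the expected_version parameter are hypothetical, and returning 204 on success is just one reasonable choice. (If you express the same check with an If-Match precondition header, HTTP defines 412 Precondition Failed for the mismatch.)

```python
class Document:
    def __init__(self, content: str):
        self.content = content
        self.version = 1


def update(doc: Document, new_content: str, expected_version: int) -> int:
    """Return an HTTP-style status code for a versioned write."""
    if expected_version != doc.version:
        # Someone else changed the document after this client read it.
        return 409
    doc.content = new_content
    doc.version += 1
    return 204


doc = Document("draft")
seen_by_alice = doc.version
seen_by_bob = doc.version
alice_status = update(doc, "alice's edit", seen_by_alice)  # succeeds
bob_status = update(doc, "bob's edit", seen_by_bob)        # stale version: conflict
```

Bob's write is rejected instead of silently overwriting Alice's, and the client can refetch, merge, and try again.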

Most “API reliability” problems are actually “semantic clarity” problems.

If clients can’t distinguish:

invalid request
unauthorized
conflict
temporary failure
permanent failure

…they will implement retries and fallbacks that make outages worse.
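As a sketch, a client can turn that distinction into an explicit decision table. The mapping below is one illustrative policy, not a standard: which 5xx codes you treat as temporary is exactly the error policy your API has to document. Action and classify are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    SUCCESS = "success"
    FIX_REQUEST = "fix_request"            # invalid request
    REAUTHENTICATE = "reauthenticate"      # unauthorized
    RESOLVE_CONFLICT = "resolve_conflict"  # conflict
    RETRY_LATER = "retry_later"            # temporary failure
    GIVE_UP = "give_up"                    # permanent failure


def classify(status: int) -> Action:
    if 200 <= status < 300:
        return Action.SUCCESS
    if status == 401:
        return Action.REAUTHENTICATE
    if status == 409:
        return Action.RESOLVE_CONFLICT
    if status in (429, 502, 503, 504):
        return Action.RETRY_LATER
    if 400 <= status < 500:
        # Other 4xx: the client must change something before resending.
        return Action.FIX_REQUEST
    # Everything else (including 500) is not retried automatically here.
    return Action.GIVE_UP
```

The value is less in the exact mapping than in having one place where the decision is written down, so retries are a policy and not an accident.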

4) Timeouts, retries, and the lie of “network errors”

HTTP failures are not binary.

They’re ambiguous.

The production truth

The client times out.
The server might still finish.
The client retries.
The server processes twice.
You get a “random” duplicate.
Everyone blames the database.

This is why method semantics + idempotency are architecture concerns.

The “retry budget” pattern

Retries should be designed like spending money:

you have a budget
you have rules
you stop before you bankrupt the system

Retry design rule

Retries must have a limit, backoff, and jitter—or they become a denial-of-service you wrote yourself.
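A retry budget can be sketched as a simple ratio: retries are allowed only up to a fraction of the requests observed. RetryBudget, the 10% ratio, and the minimum of one retry are assumptions for illustration; a production version would typically track a sliding time window instead of lifetime counters.

```python
class RetryBudget:
    """Allows retries only up to a fixed fraction of observed requests."""

    def __init__(self, ratio: float = 0.1, min_retries: int = 1):
        self.ratio = ratio
        self.min_retries = min_retries
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_spend_retry(self) -> bool:
        # The budget grows with traffic, so retries can never exceed
        # roughly `ratio` extra load on the dependency.
        allowed = max(self.min_retries, int(self.requests * self.ratio))
        if self.retries >= allowed:
            return False
        self.retries += 1
        return True
```

When try_spend_retry returns False, the caller fails fast instead of adding load to a dependency that is already struggling.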

Step 1 — Set deadlines, not just timeouts

A single hop timeout is not enough.

Define a deadline for the whole request path (client → edge → origin → upstreams). Every layer must honor it.
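A minimal sketch of deadline propagation: one absolute deadline is created at the edge of the system, and every hop derives its own timeout from whatever time is left. The Deadline class and its method names are hypothetical; how the deadline travels between services (a header, request metadata) is left out here.

```python
import time
from typing import Callable


class Deadline:
    """Absolute deadline shared by every hop of one request."""

    def __init__(self, budget_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_seconds

    def remaining(self) -> float:
        return max(0.0, self._expires_at - self._clock())

    def timeout_for_hop(self, hop_max_seconds: float) -> float:
        # A hop never waits longer than its own cap or the overall time left.
        remaining = self.remaining()
        if remaining <= 0.0:
            raise TimeoutError("deadline exceeded before the hop started")
        return min(hop_max_seconds, remaining)
```

The injectable clock is only there to make the behavior easy to test. The point is that a downstream call made late in the request gets a shorter timeout, and a call made after the deadline is never started.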

Step 2 — Retry only when you can prove it’s safe

Safe/idempotent methods can retry (GET/PUT/DELETE by semantics)
POST/PATCH must be designed for retries (idempotency keys)
Never retry “unknown” failures blindly
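The rules above can be sketched as a single predicate. The method set follows the HTTP definition of idempotent methods (the safe methods plus PUT and DELETE); may_retry and the has_idempotency_key flag are hypothetical names, and the flag assumes the server really deduplicates on that key.

```python
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def may_retry(method: str, has_idempotency_key: bool = False) -> bool:
    """True only when resending the request cannot apply the effect twice."""
    return method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
```

This only holds if your handlers honor the method's semantics: a GET that mutates state or a PUT that appends is not safe to retry, whatever the verb says.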

Step 3 — Backoff + jitter

If thousands of clients retry on the same schedule, you create synchronized load spikes.

Backoff spreads load. Jitter prevents thundering herds.
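Here is a small sketch of capped exponential backoff with randomized ("full") jitter, one common way to combine the two. The function name and the default base and cap values are illustrative assumptions.

```python
import random
from typing import Optional


def backoff_delays(
    max_attempts: int,
    base_seconds: float = 0.1,
    cap_seconds: float = 5.0,
    rng: Optional[random.Random] = None,
) -> list:
    """Return one sleep duration per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_attempts):
        # The ceiling doubles each attempt until it reaches the cap.
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        # Picking uniformly in [0, ceiling] keeps clients from retrying in lockstep.
        delays.append(rng.uniform(0.0, ceiling))
    return delays
```

max_attempts is the hard limit from the retry design rule; the doubling ceiling is the backoff, and the uniform draw is the jitter.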

Step 4 — Observe the retry rate

Retry rate is an SLO smell. If it climbs, you’re masking a deeper reliability issue.

The retry storm anti-pattern

When a dependency slows down:

clients retry
traffic multiplies
dependency slows down more
the system collapses

The fix is not “more retries.” The fix is budgets, backoff, and often circuit breaking (we’ll cover resilience deeply later in the series).

Design your API like an ecosystem, not an endpoint

Now we combine the pieces into a practical design posture.

The architecture posture

Design HTTP APIs assuming intermediaries exist, clients retry, caches cache, and failures are ambiguous.

A production-grade HTTP API checklist

The failure modes you’ll see in production (and what they usually mean)

These are the ones I keep seeing across teams and companies.

The takeaway: HTTP is already the contract you need

A lot of “microservices architecture” debates happen because teams skip fundamentals.

But HTTP already solved many system-level concerns at the protocol level:

caching
intermediaries
semantics for safe retries
signals via status codes
evolvable representation patterns

When you respect HTTP semantics, your system becomes easier to:

scale
debug
evolve
and operate

When you ignore them, you’ll re-invent them—badly—inside application code.

February takeaway

HTTP is not “transport.” It’s distributed systems semantics that you either use intentionally or learn painfully.

Resources

MDN — HTTP Overview

A pragmatic, developer-friendly guide to methods, headers, caching, and status codes.

HTTP Semantics (RFC 7231)

The canonical semantics: methods, status codes, negotiation—useful when you need “the source of truth”.

HTTP Caching (RFC 7234)

How caching is supposed to work. Essential reading if you run CDNs or reverse proxies.

REST API Design Guidelines (Microsoft)

A practical design guide that focuses on consistency and long-term maintainability.

FAQ

What’s Next

HTTP is the contract.

Next month we talk about the runtime that consumes it:

Browser Reality

Because once you understand HTTP semantics, your next most common production failures are no longer “API bugs.”

They’re timing and scheduling bugs—in the browser.

Browser Reality: The Event Loop, Rendering, and Why UX Bugs Look Like Backend Bugs

The browser is a constrained runtime with a scheduling problem: one main thread, many responsibilities, and users who notice missed frames. This post gives you the mental model to debug “random” UX failures as deterministic timing and contention issues.
