Blog
Sep 26, 2021 - 16 MIN READ
Data Stores 101 for Architects: SQL, NoSQL, and the Shape of Consistency

Data Stores 101 for Architects: SQL, NoSQL, and the Shape of Consistency

Stop choosing databases by brand. Choose them by invariants, access patterns, and what “correct” means when the network is on fire.

Axel Domingues

Axel Domingues

In September 2021 I stopped treating “the database” as a tech choice and started treating it as a correctness boundary.

Because almost every production incident with data has the same root cause:

We scaled the system… but we didn’t scale the meaning of truth.

This post is a practical mental model for architects and tech leads:

  • When to reach for SQL
  • When NoSQL is the right tool
  • How to think about consistency without folklore
  • And why the “best database” is usually the one that matches your invariants
This is not a survey of every database category.

It’s a framework for choosing data stores in a way you can defend in a design review — and operate when reality gets loud.

What you’ll be able to do after this

Pick a data store by invariants + access patterns, not vibes.

The key lens

“Consistency” is not ideology. It’s a contract about what can never be wrong.

The common failure

Teams scale reads/writes, but forget to scale correctness and operability.

The output

A short checklist you can apply to any feature: payments, carts, feeds, search, analytics.


The Only Question That Matters: What Must Never Be Wrong?

Every system has a few sacred invariants — things the business expects to be true even on the worst day.

Examples:

  • A payment is charged at most once
  • A seat is sold at most once
  • Inventory cannot go negative
  • A user’s email is unique
  • A ledger always balances: debits == credits
  • A document version must be monotonic (no time travel)

Those invariants determine your data boundary more than:

  • your programming language
  • your ORM
  • your cloud provider
  • or what’s popular on Hacker News
Write invariants as “must never” statements.

If you can’t say what must never be wrong, you’re not choosing a database — you’re gambling.


Mental Model: Data Stores as Contracts

A data store is not “where we put data.”

It is a contract about:

  • atomicity: what changes happen together
  • isolation: what concurrent requests can observe
  • durability: what survives crashes
  • consistency: what “correct” means across time and replicas

Your job as an architect is to align that contract with your product’s invariants.


SQL vs NoSQL: The Useful Summary (Not the Meme)

“SQL vs NoSQL” is a bad debate.

A better framing is:

  • SQL (relational) is an invariant machine.
  • NoSQL (non-relational) is an access-pattern machine.

SQL is great when…

You need transactions, constraints, joins, and evolving queries.

NoSQL is great when…

You need scale-out reads/writes on predictable access patterns.

SQL hurts when…

You push it into “infinite scale” without designing partitioning and query discipline.

NoSQL hurts when…

You need cross-entity invariants and you pretend eventual consistency is “free.”

Here’s the non-negotiable truth:

If your product has money, inventory, quotas, or uniqueness constraints… you either need strong transactional semantics somewhere, or you will re-implement them poorly in application code.

Start With Access Patterns (Not Entities)

Most data models fail because we model the world (“User”, “Order”, “Product”)…

…instead of modeling how the system is actually used.

Do this first:

  • List your top queries (and their latency targets)
  • List your write paths (and their failure modes)
  • List concurrency risks (what can race?)
  • List invariants (what can never be wrong?)

Write down your “top 10 reads”

Examples:

  • “Get user dashboard in <200ms”
  • “List last 50 transactions”
  • “Search orders by email/date/status”

Write down your “top 10 writes”

Examples:

  • “Create order”
  • “Reserve inventory”
  • “Submit payment”

Mark which ones are invariants

Examples:

  • “Reserve inventory must be atomic”
  • “Payment must be idempotent”
  • “Email must be unique”

Only then pick store + model

Now the database is a tool to satisfy the contract you just defined.


SQL: Your Default Invariant Engine

Relational databases have been battle-tested for decades because they’re very good at the hard part:

maintaining truth under concurrency.

What SQL gives you “for free” (if you use it properly)

  • Atomic multi-row changes (transactions)
  • Constraints (unique, foreign keys, check constraints)
  • Flexible querying (and evolving query requirements)
  • A mature ecosystem: backups, migrations, observability, tooling
SQL is not “slow.”
  • Bad queries are slow.
  • Missing indexes are slow.
  • N+1 is slow.

The SQL superpower: constraints + isolation

If you can express an invariant as a constraint, do it.

Examples:

  • unique(email)
  • foreign keys for referential integrity
  • check(balance >= 0) (sometimes)
  • exclusion constraints (e.g., no overlapping bookings)

Why? Because the database enforces it even when your app is buggy.


NoSQL: Your Access-Pattern Engine

NoSQL is a category, not a product.

Different NoSQL systems make different tradeoffs, but the common theme is:

  • predictable access patterns
  • scale-out partitioning
  • simpler operations at large scale (in some cases)
  • and often a different (weaker) consistency contract

When NoSQL shines:

  • document reads/writes by ID
  • high write throughput with partition keys
  • event/time-series ingestion
  • caching and session storage
  • search (which is effectively a different “data store” contract)
NoSQL can be amazing when your access patterns are stable.

But when your queries evolve and you need ad-hoc joins and constraints, you’ll either bolt on more systems… or move back toward relational semantics.


The Shape of Consistency: What Can Users Observe?

Consistency is not a checkbox.

It’s a question about what can be observed after a write, across:

  • retries
  • concurrent requests
  • multiple replicas
  • network partitions
  • and time

Three “levels” teams actually operate with

  1. Strong consistency (read-your-writes, linearizable-ish behavior)
  • Great for invariants.
  • Usually costs latency/availability under partitions.
  1. Bounded staleness (eventually consistent, but within a window)
  • Great for feeds, dashboards, cached views.
  • Needs explicit UX and operational thinking (“refresh”, “pending”, “syncing”).
  1. Best-effort eventual consistency
  • Fine for analytics, logging, and non-critical projections.
  • Dangerous if used for user-facing invariants.
A useful habit: Define which parts of your system must be strong, and which can be stale.

Most good architectures are hybrid.


Pattern: Strong Core + Read Models (The “Boring Stack” Version)

A highly reliable architecture often looks like this:

  • One strong source of truth (usually SQL) for invariants
  • Read-optimized projections elsewhere for scale/UX

Examples:

  • Orders/payments in SQL, search in Elastic/OpenSearch
  • Inventory in SQL, product catalog in a document store
  • Ledger in SQL, dashboards in Redis or columnar analytics

Write path

Small, strict, transactional.
Enforces invariants.

Read path

Fast, denormalized, cached.
May be stale — by design.

This reduces the “God database” temptation while keeping truth centralized.


Where Teams Get Burned (Predictably)

1) They denormalize without a strategy

Denormalization is not wrong. It’s a performance tool.

But it creates a new job:

  • What keeps copies in sync?
  • What happens on partial failures?
  • How do you backfill?
  • How do you detect drift?

If you can’t answer those, your read model is a time bomb.

2) They confuse “availability” with “correctness”

A highly available system that occasionally sells the same seat twice is not “high availability.” It’s a lawsuit generator.

3) They treat retries as harmless

Retries are correctness events.

If you don’t plan for retries:

  • you will double-charge
  • double-send emails
  • double-create orders
  • or create ghost states
If you accept “at least once” delivery anywhere (queues, retries, timeouts), you must design idempotency at the boundary.

Practical Modeling: How to Think About Data Shape

Relational modeling guidelines

  • Normalize where it protects invariants (e.g., foreign keys, uniqueness)
  • Add indexes for critical queries (but measure write overhead)
  • Prefer explicit join tables over “clever” JSON blobs when invariants matter
  • Keep transactions small and short (avoid long locks)

Document modeling guidelines

  • Store aggregates that are usually read/written together
  • Choose a partition key you won’t regret (tenant_id, user_id, order_id)
  • Beware hot keys (one tenant, one product, one global counter)
  • Precompute what you need to query (because ad-hoc joins won’t save you)
A helpful way to think:
  • SQL models relationships.
  • Document stores model aggregates.

Isolation Levels: The Subtle Source of Bugs

Many teams say “we use SQL” and assume that means correctness is handled.

But isolation levels matter.

  • Read committed: common default; prevents some weirdness, not all.
  • Repeatable read / snapshot: better for consistency across a transaction.
  • Serializable: strongest, but can introduce retries/aborts under contention.

You don’t need to memorize these. You need to understand what races exist in your write paths.

If you have a “money button” (purchase, transfer, reserve): treat the write path as a tiny critical section.

Keep it short. Keep it transactional. Keep it auditable.


Operational Reality: The Questions Architects Must Ask

A data store choice is also an ops choice.

Ask these before you commit

Backup + restore

How fast can we restore to a point in time?
Have we practiced it?

Migrations

Can we evolve schema without downtime?
What’s our rollback story?

Replication + failover

What happens during a node failure?
What does “primary” mean?

Observability

Can we see slow queries, lock contention, replication lag?

If you can’t restore reliably, you don’t have a database.

You have a temporary cache with a very expensive API.


A Simple Decision Checklist (Use This in Design Reviews)

Print this. Put it in your PR template. Use it in planning.

Define invariants

  • What must never be wrong?
  • What is allowed to be stale?
  • What is allowed to be duplicated? (often: nothing)

Define write boundaries

  • Which actions must be atomic?
  • What is the idempotency key?
  • What is the concurrency strategy? (locks? unique constraints? optimistic versioning?)

Define top queries

  • What’s the top 10 read list?
  • What latency do you need?
  • Do queries need to evolve frequently?

Choose “strong core” vs “distributed truth”

  • Can one store hold invariants?
  • If not, what coordinates the invariant? (often: a transaction boundary + outbox/eventing)

Define the ops plan

  • Backups + restore drills
  • Migration strategy
  • Monitoring and alerting
  • Cost model (IO, storage growth, index bloat)

Resources

Designing Data-Intensive Applications (Kleppmann)

The best mental model book for replication, consistency, and distributed tradeoffs.

Jepsen Analyses

Reality checks for distributed systems guarantees. Read when you’re tempted to assume.

PostgreSQL Docs

The reference for isolation, indexes, and operational behavior in a popular SQL engine.

MongoDB Docs (Modeling)

A clear explanation of aggregate/document modeling and tradeoffs.


FAQ


What’s Next

This month was about picking a data store with adult supervision:

  • invariants first
  • access patterns second
  • ops reality always

Next month I’m zooming into the performance layer most teams misuse: Caching

Because once your data model is honest… your cache needs to be honest too.

Axel Domingues - 2026