Blog
Sep 29, 2024 - 16 MIN READ
Regulation as Architecture: Turning the EU AI Act into Controls and Evidence

Regulation as Architecture: Turning the EU AI Act into Controls and Evidence

The EU AI Act isn’t a PDF you “comply with” — it’s a set of control objectives you design into your product: evaluation, documentation, monitoring, and provable safety boundaries.

Axel Domingues

Axel Domingues

Regulation is usually framed as paperwork.

But if you’ve shipped real systems, you know the truth:

regulation is architecture — with deadlines and penalties.

The EU AI Act is the first time many teams will be forced to operationalize that idea for AI:

  • classify your system by risk
  • put safety controls in the loop
  • prove you did it (documentation + logging + monitoring)
  • and keep proving it as the system evolves

This article is about turning that into something an engineering org can actually execute.

Not with vibes.

With controls and evidence.

I’m not a lawyer. This is engineering guidance: how to translate legal requirements into systems (controls, telemetry, processes, evidence).

If you’re shipping into regulated domains, involve legal/compliance early — but don’t outsource architecture to them.

The goal

Turn “EU AI Act compliance” into an implementable control map + evidence pipeline.

The mental model

Treat the Act like a set of control objectives — not a checklist.

The output

A living system: model registry, eval harness, audit logs, incident response.

The trap

If you add compliance after shipping, you’ll rebuild everything under pressure.


The EU AI Act in One Picture: Risk Drives Obligations

Most teams get stuck because they start in the wrong place: “What does the Act say?”

Start here instead:

What risk category are we in — and what obligations follow?

At a high level, the mental buckets look like:

  • Unacceptable risk (prohibited practices): don’t build / don’t deploy
  • High-risk AI systems: strong requirements + conformity assessment + continuous obligations
  • Transparency obligations (certain systems): tell users it’s AI, label synthetic content, etc.
  • Minimal risk: largely unaffected — but still governed by general product safety + data protection
Even if you’re “not high-risk”, the Act will still shape expectations for:
  • how you document and monitor the system
  • how you label AI-generated content
  • how you respond to incidents Those practices become table stakes.

Timeline Reality: Compliance Is a Roadmap Problem

A regulation with phased application dates creates a new kind of technical debt:

time-bound debt.

If you treat this like “we’ll fix later”, “later” arrives on a calendar.

From the Act’s own entry-into-force and application section, key dates include:
  • 2 Feb 2025: early application for core chapters (including the prohibited practices chapter)
  • 2 Aug 2025: governance + general-purpose AI model obligations begin applying (with additional transitional rules)
  • 2 Aug 2026: the Act applies broadly
  • 2 Aug 2027: certain high-risk classification / obligations kick in later for specific provisions
These phased dates are why EU AI Act work should be a product roadmap track, not a “compliance sprint”.

Engineering takeaway

Your compliance program must be staged: inventory first, then controls, then evidence automation.

Product takeaway

If AI is on your critical path, “legal deadlines” become architecture milestones.


Regulation as Architecture: Controls + Evidence

Here’s the move that makes this tractable:

A regulation is a set of control objectives.
Compliance is the ability to show evidence that your controls are operating.

That’s it.

So the question is not:

“Are we compliant?”

It’s:

  • What controls do we have?
  • What evidence proves they are working?
  • What breaks when the system changes?

This is exactly the mindset shift we learned in 2022 (operational architecture):

  • you don’t “have reliability”
  • you have SLOs, alerts, runbooks, incident reviews, and change management

AI compliance is the same category of work.


Step 1: Build an AI System Inventory (Because You Can’t Control What You Can’t Name)

Most organizations fail compliance because they can’t answer basic questions:

  • What models are we using? Which versions?
  • Where do prompts live? Who changes them?
  • Where does user data go?
  • Which features are “AI-assisted” vs “AI-decides”?

So the first deliverable is boring — and essential:

an AI system registry.

Define your “AI system boundary”

Decide what you include in scope:

  • the model(s)
  • prompts/system instructions
  • retrieval + tool calls
  • post-processing + safety filters
  • human review steps
  • downstream consumers of the output

Create an inventory record per feature

For each AI feature, record:

  • purpose + user impact
  • model vendor or open-weights source
  • data inputs (PII? sensitive data?)
  • tool access / side effects (email? payments? deletion?)
  • deployment scope (EU users? internal-only?)
  • failure modes you already know about

Assign roles (provider / deployer mindset)

Even if you’re “just integrating” a model, you still own:

  • the product UX
  • the system behavior
  • the safety boundary

So assign internal owners:

  • product owner
  • engineering owner
  • risk/compliance owner
If you don’t build an inventory early, everything else becomes theater:
  • you won’t know what to document
  • you won’t know what to monitor
  • you won’t know which changes require re-evaluation

Once you have an inventory, the next move is to translate “requirements language” into engineering controls.

A practical translation for high-impact AI systems looks like this:

Control family A: Risk management + safety boundaries

Control objective: you identify harms, mitigate them, and verify mitigations.

Engineering controls:

  • risk register per system (hazards + mitigations + owners)
  • red teaming / adversarial tests (prompt injection, jailbreaks, policy bypass)
  • sandboxed tool execution (deny-by-default; scoped permissions)
  • fallback modes (no side effects; human approval; safe response)

Evidence:

  • risk assessments with dates + sign-off
  • test results from your eval harness
  • change logs that show mitigations weren’t silently removed

Control family B: Data governance

Control objective: your training/finetuning/evaluation data is lawful, relevant, and controlled.

Engineering controls:

  • dataset provenance + license tracking
  • data minimization (collect less, keep less)
  • PII handling: redaction, hashing, isolation, retention limits
  • “no-leak” tests (system prompts, secrets, customer data)

Evidence:

  • dataset registry entries (source, license, retention)
  • automated scans and audit logs for access
  • privacy impact assessments when applicable

Control family C: Transparency + user information

Control objective: users understand when they’re interacting with AI and what limitations exist.

Engineering controls:

  • UI labeling (“AI-generated”, “AI-assisted”)
  • confidence indicators that are earned (tied to eval outcomes)
  • “why this answer?” for RAG (citations + snippets)
  • clear escalation to a human when stakes are high

Evidence:

  • screenshots of UX patterns
  • release notes showing consistent labeling
  • telemetry showing users used escalation paths

Control family D: Human oversight

Control objective: humans can supervise, intervene, and override when required.

Engineering controls:

  • review queues for high-stakes outputs
  • approval gates before irreversible actions
  • “operator console” for investigation + rollback
  • kill switches (feature flag + model routing off switch)

Evidence:

  • audit logs of review actions
  • incident timelines showing kill switch usage
  • access control policies for who can override

Control family E: Robustness, security, and monitoring

Control objective: the system is resilient, secure, and monitored for drift and failures.

Engineering controls:

  • model + prompt versioning
  • runtime guardrails (policy filters, PII filters, tool ACLs)
  • monitoring for:
    • abuse patterns
    • hallucination proxies (e.g., citation mismatch rate)
    • cost spikes / latency SLO breaches
    • tool call anomalies
  • incident response playbooks for AI-specific failures

Evidence:

  • dashboards + alerts tied to thresholds
  • incident reports and postmortems
  • penetration tests / threat models

The Evidence Pipeline: Make Compliance a Byproduct of Operating the System

Here’s the engineering secret:

If your AI product is operable, compliance evidence becomes cheap.

If your AI product is not operable, compliance evidence becomes impossible.

So you want an architecture where:

  • every model/prompt/tool change is tracked
  • every deployment produces an eval report
  • every incident produces a timeline and corrective action
  • every “claim” (accuracy, safety, robustness) is tied to data

Think of this as “CI/CD for trust”.

A model registry

A single source of truth for model versions, prompts, policies, and tool permissions.

An eval harness

Repeatable tests that run on every change — with stored results.

Observability + audit logs

Telemetry for safety, drift, cost, and misuse — plus immutable audit trails.

An audit packet generator

One click: “show me the evidence” for this system, this version, this date range.


A Practical Reference Architecture for EU AI Act Readiness

You don’t need a mega-platform.

You need a few boring, durable primitives.

1) Model & Prompt Registry

Store:

  • model identifier + version (vendor or open-weights commit)
  • system prompt / policies (versioned)
  • tool list + permissions
  • risk tier classification
  • owner + escalation contact

Design note: treat prompts as code. PRs. Reviews. Rollback.

2) Evaluation Service

A service that can:

  • run curated test suites (functional, safety, adversarial)
  • run RAG-specific tests (retrieval quality + citation correctness)
  • run tool-use tests (permission boundaries, sandbox escape attempts)
  • produce a signed report artifact per run

Design note: evaluations are not just “accuracy”. They are controls verification.

3) Runtime Guardrails Layer

At inference time:

  • input filtering (PII, secrets)
  • policy enforcement (content policy, domain policy)
  • tool policy enforcement (deny-by-default, per-user scope)
  • output filtering / formatting (citations required, structured outputs)

Design note: guardrails must be measurable. If you can’t measure them, you can’t prove they operate.

4) Monitoring + Incident Response

You want:

  • safety metrics (refusal rate, escalation rate, policy violation attempts)
  • quality metrics (user feedback, correction rate, citation mismatch proxies)
  • security metrics (prompt injection attempts, tool call anomalies)
  • cost & latency metrics (budget enforcement)

And you want playbooks:

  • “prompt injection wave”
  • “model regression”
  • “unsafe tool behavior”
  • “sensitive data leakage report”
If your only safety mechanism is “prompting harder”, you will not pass the reality test.

A prompt is not a control.

A prompt is a policy suggestion.

Controls are things that fail safely when the model misbehaves.

Turning the EU AI Act Into Tickets: A Minimal, Executable Backlog

If you’re leading an engineering org, you need a plan that decomposes.

Here’s a minimal backlog that actually ships.

Ship the inventory + ownership map

  • registry table / service
  • one entry per AI feature
  • owners + escalation paths

Add versioning and change control

  • prompts in git
  • model versions pinned
  • feature flags for routing + rollback

Build the eval harness (start small)

  • 50–200 “golden” cases per feature
  • adversarial cases for known failure modes
  • store results + diffs over time

Add runtime guardrails

  • tool ACLs + sandbox
  • PII redaction on input/output where needed
  • citations required for knowledge answers (when applicable)

Add monitoring + incident workflow

  • dashboards + alerts
  • an incident runbook template
  • a postmortem process that feeds back into eval cases

Add the audit packet generator

  • export registry + last eval results + deployment history + incident summaries
  • one PDF/zip per audit request window

Resources

EU AI Act — official text (Regulation (EU) 2024/1689)

The source of truth: risk categories, obligations, timelines, and definitions — start here before translating requirements into controls.

NIST AI RMF 1.0 (risk → controls → monitoring)

A practical engineering-friendly framework for turning “risk” into control objectives, metrics, and governance you can actually operate.

ISO/IEC 42001 (AI management system standard)

A management-system blueprint for AI: policies, roles, lifecycle controls, continual improvement — the “org spine” your evidence pipeline hangs on.

CEN-CENELEC JTC 21 (EU AI standardization workstream)

Where the harmonized standards work happens — useful for mapping “legal requirements” to “technical ways of meeting them.”


FAQ


What’s Next

This month was about making regulation operational:

  • controls you can implement
  • evidence you can generate automatically
  • and a compliance spine that doesn’t collapse when you change models

Next month we go deeper into the runtime itself:

Reasoning Budgets: fast/slow paths, verification, and when to “think longer”.

Because once you’re operating within compliance constraints, the next question becomes:

How do you spend “thinking time” like a budget — and prove the system used it wisely?

Axel Domingues - 2026