
The EU AI Act turns “model choice” into a regulated interface. This month is a practical playbook: what GPAI providers must ship, what enterprises must demand, and how to build compliance into your agent platform without slowing delivery.
Axel Domingues
Last month was security for connectors.
This month is what happens when security stops being “best practice”… and becomes a regulated contract boundary.
Because in late 2025, “pick an LLM provider” is no longer just a technical decision.
It’s a compliance decision.
And that changes the architecture.
Not in a philosophical way.
In a boring, operational way:
This post is about the moment GPAI obligations begin—and what it forces model providers and enterprises to change.
The point is to translate obligations into interfaces, artifacts, and operational controls you can ship.
The shift this month
Regulation turns the model layer into a governed supply chain.
The core design insight
Compliance isn’t paperwork.
It’s a control plane you build into the platform.
The practical outcome
Providers must ship evidence + instructions.
Enterprises must ingest + enforce them.
The failure mode to avoid
Treating compliance as a “doc sprint” after launch.
“GPAI” (General-Purpose AI) is the foundation model layer: models that can be used across many tasks and then adapted downstream.
That matters because this layer behaves like a platform dependency, not a feature:
So the AI Act-era move is predictable:
If a component is a platform dependency, it gets a governance interface.
The governance interface isn’t just “security.”
It’s documentation, transparency, and post-market monitoring as a routine part of operating the model.
Before:
After:
That framing is what stops this from turning into chaos.
This post is split into two tracks:
Because most teams lose a year by mixing these up.
You become a platform operator for downstream teams.
When obligations kick in, the model stops being “just an endpoint.”
It becomes a dependency with versioned evidence.
Think of it like this:
Here’s the simplest mental model I’ve found useful:
Evidence artifacts
What the provider can prove about the model: documentation, limitations, testing, policies, summaries.
Operational duties
How the provider runs the model: monitoring, incident response, security posture, change control.
Downstream constraints
What the provider expects you to do: proper use, limits, warnings, human oversight patterns.
Auditability
The ability to reconstruct “what happened” for a given output: logs, versions, configurations.
The rest of this article is: how to build these interfaces into your platform.
If you are a provider, you are no longer shipping “a model.”
You are shipping:
A normal model card is a description.
A System Card is a contract:
If you don’t ship this as a stable artifact:
Providers should assume downstream teams will ask for:
Which means you need artifact versioning as a first-class system:
You have stories.
For the class of models treated as higher risk, safety becomes operational:
That means safety needs the same things SRE needs:
If it’s a set of jobs, dashboards, and runbooks, it might.
Enterprises tend to make the same mistake:
They treat regulation as a procurement problem.
It is also a runtime problem.
Because the AI Act doesn’t care that your architecture diagram is “clean.”
It cares whether you can:
So the main shift is simple:
You need a Model Governance Control Plane.
If you don’t build it, you’ll rebuild it repeatedly, badly, in each product team.
A control plane is how platforms avoid chaos.
For GPAI, your MGCP is the place where you answer:

Model registry
Inventory of models, versions, vendors, use-cases, risk tags, owners, and approvals.
Evidence store
Versioned storage for provider packets: docs, policies, eval results, and internal sign-offs.
Connector gateway
A single controlled integration surface: auth, quotas, routing, policy checks, logging.
Eval harness in CI
Regression tests for prompts/tools. Blocks model upgrades that break safety and behavior.
Audit log + trace graph
Reconstructable runs: inputs, tool calls, outputs, policies applied, versions used.
Incident workflow
A durable process for “we shipped something bad” that doesn’t rely on Slack archaeology.
Notice what’s missing:
There’s no “compliance team” box here.
Because the platform must make compliance default for engineers.
In August we talked about connector security:
In September, the conclusion is sharper:
The connector is where you enforce governance.
Why?
So in a regulated world, your connector layer should do four things reliably:
Enforce identity and intent
Who is calling the model, for what purpose, under which product and user session.
Apply policy before generation
Scope tools, block risky actions, enforce data boundaries, require approvals for sensitive flows.
Capture the run
Log prompts, retrieved context hashes, tool calls, outputs, and policy decisions with version IDs.
Control upgrades safely
Route by model version, support canaries, rollbacks, and per-use-case pinning.
This is why “LLM gateway” products exist.
But you can build the core yourself if you treat it like an API gateway with extra semantics.
You have an incident waiting for a regulator to ask: “prove it.”
This is the smallest plan I’ve seen work in real orgs.
It’s boring. That’s why it works.
Minimum fields:
Ask for a versioned packet containing:
Do not allow teams to call models directly. Enforce:
Treat model changes like dependency upgrades:
You need:
Monthly is enough at first:
In June we talked about agents as distributed systems.
This month is where that becomes compliance-critical.
A regulated platform needs durable workflows for:
If these workflows rely on “someone remembering,” you will fail audits and reliability simultaneously.
So the same patterns apply:
Treat them like money: durable state machines, retries, and explicit invariants.
Let’s make it real.
You ship a “support agent” that:
In 2024, this is mostly “tool use + guardrails.”
In 2025, it also needs:
That translates to platform features:
This is what I mean by:
Compliance is a runtime surface.
If you don’t version evidence and logs as you ship, you will never reconstruct the past.
You’ll end up trying to rebuild history from production databases and Slack threads.
This creates inconsistent controls and inconsistent auditability.
Centralize governance at the connector layer and expose it as a platform capability.
Traceability isn’t only for regulators.
It’s how you debug multi-agent workflows and tool misuse.
If you’re building agentic products, you need it anyway.
Prompt filtering is not a governance strategy.
Governance is: least privilege, scoped tools, deterministic policy checks, eval in CI, and audit logs.
If you take one list into vendor calls, take this one.
Versioning and change control
How do model updates roll out? Can we pin versions? Can we canary? What’s the rollback path?
Evidence packets
What documentation and safety summaries do you provide per model/version? How do we retrieve past packets?
Security posture
How do you prevent abuse, leakage, and compromise in serving? Do you have incident response SLAs?
Auditability support
What logs do you provide? What are retention defaults? Can we export? How do you support investigations?
Notice how none of these are “what’s your benchmark score.”
That’s not because benchmarks don’t matter.
It’s because in 2025, operability beats capability.
The General-Purpose AI Code of Practice (EU AI Office)
The most “implementation-shaped” artifact for GPAI: concrete measures for transparency, copyright, and (for systemic-risk models) safety/security — plus downloadable templates like the Model Documentation Form.
Commission guidelines for providers of GPAI models
Clarifies key concepts and expectations around GPAI provider obligations. Useful for turning “what the law means” into what your vendor packet and internal model registry must actually contain.
September is the month where agents become explicitly regulated platforms.
The next month is where the governance story meets a brutal UX constraint:
voice.
Because voice agents are where latency, caching, reliability, and handoff stop being “nice-to-haves.”
Next article: Voice Agents You Can Operate: reliability, caching, latency, and human handoff
Voice Agents You Can Operate: reliability, caching, latency, and human handoff
Voice turns LLMs into real-time systems. This month is about building voice agents that meet latency budgets, degrade safely, and hand off to humans without losing context—or trust.
Security for Agent Connectors: least privilege, injection resistance, and safe toolchains
In 2025, the riskiest part of “agentic” systems isn’t the model — it’s the connectors. This month is a practical playbook for securing tools: least privilege, prompt-injection resistance, safe side effects, and auditability that holds up under incident response.