
Google’s Gemini Spark and AI Mode point to a bigger shift: Search is no longer just retrieval plus ranking. It is becoming a runtime for synthesis, generated UI, monitoring agents, and action. This post explains the architecture of actionable retrieval — and the reliability contracts needed when search starts doing things.
Axel Domingues
May’s hot topic was not simply “Google added more AI to Search.”
That story is too small.
The bigger shift is this:
Search is becoming an agent runtime.
For twenty-five years, Search mostly meant:
Generative search changed the first half:
But the May 2026 direction pushes further:
That is a new architecture.
Not “better autocomplete.”
Not “chatbot inside a search page.”
A runtime where retrieval, synthesis, planning, UI generation, and action share the same surface.
A bad answer is one failure mode. A bad answer that schedules, buys, emails, books, or monitors on your behalf is a different system entirely.
The trend
Search is moving from retrieval interface to agentic action surface.
The architecture shift
The pipeline becomes: query → retrieve → synthesize → plan → generate UI → act → monitor.
The reliability problem
If answers become actions, source fidelity, permissions, and auditability stop being optional.
The engineering thesis
Actionable retrieval needs contracts: evidence packets, action gates, source scoring, and rollbackable workflows.
Classic Search was a ranking system.
It returned documents and made the user responsible for:
That was imperfect, but it had a clean boundary:
Search helped you find information.
You performed the action.
Agentic Search blurs that boundary.
Now the system may:
That is why this topic belongs in an architecture blog.
The product surface changed, but the deeper change is the control flow.
Retrieval used to end at “here are relevant documents.”
Actionable retrieval ends at:
“Here is the answer, here is the evidence, here is the proposed action, and here is the control boundary before anything changes.”
That adds three new responsibilities to the search stack.
Evidence
The system must know where claims came from and whether cited sources actually support them.
Intent
The system must separate “I want information” from “I want you to do something.”
Authority
The system must know which actions it is allowed to take, under whose identity, and with what approval.
Memory
The system must decide what persists: monitors, preferences, tasks, reminders, and user-specific context.
This is where Search becomes much closer to an agent platform than a web page.
AI Mode changes the shape of search from “query and links” to “conversation plus workspace.”
That matters because long, messy queries become normal:
The query is no longer a short keyword string. It becomes a task packet.
A task packet can include:
Gemini Spark is interesting because it points to the other side of agentic search:
background continuity.
Search traditionally answered now.
An always-on agent can:
This is not “one request.” It is a workflow.
A Spark-like agent needs a durable loop:
That looks less like Search and more like a tiny SRE system for your digital life.
Here is the mental architecture I’d use for agentic search.

Classifies the request:
Contract: don’t treat every query as permission to act.
Decides what kind of retrieval is needed:
Contract: source selection must be inspectable, not magic.
Scores sources for:
Contract: citation is not enough. The cited page must support the claim.
Builds the response from evidence:
Contract: separate “source says” from “model infers.”
Creates task-specific interfaces:
Contract: generated UI must still call governed APIs, not bypass product controls.
Turns a user goal into proposed steps:
Contract: plans are proposals until authority is granted.
Controls side effects:
Contract: the model is not the security boundary. The gateway is.
Records what happened:
Contract: if the system acts, it must leave an audit trail.
This is the difference between “AI search feature” and “search runtime.”
Generative search has one uncomfortable property:
It collapses many sources into one answer.
That creates convenience, but also removes friction.
Users no longer inspect ten links. They read the synthesis.
So the system inherits editorial responsibility.
Source selection
Which pages were retrieved, ranked, included, ignored, or suppressed?
Claim support
Does each important claim actually follow from the cited source?
Conflict handling
When sources disagree, does the answer expose disagreement or average it away?
Publisher impact
If answers replace clicks, source ecosystems and incentives change.
A useful internal invariant:
A cited answer should be decomposable into claims, and each high-risk claim should map to supporting evidence.
This is not academic neatness. It is how you debug hallucination in a search product.
Agentic Search must distinguish four levels of authority.
The system can answer, summarize, compare, and cite.
No side effects.
The system can fill forms, draft messages, create plans, and stage actions.
Still no side effects.
The system can request explicit approval for a specific action.
The approval must show:
The system performs the action through a governed tool gateway.
Every action gets logged. Irreversible actions require stronger gates.
That is how helpful agents become incident generators.“I prepared this” becomes “I did this.”
Generated UI is one of the most interesting parts of agentic search.
Instead of returning a static answer, the system can generate:
That is powerful because the UI can fit the task.
But generated UI also creates a governance problem:
Who decides what this UI is allowed to do?
A generated booking interface must still respect:
The UI may be generated. The permissions must not be.
A search agent that watches the web for you is extremely useful.
It is also easy to forget.
That makes lifecycle management non-negotiable.
Every monitor should have:
Monitor contract
A background search agent should be visible, bounded, revocable, and explainable.
Without that, you get zombie agents:
This is where the engineering discipline shows up.
Symptom: answer cites sources, but the specific claim is not supported by those sources.
Fix: claim-level evidence checks for high-risk answers, plus source-support evals.
Symptom: answer appears objective, but mostly comes from one source type or one ecosystem.
Fix: source diversity scoring and explicit conflict surfacing.
Symptom: the system performs a task when the user only asked for information.
Fix: intent classification + authority levels + explicit approval for side effects.
Symptom: private context shapes an answer or action in a way the user didn’t expect.
Fix: context disclosure (“using your calendar/preferences”), scoped personalization, and opt-out controls.
Symptom: background agents continue running after the original need is gone.
Fix: expiration dates, visible agent inventory, and regular “still needed?” confirmations.
Symptom: a custom UI exposes actions that should require stricter gates.
Fix: generated UI calls only through a permissioned tool gateway.
Agentic Search needs dashboards beyond click-through rate.
A minimal runtime dashboard should include:
Retrieval quality
Source diversity, freshness, authority, and query-source alignment.
Claim fidelity
Unsupported-claim rate, citation mismatch rate, and conflict handling quality.
Action safety
Approval rate, denied actions, risky tool calls, and rollback frequency.
Agent lifecycle
Active monitors, expired agents, budget usage, and notification usefulness.
The agentic search metric is closer to: Did the system help the user complete the right task, with the right evidence, under the right authority?
I would not start with “make Search more intelligent.”
I would start with contracts.
A query must be classified before the system can act. The default should be informational.
Every synthesized answer should include source IDs, claim mappings, freshness, and confidence metadata.
Bookings, purchases, messages, calendar changes, and subscriptions must never be raw model actions.
The user sees what will happen before it happens.
Every monitor has owner, scope, budget, expiry, and kill switch.
Query → sources → answer → UI → proposed action → approval → tool call.
If you can’t reconstruct it, you can’t operate it.
Search used to be an index.
Generative Search made it an answer engine.
Agentic Search turns it into a runtime.
May takeaway
When Search starts acting, retrieval quality is only the first problem.
The real architecture is actionable retrieval: evidence fidelity, intent boundaries, generated UI, tool gateways, background monitors, and auditability.
Google — A new era for AI Search
Google’s May 2026 framing of AI Search: advanced model capabilities, search agents, agentic coding, and a new AI-powered Search box.
Google — The Gemini app becomes more agentic
The Gemini Spark announcement: a 24/7 personal AI agent designed to manage tasks under user direction.
The Verge — Google Search AI update
Useful product-level coverage of AI Mode, agentic search capabilities, file/context attachments, and generated UI.
Measuring Google AI Overviews
A May 2026 measurement study on activation, source quality, claim fidelity, and publisher impact in AI Overviews.
No.
RAG retrieves evidence for generation.
Agentic search adds:
That makes it closer to an agent platform than a retrieval pattern.
Action overreach.
A wrong answer is bad. A wrong answer that triggers a booking, purchase, notification, or account change is worse.
That’s why action gates and tool gateways matter.
No.
Citation is necessary but not sufficient.
The important question is whether the cited source actually supports the specific claim. That requires claim-level checking, especially for high-risk topics.
Copy the architecture, not the hype: