How Google AI Search Actually Works: RAG and Query Fan-Out Explained
When you ask Google a question today, the answer you see isn’t conjured from some vast AI memory. It’s assembled, in real time, from your indexed web pages — using two mechanisms Google has now officially named for the first time.

On 15 May 2026, Google published its official AI Optimisation Guide for Search — the first time the company has formally documented how generative AI features like AI Overviews and AI Mode retrieve and synthesise information. Buried within its measured, bureaucratic prose are two mechanisms that, once you understand them, fundamentally change how you should think about content strategy.

Those two mechanisms are Retrieval-Augmented Generation (RAG) and Query Fan-Out. Together, they form the engine room of AI search. This article explains what they are, how they interact, and what the practical implications are for anyone who creates content online.

The Old Mental Model Is Wrong

Most content creators and SEO practitioners still operate with an intuitive but outdated mental model of how AI search works. The assumption is that the AI model — the large language model (LLM) at the heart of Google’s AI Overviews — simply ‘knows’ a lot of things, and that it draws on that knowledge to write answers.

This is incorrect, and Google’s guide makes clear why. A raw LLM has a training cutoff. It hallucinates. It cannot access the live web. Left to its own devices, it would confidently answer questions about current events with information that may be months or years out of date.

Google’s solution to this is a two-stage architecture. And understanding that architecture is the key to understanding what it actually means to ‘optimise for AI search.’

Stage One: Retrieval-Augmented Generation (RAG)

RAG is not a new concept in AI research, but Google’s public confirmation of its use in Search is significant. Google’s guide describes it as:

Google’s Definition

A technique (also known as grounding) used to improve the quality, accuracy, and freshness of AI responses by relying on our core Search ranking systems to retrieve relevant, up-to-date web pages from our Search index. Our systems then review the specific information from those retrieved pages to generate a more reliable and helpful response, showing prominent, clickable links to relevant web pages that support the information in the response.

Let us unpack this definition carefully, because every word matters.

What RAG Actually Does

In a RAG system, the AI model does not generate an answer from memory alone. Instead, a retrieval system — in Google’s case, its core Search ranking infrastructure — first identifies a set of relevant web pages. Those pages are then passed to the language model as context. The model synthesises them into a response and cites them as sources.

Think of it this way: the LLM is the writer, but it cannot write without a research brief. RAG is the process of assembling that research brief — from your indexed web pages — before the writing begins.
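The research-brief idea can be made concrete with a short sketch. Everything here is illustrative: the `Page` and `SearchIndex` classes and the naive term-overlap ranking are assumptions for exposition, not Google's actual systems.

```python
from dataclasses import dataclass

# Illustrative RAG sketch. All names and the toy ranking function are
# hypothetical stand-ins, not Google's real retrieval infrastructure.

@dataclass
class Page:
    url: str
    snippet: str

class SearchIndex:
    """Stands in for a conventional ranking system over indexed pages."""
    def __init__(self, pages):
        self.pages = pages

    def rank(self, query, top_k=5):
        # Naive relevance: count query-term overlap with each snippet.
        terms = set(query.lower().split())
        return sorted(
            self.pages,
            key=lambda p: len(terms & set(p.snippet.lower().split())),
            reverse=True,
        )[:top_k]

def answer_with_rag(query, index, generate):
    # 1. Retrieval: only indexed pages can be selected as sources.
    sources = index.rank(query, top_k=2)
    # 2. The retrieved snippets become the model's research brief.
    context = "\n".join(p.snippet for p in sources)
    prompt = f"Sources:\n{context}\n\nQuestion: {query}"
    # 3. Generation grounded in that context; sources kept for citations.
    return generate(prompt), [p.url for p in sources]

index = SearchIndex([
    Page("example.com/weeds", "how to remove weeds from a lawn"),
    Page("example.com/cats", "caring for an indoor cat"),
])
answer, citations = answer_with_rag(
    "remove weeds lawn", index, lambda prompt: "synthesised answer"
)
```

The structural point survives the toy simplification: the generator never sees the live web directly, only what the ranking step hands it, which is why indexing is the gateway.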

The Critical Implication: Indexing Is the Gateway

Google’s guide is explicit on this point: to be eligible to appear in generative AI features, a page must be indexed and eligible to be shown in Google Search with a snippet. There is no alternative path in. No special AI markup. No llms.txt file. No machine-readable format. Just standard crawlability and indexability — the same foundations that have governed SEO for two decades.

Key Takeaway

RAG means AI Overviews are downstream of Search. If your page cannot rank, it cannot be retrieved. If it cannot be retrieved, it cannot be cited. The AI is only as good as the index it draws from — and access to that index is earned through conventional SEO.

What RAG Means for Content Quality

Because the model synthesises information from retrieved pages, the quality of what it produces is directly shaped by the quality of what it retrieves. Google’s guide puts particular emphasis on what it calls ‘non-commodity content’ — material that provides a unique point of view, first-hand experience, or expert insight that goes beyond what could easily be produced by a generative AI model itself.

This is not incidental. If AI systems are grounding their responses in your content, and your content is itself a generic summary of existing information, the model is essentially building on nothing. Non-commodity content — opinionated, experienced, specific — gives the model something worth citing.

[Infographic: AI search workflow]

Stage Two: Query Fan-Out

If RAG is how AI search retrieves information, Query Fan-Out is how it decides what information to retrieve. Google’s guide describes it as:

Google’s Definition
A set of concurrent, related queries generated by the model to request more information and fetch additional relevant search results to address the user’s query. For example, if the original user’s query is ‘how to fix a lawn that’s full of weeds’, fan-out queries might include ‘best herbicides for lawns’, ‘remove weeds without chemicals’, and ‘how to prevent weeds in lawn’.

This is where things get genuinely interesting for content strategy — and where the implications of AI search diverge most sharply from classic keyword-based thinking.

How Fan-Out Works in Practice

When a user submits a query, the AI model does not treat it as a single retrieval task. It decomposes the query into a cluster of related, more specific sub-queries that collectively build a more complete answer. These sub-queries run concurrently — simultaneously — rather than sequentially.

The model is, in effect, conducting keyword research on behalf of the user. It is anticipating the follow-up questions, the adjacent topics, the different angles from which a user might approach the same underlying need. It then retrieves web pages for each of those sub-queries, and synthesises the results into a single, coherent response.
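A minimal sketch of the fan-out step, using the lawn-weeds example from Google's own guide. The sub-query list is hard-coded here; in the real system the model generates it, and `retrieve` stands in for a full RAG retrieval pass.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative fan-out sketch. Function names are hypothetical;
# the sub-queries are the examples given in Google's guide.

def fan_out(user_query):
    # Stand-in for the model decomposing the query into sub-queries.
    return [
        "best herbicides for lawns",
        "remove weeds without chemicals",
        "how to prevent weeds in lawn",
    ]

def retrieve(sub_query):
    # Stand-in for one RAG retrieval pass against the Search index.
    return (sub_query, [f"indexed page for {sub_query!r}"])

def research(user_query):
    sub_queries = fan_out(user_query)
    # The sub-queries run concurrently, not one after another.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(retrieve, sub_queries))

results = research("how to fix a lawn that's full of weeds")
```

Each sub-query contributes its own retrieved pages, so a page that ranks for any one of the three has a path into the final synthesis.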

The Fan-Out Analogy

Imagine a senior analyst being asked a broad question by a client. Rather than answering off the top of their head, they immediately dispatch several junior researchers to look into specific sub-aspects: one on pricing, one on regulation, one on competitive landscape. The researchers work in parallel, return their findings, and the senior analyst synthesises a comprehensive briefing.

That is Query Fan-Out. Your content is one of those junior researchers. The question is whether the model dispatches anyone to retrieve it.

How RAG and Query Fan-Out Work Together

The two mechanisms are sequential and interdependent. Here is the full picture of what happens between a user typing a query and an AI Overview appearing on screen:

  1. User submits a query.
  2. Query Fan-Out fires: the AI model generates a set of concurrent sub-queries that collectively address the user’s underlying information need.
  3. RAG retrieval runs: for each sub-query, Google’s core Search ranking systems retrieve relevant, crawlable, indexed pages from the Search index.
  4. Synthesis: the model reviews the retrieved pages, extracts the most relevant information, and writes a grounded response with clickable citations.
  5. AI Overview is displayed: the user sees a synthesised answer with prominent, clickable links to the source pages that informed it.

The key architectural insight is this: RAG makes the quality of the Search index the quality ceiling for AI responses. And Query Fan-Out determines which corners of that index get consulted. Your content strategy must address both.
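The five steps above can be compressed into one end-to-end sketch. Every function here is an illustrative stand-in; the real fan-out, ranking, and synthesis systems are Google-internal.

```python
# Compact sketch of the full pipeline. All names are hypothetical
# stand-ins for Google-internal systems.

def fan_out(query):                       # step 2: concurrent sub-queries
    return [query, f"{query} alternatives", f"{query} prevention"]

def rank(sub_query, top_k=3):             # step 3: one retrieval per sub-query
    return [f"indexed-page-for:{sub_query}"][:top_k]

def synthesise(query, pages):             # step 4: grounded answer
    return f"Answer to {query!r}, grounded in {len(pages)} pages"

def ai_overview(query):                   # steps 1-5, end to end
    sub_queries = fan_out(query)
    pages = [p for q in sub_queries for p in rank(q)]
    return synthesise(query, pages), pages  # answer + clickable citations

answer, pages = ai_overview("fix a weedy lawn")
```

Note where the ceiling sits: `synthesise` can only work with whatever `rank` returns, and `rank` only ever runs over the queries `fan_out` produced.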

What This Means for Your Content Strategy

1. Single-Keyword Optimisation Is Increasingly Insufficient

If the AI is generating a cluster of sub-queries around every user question, then a page optimised for exactly one keyword is only eligible for retrieval on exactly one of those sub-queries. Content that covers a topic comprehensively — addressing related questions, edge cases, and adjacent concepts — is eligible for retrieval across multiple fan-out paths.

This is not a call to stuff more keywords onto a page. Google’s guide explicitly warns against creating separate content pages for every possible query variation, calling it a violation of its scaled content abuse policy. The answer is depth and coherence, not volume.

2. Topical Authority Becomes More Valuable

A site with a coherent cluster of high-quality content around a topic is more likely to be retrieved across multiple fan-out queries than a site with isolated, single-article coverage. If Google’s AI fans out from a user’s query into five sub-queries, and your site has strong indexed content on three of those five, you have three citation opportunities where a competitor with a single page has one.

This is the structural argument for content clustering — and it is now backed by a documented mechanism rather than an SEO hypothesis.

3. Crawlability Is Non-Negotiable

RAG can only work with what the Search index contains. Google’s guide reiterates the foundational requirement: pages must be crawlable, indexable, and eligible to appear with snippets. A technically broken site — one with crawl blocks, thin content penalties, or JavaScript rendering failures — is invisible to the RAG retrieval process regardless of how well-written its content is.

4. First-Hand, Expert Content Has a Structural Advantage

Because the model is synthesising from retrieved pages, it is drawn to content that offers something the model itself cannot produce: genuine first-hand perspective, specific data, or lived experience. Generic, AI-summarised content loses its value proposition the moment the model can produce the same thing itself. Original, non-commodity content is what earns citations.

RAG vs. Query Fan-Out: At a Glance

| Retrieval-Augmented Generation (RAG) | Query Fan-Out |
| --- | --- |
| What it is | What triggers retrieval |
| Grounds AI responses in live, indexed web pages | Generates concurrent sub-queries from a single user query |
| Determines the quality and freshness of the AI answer | Determines which corners of the index get consulted |
| Gateway: standard crawlability and indexing | Gateway: topical breadth and semantic coverage |
| Your page must be indexed to participate | Your page must cover the right sub-topic to be retrieved |
| Produces clickable citations in AI Overviews | Shapes the scope of what the AI researches before answering |

What Google’s Guide Says to Ignore

Alongside its explanations of RAG and Query Fan-Out, Google’s guide takes the unusual step of explicitly debunking several popular ‘AI optimisation’ tactics. For the record:

  • llms.txt files: Not required, not treated in any special way. Standard robots.txt and crawlability apply.
  • Content chunking: No requirement to break content into small pieces. Google’s systems understand multi-topic pages.
  • Rewriting content for AI phrasing: AI systems understand synonyms and general intent. There is no specific register or vocabulary required.
  • Inauthentic mentions: Manufactured brand mentions are addressed by the same spam systems as link schemes.
  • Special structured data for AI: Structured data helps with rich results but is not a factor in AI Overview inclusion.

The Bottom Line

Google’s AI search is not magic. It is a two-stage retrieval and synthesis system: Query Fan-Out determines what to look for, RAG determines what to look at, and the LLM determines what to say about it. Your content is a candidate for every stage of that process — but only if it meets the same foundational requirements that have always governed how Google indexes and ranks the web.

The implication for content operators is clarifying rather than disruptive. Build content clusters around topics, not single keywords. Prioritise original, first-hand insight over commodity summaries. Maintain technical hygiene so your pages remain crawlable. These are not new ideas. They are existing best practices, now explicitly confirmed as the mechanism by which AI search works.

The question is no longer whether AI search changes everything. It is whether your content is in the index when the fan-out queries arrive.

Frequently Asked Questions

What is RAG in Google Search?

RAG stands for Retrieval-Augmented Generation. In Google Search, it is the mechanism by which AI Overviews ground their answers in live, indexed web pages rather than relying solely on the model’s training data. Google’s systems retrieve relevant pages from the Search index and pass them to the AI model to synthesise a response — with clickable citations.

What is Query Fan-Out in Google Search?

Query Fan-Out is the process by which Google’s AI model automatically generates multiple related sub-queries from a single user question. These sub-queries run concurrently to gather information from different angles, and the results are synthesised into one coherent AI Overview response.

Does Query Fan-Out change how I should do keyword research?

Yes. Because Google’s AI generates several related sub-queries behind the scenes, content that covers a topic cluster — not just a single keyword — is more likely to be retrieved and cited. Keyword research should now map semantic clusters, not just individual terms.

Is AEO or GEO different from SEO for Google Search?

According to Google’s official AI Optimisation Guide published in May 2026, optimising for generative AI search is still SEO. Google states that its AI features are rooted in the same core Search ranking and quality systems, so foundational SEO best practices remain the primary lever.

Do I need an llms.txt file to appear in AI Overviews?

No. Google explicitly states in its official guide that llms.txt files and other special AI markup are not required or treated in any special way for AI Overviews. Standard crawlability, indexability, and content quality remain the key factors.