
AI SEO

Chapter 01 / 08

How AI engines rank

Retrieval, embeddings, grounding, and citation logic — the four mechanics that decide whether a passage of your content ends up in an AI answer or never gets read at all.

9 min read · Published May 8, 2026

AI engines don't rank URLs the way Google ranks URLs. They retrieve passages, evaluate them against the user's query, and either cite a source or generate an answer without one. Optimizing for them requires understanding the four mechanics underneath every public-web AI engine in 2026: retrieval, embeddings, grounding, and citation logic. Once those four are clear, the engine-specific chapters that follow this one make sense as variations on the same model rather than four entirely different problems.

Classical SEO ranks URLs against a query. AI SEO retrieves passages, weighs them against entity signals, and cites the ones that survive the grounding check. The unit of optimization moved from the page to the passage — and most of the optimization work nobody is doing yet happens at that level.

Mechanic 1 — Retrieval

Retrieval is the step where the engine decides which documents to even consider for the answer. Two retrieval models matter:

  • Lexical retrieval. Classical inverted index, BM25-style ranking. Matches exact terms in the query against the document. This is what Google's organic index is built on, with many layers added on top. Still used as a candidate-set filter even by AI engines.
  • Embedding retrieval. The query is encoded as a vector; documents are pre-encoded as vectors; the top-k nearest-neighbor matches are returned. Semantic — matches passages that mean something similar to the query, even if no terms overlap. Dominant in AI engines and in the in-context retrieval that grounds answers.

Most AI engines in 2026 run hybrid retrieval — lexical to filter the candidate set, embedding to rank within it. The implication for optimization: pages need both the keywords (for lexical retrieval) and the semantic-content depth (for embedding retrieval) to make the candidate set in the first place.
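
The two-stage flow can be sketched in a few lines. This is a toy model under stated assumptions: hand-made 3-dimensional vectors stand in for real embeddings (which run 768+ dimensions), and the lexical stage is a bare term-overlap filter rather than BM25.

```python
import math

# Toy corpus: each doc carries raw text (for the lexical filter) and a
# pre-computed embedding (hand-made 3-d vectors; purely illustrative).
DOCS = {
    "a": {"text": "how ai engines rank passages", "vec": [0.9, 0.1, 0.0]},
    "b": {"text": "classic seo ranks urls by keywords", "vec": [0.2, 0.8, 0.1]},
    "c": {"text": "ai answer engines cite retrieved passages", "vec": [0.8, 0.2, 0.1]},
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def hybrid_retrieve(query_text, query_vec, k=2):
    # Stage 1: lexical filter -- keep docs sharing at least one query term.
    terms = set(query_text.lower().split())
    candidates = [d for d, doc in DOCS.items()
                  if terms & set(doc["text"].split())]
    # Stage 2: embedding rank -- order the survivors by cosine similarity.
    ranked = sorted(candidates,
                    key=lambda d: cosine(query_vec, DOCS[d]["vec"]),
                    reverse=True)
    return ranked[:k]

print(hybrid_retrieve("how do ai engines rank", [0.85, 0.15, 0.05]))
```

Note what the sketch makes visible: doc "b" is semantically adjacent but never reaches the embedding stage, because it shares no exact term with the query. That is the practical case for writing both the keywords and the depth.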

Mechanic 2 — Embeddings

Embeddings are how AI models represent meaning numerically. A passage of text gets converted into a vector — typically 768 to 3,072 dimensions — that encodes its semantic content. Two passages with similar meanings produce similar vectors; two passages with different meanings produce different vectors. The retrieval index in an AI engine is built on these vectors.

Three implications for optimization:

  • Topic depth matters. Embeddings reward content that goes deep into a single topic. A 2,000-word page that exhausts a topic produces a tighter, more retrievable embedding than a 2,000-word page that meanders across five topics.
  • Question-answer pairs are retrievable. Engines often retrieve at the passage level (paragraph, FAQ entry, table row). Content structured as question-answer pairs — explicit FAQs, H2-question + body-answer patterns — is more retrievable than the same information embedded in flowing prose.
  • Semantic clustering beats keyword stuffing. A page that discusses a topic using related concepts, synonyms, and adjacent terms produces a richer embedding than one that hammers the head keyword. Lexical SEO and embedding SEO point in the same direction once embeddings are involved.
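
Passage-level retrieval is easier to reason about once you see what a passage is. Below is a minimal chunking sketch, assuming a simple heading-plus-body page format (real pipelines differ): it splits a page into the (question, answer) units an embedding index would actually store and encode.

```python
# Split a page into (heading, body) passages -- the unit an embedding
# index stores. The "## question" format is an assumption for the demo.
def chunk_passages(text):
    passages, heading, body = [], None, []
    for line in text.splitlines():
        if line.startswith("## "):
            if heading is not None:
                passages.append((heading, " ".join(body).strip()))
            heading, body = line[3:].strip(), []
        elif heading is not None:
            body.append(line.strip())
    if heading is not None:
        passages.append((heading, " ".join(body).strip()))
    return passages

page = """## What is grounding?
Grounding constrains the answer to retrieved sources.

## Do engines always cite?
No. Citation rules differ per engine."""

for q, a in chunk_passages(page):
    print(q, "->", a)
```

Each tuple embeds as one tight vector, which is why explicit Q&A structure outperforms the same facts diffused through flowing prose.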

Mechanic 3 — Grounding

Grounding is the constraint that the answer be verifiable against retrieved sources. Engines vary in how strictly they enforce this:

  • Strict grounding (Perplexity, AI Overviews). The answer must be supportable by retrieved passages, with citations attached. Hallucinations are aggressively suppressed.
  • Hybrid grounding (Gemini, ChatGPT with browsing). Retrieved passages inform the answer but the model can also fall back on training knowledge, and citations are sometimes produced and sometimes not.
  • Loose grounding (ChatGPT without browsing, Claude without tools). Training-data answer with no live retrieval. Citations are generated when the user asks for them but are post-hoc and sometimes confabulated.

The optimization implication: tightly grounded engines reward sources that are easy to ground against — clear claims, dates, structured data, named entities. Loosely grounded engines reward sources that are well-represented in training data — wide indexing, broad citation, established brand entities.
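
To make "easy to ground against" concrete, here is a deliberately naive grounding check — token support rather than a real entailment model, which is an assumption of the sketch: a claim counts as grounded if enough of its content words appear in some retrieved passage.

```python
# Toy grounding check: naive token support, NOT a real entailment model.
STOPWORDS = {"the", "a", "an", "in", "of", "to", "is", "was"}

def content_words(text):
    return {w.strip(".,").lower() for w in text.split()} - STOPWORDS

def is_grounded(claim, passages, threshold=0.6):
    need = content_words(claim)
    for p in passages:
        have = content_words(p)
        if need and len(need & have) / len(need) >= threshold:
            return True
    return False

passages = ["Perplexity attaches citations to every answer it generates."]
print(is_grounded("Perplexity attaches citations to every answer", passages))
print(is_grounded("Claude never cites sources", passages))
```

Even this crude check shows why explicit claims with named entities ground easily while vague prose does not: specificity is what the supporting passage can match.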

Mechanic 4 — Citation logic

Citation logic is the rule the engine uses to decide whether to attribute the answer to a source and which source to attribute. Four patterns dominate:

  • Always-cite (Perplexity). Every answer attempts to cite specific URLs. Citations are first-class output.
  • Inline-cite when grounded (AI Overviews). Citations attached to specific claims when retrieval found supporting passages; absent when the answer drew from general knowledge.
  • Reference-cite (ChatGPT, Gemini, Claude with tools). Citations clustered at the end of the answer or inline at the user's request. Sometimes link out, sometimes name-only.
  • Implicit-cite (any engine without retrieval). The answer mentions sources by name without linking. Brand mentions in this mode still produce real awareness lift but no clickthrough.
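
The four patterns reduce to a small decision rule. The function below is a sketch of that taxonomy only — the mode names and policy strings are this chapter's labels, not any engine's real API.

```python
# The four citation patterns encoded as a decision rule (illustrative).
def citation_style(retrieval_live, grounded, policy):
    if not retrieval_live:
        return "implicit"          # names sources, no links
    if policy == "always":
        return "inline-urls"       # Perplexity-style, first-class citations
    if policy == "inline-when-grounded":
        return "inline-urls" if grounded else "none"   # AI Overviews-style
    return "reference-list"        # end-of-answer citations

print(citation_style(True, True, "always"))
```

The variable that matters for optimization is the first one: without live retrieval, the best possible outcome is an unlinked brand mention.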

How the five engines differ

Each of the next five chapters covers one engine in depth. The summary differences:

  • ChatGPT: Hybrid retrieval (training + browsing). Strong preference for well-known sources, news, and product documentation. Citation pattern leans implicit unless asked.
  • Gemini: Heavy live retrieval grounded in Google's index. Closest to organic SEO behavior; the engine whose results you can most directly forecast from organic ranking.
  • Claude: Conservative source weighting. Strong preference for documentation, primary sources, and well-cited authorities. Browsing optional and produces explicit citations when used.
  • Perplexity: Always-grounded, always-cited. Live retrieval against the open web with strict citation rules. The engine where new content surfaces fastest if it gets indexed and linked.
  • AI Overviews: Google's own. Inherits Google's index and ranking signals, applies summarization and citation. Closest overlap with organic SEO but with passage-level retrieval characteristics.

The shared optimization stack

Despite the differences, every AI engine rewards the same underlying signals:

  • Indexability + crawlability. If your page isn't in the public web index, no AI engine retrieves it. The technical layer is non-negotiable.
  • Schema and structured data. Engines parse JSON-LD aggressively. Article, FAQPage, HowTo, Product, LocalBusiness, Organization — all read.
  • Topic depth + passage structure. Long-form, deeply-treated topics with explicit Q&A structure get retrieved at higher rates.
  • Entity signals. Brand, author, organization claims confirmed across schema, sameAs, Wikipedia, Wikidata, citation graph.
  • Recency markers. Updated dates in schema, author bylines with credentials, last-modified headers — engines prefer recent and verifiable sources.
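
For the structured-data layer, a minimal FAQPage JSON-LD block looks like the following — the schema.org vocabulary is real, but the question and answer text here are placeholders, not content from any particular site.

```python
import json

# Minimal FAQPage JSON-LD sketch (schema.org vocabulary; placeholder values).
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "Do AI engines retrieve from the live web?",
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "Some do; Perplexity and AI Overviews run live retrieval.",
        },
    }],
}

print(json.dumps(faq, indent=2))
```

Embedded in a `<script type="application/ld+json">` tag, each Question/Answer pair doubles as a pre-chunked, pre-labeled passage for retrieval.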

The engine-specific chapters layer engine-specific tactics on top of this shared stack. The next chapter, ChatGPT optimization, starts with the most-used AI engine in the world.

Common questions


Quick answers to what we get asked before every trial signup.

Do AI engines answer from a live web index, or only from training data?

Some do, some don't. Perplexity and Google AI Overviews run live web retrieval against the public index for every query. ChatGPT and Claude blend a static training cutoff with live retrieval triggered by specific tools (browsing, search, file lookup). Gemini sits between — heavy live retrieval grounded by Google's index, plus a strong training base. The implication: optimizing for AI engines is not just optimizing the training data, it's optimizing the live retrieval surface, which is the public index your URL is in right now.
