AI SEO
Chapter 08 / 08
AI search measurement
How to track whether your AI-SEO program is working — across ChatGPT, Gemini, Claude, Perplexity, and AI Overviews. The metrics, the tools, and the methodology that turns AI search from a faith-based channel into an accountable one.

AI SEO without measurement is faith-based optimization. The work feels productive, the engines are large, the user behavior is real — but without a measurement layer, there's no way to tell whether the work is producing citation lift or just adding to the content pile. This chapter covers the metrics, the tools, and the prompt-set design that turn an AI-SEO program from a faith-based investment into an accountable one, with weekly reporting and optimization tied to outcomes.
“The measurement gap is the single biggest reason AI SEO programs stall. Without a weekly mention-rate report, the optimization work is invisible — every change feels like guesswork because there's no signal to test it against. With the report, every published article and every schema fix can be evaluated within a week. Measurement turns AI SEO into a normal optimization discipline.”
The four metrics
- Mention rate. The percentage of prompts where your brand appears in the response, whether named outright or referenced indirectly. The broadest signal. Tracks awareness presence in the engine's output.
- Citation rate. When the engine produces a citation list (Perplexity always, AIO often, Gemini sometimes), the percentage where your URL appears. Tracks attribution and click-driving presence.
- Share of voice (SOV). Your mention/citation rate as a fraction of total mentions/citations across you and your top competitors on the same prompt set. The competitive view.
- Top-citation rate. Of the responses that cite you, how often your URL lands in the top 1–3 slots (the most-prominent positions in Perplexity, the lead source in AIO). Tracks dominance, not just presence. All four metrics are computed in the sketch below.
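A minimal computation sketch, assuming each prompt-engine response has already been parsed into detected brand names and an ordered citation list; the record shape and helper names are illustrative, not from any specific tool:

```python
from dataclasses import dataclass, field

@dataclass
class ProbeResult:
    # One engine response to one prompt; field names are illustrative.
    prompt: str
    engine: str
    brands_mentioned: set                            # brand names detected in the answer text
    cited_urls: list = field(default_factory=list)   # ordered citation list, empty if none

def four_metrics(results, brand, domain, competitors):
    # Mention rate is computed over all probes; the citation metrics only
    # over responses that actually produced a citation list.
    total = len(results)
    mentioned = sum(brand in r.brands_mentioned for r in results)
    with_cites = [r for r in results if r.cited_urls]
    cited = sum(any(domain in u for u in r.cited_urls) for r in with_cites)
    top3 = sum(any(domain in u for u in r.cited_urls[:3]) for r in with_cites)
    tracked = {brand} | set(competitors)             # brands competing for share of voice
    all_mentions = sum(len(r.brands_mentioned & tracked) for r in results)
    return {
        "mention_rate": mentioned / total,
        "citation_rate": cited / len(with_cites) if with_cites else None,
        "top_citation_rate": top3 / len(with_cites) if with_cites else None,
        "share_of_voice": mentioned / all_mentions if all_mentions else None,
    }
```

The denominator choice matters: citation rate computed over all probes would punish engines that rarely cite, so it's scoped to responses that produced a citation list.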
Designing the prompt set
The prompt set is the foundation of the measurement program. Bad prompts produce noisy data; good prompts produce signal you can act on. The structure:
- Tier 1: Brand prompts (5–10). "What is [brand]?", "Who founded [brand]?", "[brand] vs [competitor]". The accuracy floor — these should always produce correct, branded responses.
- Tier 2: Category head prompts (15–30). "What's the best [category] software?", "How do I choose a [category] platform?", "Top [category] alternatives". The competitive battlefield — where SOV matters most.
- Tier 3: Long-tail intent prompts (30–100). "[category] for a [specific persona] that needs [specific feature]", "How do I [specific task] with a [category] tool". The detail layer — where the engines reformulate queries internally.
- Tier 4: Comparison prompts (10–20). "[brand] vs [competitor 1] vs [competitor 2]". Tracks how often you're included in comparison sets.
Total: 60–160 prompts per category. Run weekly across all five engines = 300–800 prompt-engine combinations per week. Manageable with API automation; impractical to do manually beyond Tier 1.
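Encoded as data, the tiered structure might look like this; brand, category, and competitor values are placeholders:

```python
# Tiered prompt set as templates. Tier sizes in production should follow
# the ranges above; only a few examples per tier are shown here.
BRAND, CATEGORY = "AcmeAnalytics", "product analytics"   # placeholders
COMPETITORS = ["CompetitorOne", "CompetitorTwo"]

PROMPT_SET = {
    "tier1_brand": [
        f"What is {BRAND}?",
        f"Who founded {BRAND}?",
        *[f"{BRAND} vs {c}" for c in COMPETITORS],
    ],
    "tier2_category_head": [
        f"What's the best {CATEGORY} software?",
        f"How do I choose a {CATEGORY} platform?",
        f"Top {CATEGORY} alternatives",
    ],
    "tier3_long_tail": [
        f"{CATEGORY} for a startup that needs session replay",
        f"How do I build retention cohorts with a {CATEGORY} tool?",
    ],
    "tier4_comparison": [
        f"{BRAND} vs {COMPETITORS[0]} vs {COMPETITORS[1]}",
    ],
}

ENGINES = ["chatgpt", "gemini", "claude", "perplexity", "ai_overviews"]
# Weekly workload = prompts x engines, the 300-800 combinations cited above.
weekly_runs = sum(len(p) for p in PROMPT_SET.values()) * len(ENGINES)
```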
The five engines, tracked
Each engine has its own measurement quirks:
- ChatGPT. OpenAI API or manual via chatgpt.com. Track both the standard model (answering from training data alone) and the browse-enabled model — they produce different results.
- Gemini. Google AI Studio API or gemini.google.com. Pair with Search Console for cross-validation of which queries are showing AI responses.
- Claude. Anthropic API or claude.ai. The most expensive of the five for high-volume probing; consider weekly rather than daily.
- Perplexity. Perplexity API or perplexity.ai. The easiest to measure because every response cites; capture both the answer text and the citation list (a probe sketch follows this list).
- AI Overviews. No direct API. Manual scraping or via tools that proxy through Google's interface. The hardest engine to measure programmatically; rely on Search Console patterns and tooling that handles the scraping.
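Because every Perplexity response carries citations, it's the natural first engine to automate. A minimal probe, assuming Perplexity's documented OpenAI-style chat-completions endpoint; verify the model name and response fields against current docs before relying on it:

```python
import os
import requests

def probe_perplexity(prompt: str) -> dict:
    # Endpoint and response shape match Perplexity's documented
    # chat-completions API at the time of writing; treat as an assumption.
    resp = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}"},
        json={
            "model": "sonar",   # check current model names
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    data = resp.json()
    return {
        "answer": data["choices"][0]["message"]["content"],
        "citations": data.get("citations", []),   # ordered list of source URLs
    }
```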
Tools that do this work
- SEOTopSecret AI Mentions. Cross-engine mention + citation tracking, weekly schedules, competitor SOV, prompt history. Built for SEO teams running this as part of a broader growth program.
- Profound. Specialized AI-search analytics platform; deep prompt-by-prompt data and citation tracking.
- Otterly.ai. AI search rank tracking with focus on prompt monitoring and citation attribution.
- Goodie. Lightweight prompt monitoring with weekly reports.
- DIY via API. A custom prompt runner via the OpenAI/Anthropic/Google/Perplexity APIs, with a database to store responses and a dashboard for trend analysis. Right for in-house data teams; expensive in dev time but flexible. A minimal runner is sketched below.
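The core of the DIY route is small: execute the prompt set against each engine on a schedule and store every response for trend analysis. A minimal sketch, reusing per-engine probe functions like the Perplexity one above; the schema is illustrative:

```python
import datetime
import sqlite3

def init_db(path: str = "ai_probes.db") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS probes (
        run_date TEXT, engine TEXT, prompt TEXT,
        answer TEXT, citations TEXT)""")
    return db

def run_weekly(db, prompts, engines):
    # engines: name -> probe function, e.g. {"perplexity": probe_perplexity}
    today = datetime.date.today().isoformat()
    for engine_name, probe in engines.items():
        for prompt in prompts:
            result = probe(prompt)
            db.execute(
                "INSERT INTO probes VALUES (?, ?, ?, ?, ?)",
                (today, engine_name, prompt,
                 result["answer"], "\n".join(result.get("citations", []))),
            )
    db.commit()
```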
Search Console as a partial measurement layer
For Gemini and AI Overviews, Search Console captures impression data even when the click rate is low. Three patterns to look for, with a filtering sketch after the list:
- Long-form conversational queries. Queries phrased as full sentences ("how do I X for Y", "what's the best way to Z") increasingly correlate with AI responses. A page receiving impressions on these is likely being cited.
- CTR drops with stable position. When AIO appears on a query you used to win, your impression count holds but CTR drops. This is the AIO signature.
- New URL appearances on conversational queries. Pages that historically didn't rank in the top 20 suddenly receiving impressions on conversational queries are likely making the AIO citation candidate set from a deeper organic position.
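A sketch of hunting for the first two patterns in weekly Search Console exports, assuming columns normalized to query / impressions / ctr / position (the standard export capitalizes them and formats CTR as a percentage); thresholds are illustrative:

```python
import pandas as pd

prev = pd.read_csv("gsc_prev_week.csv")   # last week's query export
curr = pd.read_csv("gsc_this_week.csv")   # this week's query export

# Pattern 1: long-form conversational queries (6+ words) gaining impressions.
conversational = (
    curr[curr["query"].str.split().str.len() >= 6]
    .nlargest(20, "impressions")
)

# Pattern 2: CTR drop with stable position, the AIO signature.
merged = curr.merge(prev, on="query", suffixes=("_curr", "_prev"))
stable_position = (merged["position_curr"] - merged["position_prev"]).abs() < 1.0
ctr_dropped = merged["ctr_curr"] < merged["ctr_prev"] * 0.7   # 30%+ drop
aio_suspects = merged[stable_position & ctr_dropped]
```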
Reporting cadence
A working AI-SEO measurement program reports at three layers:
- Weekly tactical report. Mention rate per engine, citation rate per engine, top movers (queries where SOV changed materially; a top-movers query is sketched after this list). Drives the next week's optimization priorities.
- Monthly strategic report. SOV trend per engine, competitor benchmark, content-to-citation correlation (which published content earned citation lift).
- Quarterly review. Full prompt-set audit (do the tracked prompts still match user behavior?), tooling cost vs signal value, expansion to new prompt categories.
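The weekly top-movers view can come straight out of the probe database from the runner sketch above. A minimal query ranking prompt-engine pairs by week-over-week change in brand mentions; brand and dates are placeholders:

```python
import sqlite3

db = sqlite3.connect("ai_probes.db")

# LIKE is a crude mention detector; real pipelines normalize brand aliases.
TOP_MOVERS = """
SELECT prompt, engine,
       AVG(CASE WHEN run_date = :this_week
                THEN answer LIKE '%' || :brand || '%' END)
     - AVG(CASE WHEN run_date = :last_week
                THEN answer LIKE '%' || :brand || '%' END) AS delta
FROM probes
WHERE run_date IN (:this_week, :last_week)
GROUP BY prompt, engine
ORDER BY ABS(delta) DESC
LIMIT 10;
"""
movers = db.execute(TOP_MOVERS, {
    "brand": "AcmeAnalytics",      # placeholder
    "this_week": "2026-02-09",     # illustrative run dates
    "last_week": "2026-02-02",
}).fetchall()
```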
Tying optimization to measurement
The point of measurement is to test whether optimization moves work. A working test loop:
- Pick a query where mention rate or citation rate is below target.
- Hypothesize a reason (passage structure, schema, freshness, entity signal).
- Implement the change on the relevant page.
- Wait 1–4 weeks for the engine to re-index and the metrics to update.
- Compare before/after on that specific query (a significance sketch follows this list). If the metric moves, generalize the change to similar pages. If it doesn't, revise the hypothesis.
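Engine responses are stochastic, so a single before/after probe proves little; running the same prompt repeatedly in each window gives enough samples for a crude significance check. A sketch with fabricated illustrative counts:

```python
from statsmodels.stats.proportion import proportions_ztest

# Fabricated example: probe the same prompt 30 times before and 30 times
# after the change, counting responses that mention the brand.
before_mentions, before_runs = 4, 30    # ~13% mention rate pre-change
after_mentions, after_runs = 12, 30     # ~40% mention rate post-change

stat, p_value = proportions_ztest(
    count=[before_mentions, after_mentions],
    nobs=[before_runs, after_runs],
)
if p_value < 0.05:
    print(f"Mention rate moved (p={p_value:.3f}): generalize the change.")
else:
    print(f"No significant movement (p={p_value:.3f}): revise the hypothesis.")
```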
This loop is what turns AI SEO from a content-volume game into a measurement-driven discipline. The teams winning AI-search SOV in 2026 are the teams running this loop weekly.
What the metrics don't tell you
- Conversion attribution. A user who sees your brand in a ChatGPT answer and converts a week later isn't tracked back to the AI engine — they show up as direct traffic or branded search. Attribution to AI surfaces is structurally hard; use lift-style analysis (correlating SOV growth with branded-search and direct-traffic growth, sketched after this list) rather than direct attribution.
- Quality of mention. A neutral mention and a positive mention are both mentions. Sentiment analysis on the surrounding context adds a layer but adds noise too.
- Causal certainty. A correlated improvement after a change isn't proof the change caused it. Multiple changes per week, prompt-set drift, and engine updates all confound the signal. Treat the metrics as probabilistic, not deterministic.
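For the attribution gap, a lift-style check is as close as the data allows: correlate SOV growth against branded-search or direct-traffic growth, ideally with a lag. A sketch on fabricated illustrative series:

```python
import numpy as np

# Fabricated monthly series for illustration only.
sov = np.array([0.12, 0.15, 0.19, 0.22, 0.26, 0.31])               # AI share of voice
branded_clicks = np.array([8200, 8400, 9100, 9800, 10900, 12300])  # branded-search clicks

# Correlate SOV against branded clicks one month later; a strong lagged
# correlation supports, but never proves, that AI visibility drives demand.
r = np.corrcoef(sov[:-1], branded_clicks[1:])[0, 1]
print(f"one-month lagged correlation: {r:.2f}")
```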
Closing the cluster
AI SEO in 2026 is a discipline: a clear model of how AI engines retrieve and rank, engine-specific tactics for ChatGPT, Gemini, Claude, Perplexity, and AI Overviews, citation engineering applied to the content stack, and measurement infrastructure that ties optimization work to outcome. None of these eight chapters is sufficient on its own; together they describe the work that decides whether your brand is the answer or absent across the AI surfaces that increasingly intercept the queries Google used to own.
Pair this cluster with the On-Page SEO cluster for the page-level discipline that AI-SEO depends on, the Off-Page SEO cluster for the entity and authority signals that anchor citations, and the Academy hub for the rest of the disciplines.
Common questions
Quick answers to what we get asked before every trial signup.
How do you track whether AI engines mention your brand?
Three approaches. Manual probing — run a curated set of category prompts in each engine, weekly or biweekly, and log whether your brand appears. Tooling — services like SEOTopSecret's AI Mentions tracker, Goodie, Profound, and Otterly run scheduled prompts and report mention rates over time. API monitoring — set up automated prompt batches via the OpenAI, Anthropic, Google, and Perplexity APIs and analyze responses programmatically. Each method has tradeoffs in cost, coverage, and signal quality.