
Content SEO

Chapter 01 / 07

Keyword research

How modern keyword research actually works in 2026 — discovery, intent classification, difficulty scoring, clustering, and the prioritisation framework that decides what to write first.

11 min read · Published May 4, 2026

Keyword research is the bridge between buyer behaviour and content strategy. Done well, it converts the messy reality of how buyers search into a prioritised list of what to publish next, in what order, with what intent satisfied. Done badly, it produces a spreadsheet nobody acts on or — worse — a content calendar full of articles targeting keywords no actual buyer ever types.

This article walks through the modern process: discovery, intent classification, difficulty scoring, clustering, and prioritisation. It opens the Content SEO cluster because every other article in the cluster depends on getting this stage right.

The output of keyword research isn’t a list of keywords. It’s a content roadmap, ordered by what to publish first, with intent, difficulty, and expected business impact attached to every line.

Stage 1 — Discovery

The goal of discovery is to surface every query that’s plausibly relevant to your business — including ones the team hasn’t thought of. Five sources to combine:

  • Seed keywords — the obvious 10–30 queries your team would type. Brand name, product name, category, biggest features, primary use cases.
  • Competitor analysis — through Ahrefs / Semrush / Sistrix, pull the queries each major competitor ranks for. Filter to ones you don’t already rank for.
  • Search Console queries — the queries that drove impressions to your existing pages, including ones you didn’t target. Often surfaces unexpected demand.
  • AI engine prompts — ask ChatGPT / Claude / Perplexity what questions buyers in your space ask. The conversational queries surface natural-language phrasings classic tools miss.
  • Customer language — sales call transcripts, support tickets, review sites, Reddit threads, community forums. The language buyers actually use, not the language internal teams use.

Aim for 500–2,000 candidate queries on the first pass. Volume matters — narrow filters work better when there’s plenty to filter from.
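In practice, the first pass over those five sources is mostly merging and deduplication. A minimal sketch in Python (the source lists and spellings below are illustrative, not real tool output):

```python
# Minimal sketch: merge candidate queries from several discovery sources
# and deduplicate on a normalised form. Source contents are illustrative.

def normalise(query: str) -> str:
    """Lowercase and collapse whitespace to build a dedup key."""
    return " ".join(query.lower().split())

def merge_candidates(*sources: list[str]) -> list[str]:
    seen: dict[str, str] = {}
    for source in sources:
        for query in source:
            # Keep the first spelling seen for each normalised key.
            seen.setdefault(normalise(query), query.strip())
    return list(seen.values())

seeds = ["CRM software", "crm  software", "best CRM"]   # seed keywords
gsc = ["best crm", "crm pricing"]                       # Search Console pull
candidates = merge_candidates(seeds, gsc)
print(candidates)  # ['CRM software', 'best CRM', 'crm pricing']
```

A real pipeline would also strip punctuation and fold close variants (plurals, typos), but a whitespace-and-case key already removes most of the duplication across sources.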

Stage 2 — Intent classification

Every query carries one of four intents. The intent decides what content format wins, not the keyword itself.

Intent: Informational
  Buyer mindset: Researching the problem; not ready to buy yet
  Winning content format: Articles, guides, explanations, knowledge base

Intent: Commercial-investigation
  Buyer mindset: Comparing options; evaluating before purchase
  Winning content format: Comparisons (X vs Y), reviews, alternatives, ranked lists

Intent: Transactional
  Buyer mindset: Ready to buy; needs to find the right thing
  Winning content format: Product pages, pricing pages, demo / signup pages

Intent: Navigational
  Buyer mindset: Looking for a specific brand or page
  Winning content format: Brand homepage, login pages, specific feature pages

How to classify intent

  • Look at the SERP. Query the keyword in Google. The intent is whatever Google decided — if the SERP is full of comparison articles, the intent is commercial-investigation; if it’s product pages, transactional; if it’s how-to guides, informational.
  • Read the query language. “What is X” / “How does X work” = informational. “X vs Y” / “Best X” / “X review” = commercial-investigation. “Buy X” / “X price” / “X discount” = transactional. Brand names alone = navigational.
  • Note SERP features. AI Overviews + featured snippets dominate informational. Shopping carousels + ads dominate transactional. Comparison-style featured snippets dominate commercial-investigation.
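The query-language heuristics above can be approximated with a few regex rules, as a rough first pass only, since the live SERP is the final arbiter. The patterns and example queries below are illustrative, not exhaustive:

```python
import re

# Rough rule-based intent classifier mirroring the query-language
# heuristics in the article. Order matters: transactional signals are
# checked before commercial ones. A real pipeline confirms against the SERP.

RULES = [
    ("transactional", re.compile(r"\b(buy|price|pricing|discounts?|coupons?)\b")),
    ("commercial-investigation", re.compile(r"\b(vs|best|reviews?|alternatives?|top \d+)\b")),
    ("informational", re.compile(r"^(what|how|why|when)\b")),
]

def classify_intent(query: str) -> str:
    q = query.lower().strip()
    for intent, pattern in RULES:
        if pattern.search(q):
            return intent
    # Fallback: queries with no modifier are usually brand / navigational.
    return "navigational"

print(classify_intent("hubspot vs salesforce"))  # commercial-investigation
print(classify_intent("how does a crm work"))    # informational
print(classify_intent("crm pricing"))            # transactional
```

Treat the output as a pre-label to speed up manual review, not a replacement for checking what Google actually decided the intent is.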

Stage 3 — Difficulty scoring

Keyword Difficulty (KD) scores from Ahrefs / Semrush / Moz are imperfect proxies but useful as a first filter. They estimate how hard it is to rank in the top 10 for a query, based on the link profiles of currently-ranking pages.

KD 0–20
  Realistic for: New domains, sites starting from zero
  Typical pattern: Long-tail informational; can rank in 4–12 weeks with decent content

KD 20–40
  Realistic for: Sites with some authority, established blogs
  Typical pattern: Mid-tail commercial-investigation; ranks in 3–9 months

KD 40–60
  Realistic for: Established domains with topical authority
  Typical pattern: Head-term informational, mid-volume commercial; 6–18 months

KD 60–80
  Realistic for: Major brands, high-authority sites
  Typical pattern: Head-term commercial, generic head terms; multi-year effort

KD 80+
  Realistic for: Top 1–3 sites in the niche
  Typical pattern: Brand monopolies, high-stakes commercial head terms

KD is a starting filter, not a verdict. Real difficulty depends on:

  • Your domain authority and topical authority on the theme.
  • The quality of currently-ranking pages — if they’re thin, KD overstates the real difficulty; if they’re comprehensive and well-linked, it understates it.
  • SERP features taking up space — AI Overviews, featured snippets, ads can shrink the click pie even when you rank.
  • How well your content matches the actual intent.
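One way to use KD as that starting filter is a simple capability ceiling. The numbers below are assumptions for illustration (roughly KD 30 for a new domain, plus about 20 KD per year of authority building), not tool outputs:

```python
# Illustrative capability filter: which KD scores are realistic targets
# for a site of a given age. The 30 + 20/year heuristic is an assumption.

def kd_ceiling(site_age_years: float) -> int:
    """Rough ceiling: ~KD 30 for a new site, +20 per year, capped at 80."""
    return min(80, int(30 + 20 * site_age_years))

def within_capability(kd: int, site_age_years: float, stretch: int = 0) -> bool:
    """True if the keyword's KD is inside (or within `stretch` of) the ceiling."""
    return kd <= kd_ceiling(site_age_years) + stretch

print(within_capability(25, site_age_years=0))             # long-tail on a new site
print(within_capability(55, site_age_years=1))             # too hard after one year
print(within_capability(55, site_age_years=1, stretch=5))  # aspirational pick
```

The `stretch` parameter is where judgement re-enters: the factors listed above (page quality, SERP features, intent match) decide whether an aspirational keyword is worth the extra KD.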

Stage 4 — Clustering

Modern keyword research doesn’t target individual keywords — it targets clusters of related queries that one well-built page can rank for simultaneously. A single page can rank for 50–500 related queries; clustering reveals which keywords belong together.

Two clustering approaches:

  • SERP clustering. Two queries are in the same cluster if they share most top-10 results. Tools (Keyword Insights, Surfer, ContentEngine) automate this.
  • Semantic clustering. Two queries are in the same cluster if they cover the same topic, regardless of SERP overlap. AI-driven approach using embeddings.

SERP clustering is more conservative — it only groups keywords confirmed to share results. Semantic clustering catches near-duplicates that SERP clustering misses. The two typically agree on roughly 80% of assignments; senior operators run both and reconcile the differences manually.
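The SERP-clustering rule can be sketched in a few lines, assuming you already have the top-10 result URLs per query. The SERP data and the 40% threshold below are illustrative; production tools compare full ranked SERPs at scale:

```python
# Minimal SERP-clustering sketch: two queries join the same cluster when
# their top-10 result URLs overlap above a threshold. Data is made up.

def serp_overlap(a: set[str], b: set[str]) -> float:
    """Share of the smaller result set that also appears in the other."""
    return len(a & b) / min(len(a), len(b))

def cluster_queries(serps: dict[str, set[str]], threshold: float = 0.4) -> list[list[str]]:
    clusters: list[list[str]] = []
    for query, urls in serps.items():
        for cluster in clusters:
            # Compare against the cluster's seed query only (greedy, single-pass).
            if serp_overlap(urls, serps[cluster[0]]) >= threshold:
                cluster.append(query)
                break
        else:
            clusters.append([query])  # no match: start a new cluster
    return clusters

serps = {
    "customer retention": {"a.com", "b.com", "c.com", "d.com"},
    "customer retention strategies": {"a.com", "b.com", "c.com", "e.com"},
    "churn rate formula": {"x.com", "y.com", "z.com"},
}
print(cluster_queries(serps))
```

A greedy single-pass like this is order-dependent; commercial tools use more robust agglomerative grouping, but the core signal (shared top-10 results) is the same.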

What a cluster looks like

For the cluster around “customer retention”, a single comprehensive article might rank for:

  • customer retention (head term, KD 45)
  • how to improve customer retention (mid-tail, KD 25)
  • customer retention strategies (head term variant, KD 40)
  • customer retention metrics (mid-tail, KD 22)
  • customer retention rate formula (long-tail, KD 12)
  • what is good customer retention rate (long-tail, KD 8)
  • ... (typically 30-150 more variations)

The right output isn’t one article per query — it’s one comprehensive article that covers all of them, structured so each sub-topic has a clear section.

Stage 5 — Prioritisation

With clusters built and intent / difficulty scored, prioritise the production order. Three-axis matrix:

Tier 1 — write first
  Intent: Commercial-investigation or transactional
  Difficulty (KD): Within current site capability (roughly KD 0–30 for a new site, +20 KD per year)
  Cluster fit: In existing or top-3 planned clusters

Tier 2 — write soon
  Intent: Informational with strong commercial relevance
  Difficulty (KD): Within capability, or up to ~5 KD above it as an aspirational pick
  Cluster fit: In planned clusters that build topical authority for tier 1

Tier 3 — write later
  Intent: Pure informational, broader awareness
  Difficulty (KD): Aspirational
  Cluster fit: Adjacent clusters or brand-building

Tier 4 — skip or revisit later
  Intent: Low intent, high difficulty, isolated topic
  Difficulty (KD): Beyond capability
  Cluster fit: Outside the cluster strategy

The compound effect: tier 1 articles drive immediate revenue. Tier 2 articles build the topical authority that lets you eventually rank for tier 3 and the harder queries you can’t hit yet. Skipping tiers 2-3 to chase only tier 1 caps long-term growth.
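The three-axis matrix reduces to a tiering function. The thresholds and cluster-fit labels below are assumptions drawn from the tier descriptions above, not a standard scoring scheme:

```python
# Hedged sketch of the three-axis prioritisation: intent, difficulty,
# cluster fit. Thresholds follow the illustrative 30 + 20/year heuristic.

def assign_tier(intent: str, kd: int, cluster_fit: str, site_age_years: float = 0) -> int:
    ceiling = min(80, int(30 + 20 * site_age_years))
    commercial = intent in ("commercial-investigation", "transactional")
    in_core = cluster_fit in ("existing", "planned-top3")
    if commercial and kd <= ceiling and in_core:
        return 1  # write first: buyer intent, winnable, on-strategy
    if intent == "informational" and kd <= ceiling + 5 and cluster_fit != "outside":
        return 2  # write soon: builds topical authority for tier 1
    if kd <= ceiling + 20 and cluster_fit != "outside":
        return 3  # write later: aspirational or adjacent
    return 4      # skip or revisit later

print(assign_tier("transactional", 20, "existing"))      # 1
print(assign_tier("informational", 33, "planned-top3"))  # 2
print(assign_tier("navigational", 90, "outside"))        # 4
```

Encoding the matrix this way forces the team to make the thresholds explicit, which is most of the value: the exact numbers matter less than everyone agreeing on them before the roadmap is ordered.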

The 2026 additions — AI-engine query coverage

Classic keyword research targets Google. AI engine queries (ChatGPT, Claude, Perplexity) follow different patterns:

  • Longer, more conversational. “What’s the best CRM for a 5-person consultancy that needs project tracking and integrates with QuickBooks” — a query no Google user would type.
  • More multi-faceted. Buyers stack constraints into a single query that classic search would force into 3-4 separate searches.
  • Comparison-heavy. AI engines excel at comparisons; buyers ask “X vs Y vs Z” questions classic search returns generic listicles for.

Practical implication: extend keyword research to include conversational long-tail variants buyers might ask AI engines. Surface these by prompting AI engines directly: “What are the top 20 questions someone evaluating CRMs would ask before buying?” Add the answers to the keyword list.

Common keyword research mistakes

  • Targeting head terms before topical authority exists. Six months chasing “best CRM” on a brand nobody’s heard of returns nothing. Build long-tail authority first.
  • Treating volume as the only metric. A 10K-volume informational query that doesn’t convert is worse than a 200-volume transactional query that does.
  • Ignoring intent. Writing a long guide for a query whose SERP is full of product pages — your guide can’t rank against the wrong format.
  • One keyword per article. Modern pages rank for clusters. Producing 50 thin articles to chase 50 keywords loses to one comprehensive article that covers all 50.
  • Skipping the SERP analysis. The SERP tells you what content format wins, what entities Google associates with the query, what level of comprehensiveness ranks. Decisive context.
  • Forgetting the AI engine surface. Optimising only for Google misses 15-25% of B2B research traffic that now starts in ChatGPT / Claude / Perplexity.
  • Spreadsheet without a roadmap. A 2,000-line keyword sheet nobody acts on is useless. The output should be a prioritised content roadmap with owners and dates.

The bottom line

Keyword research in 2026 is five stages: discovery (cast wide), intent classification (4 buckets), difficulty scoring (KD as filter), clustering (group related queries), prioritisation (3-axis matrix). The output isn’t a keyword list — it’s a prioritised content roadmap with intent, difficulty, cluster fit, and business impact attached to every line. Skip any stage and the roadmap underperforms; do all five and content production stops being the bottleneck.

Common questions


What is keyword research?

Keyword research is the systematic process of finding what queries your buyers type into search engines and AI engines, classifying them by intent, evaluating their commercial value vs ranking difficulty, and using that to decide what content to produce next. The output is not a list of keywords — it's a content roadmap with intent, difficulty, expected traffic, and conversion potential mapped per topic.