Content SEO
Chapter 01 / 07
Keyword research
How modern keyword research actually works in 2026 — discovery, intent classification, difficulty scoring, clustering, and the prioritisation framework that decides what to write first.

Keyword research is the bridge between buyer behaviour and content strategy. Done well, it converts the messy reality of how buyers search into a prioritised list of what to publish next, in what order, with what intent satisfied. Done badly, it produces a spreadsheet nobody acts on or — worse — a content calendar full of articles targeting keywords no actual buyer ever types.
This article walks through the modern process: discovery, intent classification, difficulty scoring, clustering, and prioritisation. It opens the Content SEO cluster because every other article in the cluster depends on getting this stage right.
“The output of keyword research isn’t a list of keywords. It’s a content roadmap, ordered by what to publish first, with intent, difficulty, and expected business impact attached to every line.”
Stage 1 — Discovery
The goal of discovery is to surface every query that’s plausibly relevant to your business — including ones the team hasn’t thought of. Five sources to combine:
- Seed keywords — the obvious 10–30 queries your team would type. Brand name, product name, category, biggest features, primary use cases.
- Competitor analysis — through Ahrefs / Semrush / Sistrix, pull the queries each major competitor ranks for. Filter to ones you don’t already rank for.
- Search Console queries — the queries that drove impressions to your existing pages, including ones you didn’t target. Often surfaces unexpected demand.
- AI engine prompts — ask ChatGPT / Claude / Perplexity what questions buyers in your space ask. The conversational queries surface natural-language phrasings classic tools miss.
- Customer language — sales call transcripts, support tickets, review sites, Reddit threads, community forums. The language buyers actually use, not the language internal teams use.
Aim for 500–2,000 candidate queries on the first pass. Volume matters — narrow filters work better when there’s plenty to filter from.
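Merging those five sources mechanically is straightforward. A minimal sketch (the source names and queries are illustrative placeholders, not real data): normalise each query so near-identical phrasings collapse into one candidate, then track which sources surfaced it, since queries confirmed by multiple sources are usually the strongest candidates.

```python
def normalise(query: str) -> str:
    """Lowercase, collapse whitespace, and strip trailing punctuation so
    near-identical phrasings dedupe to a single candidate."""
    return " ".join(query.lower().split()).strip("?.!")

def merge_candidates(sources: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return {normalised query: set of discovery sources it came from}."""
    merged: dict[str, set[str]] = {}
    for source, queries in sources.items():
        for q in queries:
            merged.setdefault(normalise(q), set()).add(source)
    return merged

candidates = merge_candidates({
    "seed":           ["Customer Retention", "customer retention strategies"],
    "competitors":    ["customer retention", "churn rate formula"],
    "search_console": ["what is a good customer retention rate?"],
})
# Queries confirmed by more than one source float to the top of the review pass.
for query, srcs in sorted(candidates.items(), key=lambda kv: -len(kv[1])):
    print(f"{len(srcs)}x  {query}")
```

The same pattern extends to the AI-engine and customer-language sources: each just becomes another key in the input dict.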
Stage 2 — Intent classification
Every query carries one of four intents. The intent decides what content format wins, not the keyword itself.
| Intent | Buyer mindset | Winning content format |
|---|---|---|
| Informational | Researching the problem; not ready to buy yet | Articles, guides, explanations, knowledge base |
| Commercial-investigation | Comparing options; evaluating before purchase | Comparisons (X vs Y), reviews, alternatives, ranked lists |
| Transactional | Ready to buy; needs to find the right thing | Product pages, pricing pages, demo / signup pages |
| Navigational | Looking for a specific brand or page | Brand homepage, login pages, specific feature pages |
How to classify intent
- Look at the SERP. Query the keyword in Google. The intent is whatever Google decided — if the SERP is full of comparison articles, the intent is commercial-investigation; if it’s product pages, transactional; if it’s how-to guides, informational.
- Read the query language. “What is X” / “How does X work” = informational. “X vs Y” / “Best X” / “X review” = commercial-investigation. “Buy X” / “X price” / “X discount” = transactional. Brand names alone = navigational.
- Note SERP features. AI Overviews + featured snippets dominate informational. Shopping carousels + ads dominate transactional. Comparison-style featured snippets dominate commercial-investigation.
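The query-language rules above can serve as a first-pass classifier before the SERP check. A minimal sketch: the regex patterns are illustrative, not exhaustive, and real workflows confirm every assignment against the live SERP.

```python
import re

# Rule order matters: transactional signals ("X price") outrank
# commercial ones ("best X"), which outrank question words.
RULES = [
    ("transactional", r"\b(buy|price|pricing|discount|coupon|cost)\b"),
    ("commercial-investigation", r"\b(best|top|vs|versus|review|alternatives?|compare)\b"),
    ("informational", r"^(what|how|why|when|who)\b"),
]

def classify_intent(query: str, brands: set[str] = frozenset()) -> str:
    q = query.lower().strip()
    if q in brands:                 # a brand name alone = navigational
        return "navigational"
    for intent, pattern in RULES:
        if re.search(pattern, q):
            return intent
    return "informational"          # default bucket; verify on the SERP

print(classify_intent("best crm for consultants"))
print(classify_intent("how does churn work"))
print(classify_intent("hubspot", brands={"hubspot"}))
```

A classifier like this handles the bulk of a 2,000-query list; the ambiguous remainder goes to manual SERP review.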
Stage 3 — Difficulty scoring
Keyword Difficulty (KD) scores from Ahrefs / Semrush / Moz are imperfect proxies but useful as a first filter. They estimate how hard it is to rank in the top 10 for a query, based on the link profiles of currently-ranking pages.
| KD range | Realistic for... | Typical pattern |
|---|---|---|
| 0–20 | New domains, sites starting from zero | Long-tail informational; can rank in 4–12 weeks with decent content |
| 20–40 | Sites with some authority, established blogs | Mid-tail commercial-investigation; ranks in 3–9 months |
| 40–60 | Established domains with topical authority | Head-term informational, mid-volume commercial; 6–18 months |
| 60–80 | Major brands, high-authority sites | Head-term commercial, generic head terms; multi-year effort |
| 80+ | Top 1–3 sites in the niche | Brand monopolies, high-stakes commercial head terms |
KD is a starting filter, not a verdict. Real difficulty depends on:
- Your domain authority and topical authority on the theme.
- The quality of currently-ranking pages: if they’re thin, KD overstates the real difficulty; if they’re comprehensive, KD understates it.
- SERP features taking up space — AI Overviews, featured snippets, ads can shrink the click pie even when you rank.
- How well your content matches the actual intent.
Stage 4 — Clustering
Modern keyword research doesn’t target individual keywords — it targets clusters of related queries that one well-built page can rank for simultaneously. A single page can rank for 50–500 related queries; clustering reveals which keywords belong together.
Two clustering approaches:
- SERP clustering. Two queries are in the same cluster if they share most top-10 results. Tools (Keyword Insights, Surfer, ContentEngine) automate this.
- Semantic clustering. Two queries are in the same cluster if they cover the same topic, regardless of SERP overlap. AI-driven approach using embeddings.
SERP clustering is more conservative: keywords are only grouped when they demonstrably share results. Semantic clustering catches near-duplicate phrasings that SERP clustering misses. The two methods typically agree on around 80% of assignments; senior operators run both and reconcile the differences manually.
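SERP clustering reduces to a simple graph problem. A minimal sketch, assuming you already have top-10 result URLs per query from a rank-tracking tool (the SERP data below is a hand-made stand-in, and the overlap threshold of 4 shared URLs is an illustrative choice, not a standard):

```python
from itertools import combinations

def serp_clusters(serps: dict[str, set[str]], threshold: int = 4) -> list[set[str]]:
    """Union-find over queries: merge two queries into one cluster when
    their top-10 result sets share at least `threshold` URLs."""
    parent = {q: q for q in serps}

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path compression
            q = parent[q]
        return q

    for a, b in combinations(serps, 2):
        if len(serps[a] & serps[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for q in serps:
        clusters.setdefault(find(q), set()).add(q)
    return list(clusters.values())

serps = {
    "customer retention":            {"u1", "u2", "u3", "u4", "u5"},
    "customer retention strategies": {"u1", "u2", "u3", "u4", "u9"},
    "churn rate formula":            {"u6", "u7", "u8", "u9", "u10"},
}
print(serp_clusters(serps))
# The two retention queries merge; the churn query stays its own cluster.
```

Semantic clustering follows the same shape, except the merge condition becomes cosine similarity between query embeddings instead of URL overlap.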
What a cluster looks like
For the cluster around “customer retention”, a single comprehensive article might rank for:
- customer retention (head term, KD 45)
- how to improve customer retention (mid-tail, KD 25)
- customer retention strategies (head term variant, KD 40)
- customer retention metrics (mid-tail, KD 22)
- customer retention rate formula (long-tail, KD 12)
- what is good customer retention rate (long-tail, KD 8)
- ... (typically 30-150 more variations)
The right output isn’t one article per query — it’s one comprehensive article that covers all of them, structured so each sub-topic has a clear section.
Stage 5 — Prioritisation
With clusters built and intent / difficulty scored, prioritise the production order. Three-axis matrix:
| Priority tier | Intent | Difficulty (KD) | Cluster fit |
|---|---|---|---|
| Tier 1 — write first | Commercial-investigation or transactional | Within current site capability (KD 0-30 for new, +20 KD per year) | In existing or top 3 planned clusters |
| Tier 2 — write soon | Informational with strong commercial relevance | Within capability or +5 KD aspirational | In planned clusters that build topical authority for tier 1 |
| Tier 3 — write later | Pure informational, broader awareness | Aspirational difficulty | Adjacent clusters or brand-building |
| Tier 4 — skip or revisit later | Low intent, high difficulty, isolated topic | Beyond capability | Outside cluster strategy |
The compound effect: tier 1 articles drive immediate revenue. Tier 2 articles build the topical authority that lets you eventually rank for tier 3 and the harder queries you can’t hit yet. Skipping tiers 2-3 to chase only tier 1 caps long-term growth.
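The three-axis matrix can be expressed as a scoring function. A simplified sketch: the capability ceiling (KD 30 for a new site, plus 20 per year of authority building) follows the tier-1 row above, and the "strong commercial relevance" test for tier 2 is collapsed here into intent plus cluster fit, so treat the numbers as the article's heuristics, not laws.

```python
def kd_ceiling(site_age_years: float) -> float:
    """Rough ranking capability: KD 30 for a new site, +20 per year."""
    return 30 + 20 * site_age_years

def assign_tier(intent: str, kd: float, cluster_fit: str,
                site_age_years: float = 0.0) -> int:
    ceiling = kd_ceiling(site_age_years)
    in_cluster = cluster_fit in {"existing", "planned"}
    if intent in {"commercial-investigation", "transactional"} \
            and kd <= ceiling and in_cluster:
        return 1   # write first: buying intent, winnable, on-strategy
    if intent == "informational" and kd <= ceiling + 5 and in_cluster:
        return 2   # write soon: builds topical authority for tier 1
    if intent == "informational" and cluster_fit != "outside":
        return 3   # write later: aspirational or adjacent
    return 4       # skip or revisit

print(assign_tier("transactional", 25, "existing"))
print(assign_tier("informational", 32, "planned"))
print(assign_tier("informational", 70, "adjacent"))
print(assign_tier("informational", 70, "outside"))
```

Run over every cluster, sorting by tier then by estimated business impact turns the keyword sheet into the production order.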
The 2026 additions — AI-engine query coverage
Classic keyword research targets Google. AI engine queries (ChatGPT, Claude, Perplexity) follow different patterns:
- Longer, more conversational. “What’s the best CRM for a 5-person consultancy that needs project tracking and integrates with QuickBooks” — a query no Google user would type.
- More multi-faceted. Buyers stack constraints into a single query that classic search would force into 3-4 separate searches.
- Comparison-heavy. AI engines excel at comparisons; buyers ask “X vs Y vs Z” questions classic search returns generic listicles for.
Practical implication: extend keyword research to include conversational long-tail variants buyers might ask AI engines. Surface these by prompting AI engines directly: “What are the top 20 questions someone evaluating CRMs would ask before buying?” Add the answers to the keyword list.
Common keyword research mistakes
- Targeting head terms before topical authority exists. Six months chasing “best CRM” on a brand nobody’s heard of returns nothing. Build long-tail authority first.
- Treating volume as the only metric. A 10K-volume informational query that doesn’t convert is worse than a 200-volume transactional query that does.
- Ignoring intent. Writing a long guide for a query whose SERP is full of product pages — your guide can’t rank against the wrong format.
- One keyword per article. Modern pages rank for clusters. Producing 50 thin articles to chase 50 keywords loses to one comprehensive article that covers all 50.
- Skipping the SERP analysis. The SERP tells you what content format wins, what entities Google associates with the query, what level of comprehensiveness ranks. Decisive context.
- Forgetting the AI engine surface. Optimising only for Google misses 15-25% of B2B research traffic that now starts in ChatGPT / Claude / Perplexity.
- Spreadsheet without a roadmap. A 2,000-line keyword sheet nobody acts on is useless. The output should be a prioritised content roadmap with owners and dates.
The bottom line
Keyword research in 2026 is five stages: discovery (cast wide), intent classification (4 buckets), difficulty scoring (KD as filter), clustering (group related queries), prioritisation (3-axis matrix). The output isn’t a keyword list — it’s a prioritised content roadmap with intent, difficulty, cluster fit, and business impact attached to every line. Skip any stage and the roadmap underperforms; do all five and content production stops being the bottleneck.
Common questions
What is keyword research?
Keyword research is the systematic process of finding what queries your buyers type into search engines and AI engines, classifying them by intent, evaluating their commercial value against ranking difficulty, and using that to decide what content to produce next. The output is not a list of keywords — it’s a content roadmap with intent, difficulty, expected traffic, and conversion potential mapped per topic.