Intro
In 2025, generative search finally crossed a threshold. It stopped being an experiment and became the primary way hundreds of millions of people interact with information.
To understand how this shift changes discovery, we conducted one of the largest independent GEO research efforts to date:
10,000 generative answers analyzed across 7 major engines, over 4 months, spanning 5 query categories and 100+ brands.
This article summarizes the most important insights — what generative engines do, how they choose sources, what patterns emerged, which brands win or lose, and what this means for the future of optimization.
This is the definitive “state of generative answers” report for 2025.
Part 1: The Project Overview — What We Tested
Across 10,000 generative answers, we tracked:
- inclusion frequency
- citation patterns
- reasoning behavior
- hallucination types
- fact drift over time
- generative bias
- multi-modal influence
- answer structures
- entity classification
- category-level dominance
Queries came from 5 groups:
1. Informational
Definitions, how-tos, explanations, facts.
2. Transactional
Comparisons, product choices, service providers.
3. Brand-Level
“What is X?”, “Who owns X?”, “X vs Y.”
4. Multi-Modal
Images, screenshots, charts, videos.
5. Agentic
Multi-step workflows, research instructions, tool-use queries.
Engines included:
- Google SGE
- Bing Copilot
- ChatGPT Search
- Perplexity
- Claude Search
- Brave Summaries
- You.com
This dataset is the clearest snapshot yet of how AI answers are being constructed in the wild.
Part 2: The 10 Most Important Findings (Summary)
Here are the top takeaways before we dive deep:
1. Generative answers are written using very few sources — typically 3–10.
2. Entity clarity was the strongest predictor of inclusion.
3. Original data was cited far more often than any other content.
4. Outdated pages were excluded almost universally.
5. Canonical definitions shaped how brands were described.
6. Multi-modal assets influenced which brands were selected.
7. Hallucinations decreased, but misclassification increased.
8. Cross-web consistency strongly influenced trust scoring.
9. Agents modified answers based on multi-step reasoning.
10. SERP-first SEO factors barely predicted generative visibility.
Let’s break down the details.
Part 3: Finding #1 — Models Use Far Fewer Sources Than Expected
Despite retrieving dozens or hundreds of pages:
Generative answers are typically built from 3–10 selected sources.
This is consistent across:
- short answers
- long explanations
- comparisons
- multi-step reasoning
- agentic workflows
If you aren’t one of the 3–10 sources that survive filtering, you are invisible.
This is the biggest shift from the SERP era:
Visibility ≠ ranking. Visibility = inclusion.
Part 4: Finding #2 — Entity Clarity Was the Strongest Predictor of Visibility
The brands with the best visibility across engines shared one universal trait:
AI could answer “What is this?” with perfect confidence.
We observed three levels of entity clarity:
Level 1 — Crystal clear: consistent, unambiguous, canonical. These brands dominated generative visibility.
Level 2 — Partially clear: some inconsistencies. These brands appeared occasionally.
Level 3 — Ambiguous: conflicting descriptions. These brands were almost completely excluded.
Entity clarity beats:
- backlinks
- domain rating
- content length
- keyword density
- domain age
It is the #1 GEO factor across our entire dataset.
Part 5: Finding #3 — Original Data Outperformed All Other Content Types
Generative engines overwhelmingly favored:
- proprietary studies
- statistics
- benchmarks
- whitepapers
- research reports
- survey findings
Any content that existed nowhere else.
Brands with original data had:
- 3–4× higher inclusion rates
- 5× more stable citations
- near-zero hallucination risk
The engines want first-source evidence, not rewritten SEO content.
Part 6: Finding #4 — Recency Was More Important Than Authority
This was surprising even to us:
Engines consistently downranked outdated pages even if they came from high-authority domains.
Recency mattered enormously.
A page updated in the last 90 days outperformed:
- competitors with higher DR
- longer content
- pages with more backlinks
- older evergreen guides
Models treat recency as a proxy for credibility.
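The 90-day pattern above can be monitored with a short script. This is a minimal sketch assuming you export your sitemap's `<lastmod>` dates into a dict of URL → ISO date string; the 90-day cutoff mirrors the finding in our dataset, not any engine-documented rule, and the URLs are illustrative.

```python
# Flag pages whose last update falls outside the freshness window
# observed in the dataset. Threshold and URLs are illustrative.
from datetime import date, timedelta

STALE_AFTER_DAYS = 90  # assumption drawn from the finding above

def stale_pages(lastmod_by_url, today=None):
    """Return URLs whose last update is older than the threshold."""
    today = today or date.today()
    cutoff = today - timedelta(days=STALE_AFTER_DAYS)
    return sorted(
        url for url, iso in lastmod_by_url.items()
        if date.fromisoformat(iso) < cutoff
    )

pages = {
    "https://example.com/pricing": "2025-01-10",
    "https://example.com/what-is-x": "2025-05-02",
}
print(stale_pages(pages, today=date(2025, 5, 15)))
```

Running this against a real sitemap gives you a refresh queue ordered by risk of exclusion.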
Part 7: Finding #5 — Canonical Definitions Shape How AI Describes You
We observed a direct relationship between:
- the format of a brand's canonical page
- the wording used in generative summaries
Simple, structured definitions reliably showed up in answers verbatim.
This means:
You can shape how the generative web describes you by shaping your canonical definitions.
This is the new “snippet optimization.”
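One practical way to anchor a canonical definition is to emit the exact same sentence in your page markup as schema.org JSON-LD. This is a minimal sketch: the Organization type and `sameAs` field are real schema.org vocabulary, but the brand, wording, and URLs are hypothetical.

```python
# Emit one canonical definition as schema.org JSON-LD so the same
# sentence lives in both your prose and your structured data.
import json

def organization_jsonld(name, definition, url, same_as):
    """Build a schema.org Organization block from one canonical definition."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "description": definition,   # the exact sentence AI should reuse
        "url": url,
        "sameAs": same_as,           # cross-web profiles, kept consistent
    }, indent=2)

print(organization_jsonld(
    name="ExampleApp",
    definition="ExampleApp is a rank-tracking suite for SEO teams.",
    url="https://example.com",
    same_as=["https://www.linkedin.com/company/exampleapp"],
))
```

Keeping `description` byte-identical to the definition on your canonical page is the point: verbatim reuse in answers starts with verbatim consistency in your own markup.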
Part 8: Finding #6 — Multi-Modal Assets Played an Unexpected Role
Generative engines increasingly used:
- screenshots
- UI examples
- product images
- diagrams
- videos
as supporting evidence.
Brands with:
- consistent design
- well-lit images
- annotated visuals
- video demos
appeared more often and were described more accurately.
Visual clarity = generative clarity.
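A first step toward that alignment is making sure every visual carries descriptive text an engine can read. This is a minimal standard-library sketch that flags `<img>` tags with empty or missing alt text; the HTML snippet is illustrative.

```python
# Audit HTML for images without descriptive alt text, using only
# the standard library. Example markup is illustrative.
from html.parser import HTMLParser

class ImgAltAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.missing = []  # srcs of images without useful alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        if not a.get("alt", "").strip():
            self.missing.append(a.get("src", "?"))

html = '<img src="ui.png" alt="Dashboard overview"><img src="logo.png" alt="">'
audit = ImgAltAudit()
audit.feed(html)
print(audit.missing)  # images to annotate before engines read them
```

The same pass can be extended to check for stale screenshot filenames or missing captions, which the misclassification findings below also implicate.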
Part 9: Finding #7 — Hallucinations Are Down, But Misclassification Is Up
Hallucinations dropped significantly across engines.
But a new problem emerged:
Misclassification — AI placing brands in the wrong category.
Examples:
- calling a SaaS platform a “tool” instead of a “suite”
- misidentifying product tiers
- mixing up competitors
- merging two brands’ features
- confusing the parent company with the product
These errors almost always traced back to:
- weak canonical data
- inconsistent product naming
- outdated support pages
Brands that updated definitions monthly had significantly lower misclassification rates.
Part 10: Finding #8 — Cross-Web Consistency Weighed Heavily in Selection
Engines checked:
- LinkedIn
- Wikipedia
- Wikidata
- Crunchbase
- G2
- GitHub
- social profiles
- schema
- third-party reviews
against each other.
If facts matched → trust increased. If facts conflicted → exclusion happened.
Cross-web consistency was a top-five inclusion factor.
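The cross-checking behavior described above can be approximated internally before an engine does it for you. This is a minimal sketch: collect the same facts from each of your profiles and flag any field where sources disagree. The source names and fact values are illustrative.

```python
# Flag fact fields that differ across a brand's web profiles.
# Profile data below is illustrative, not real.
def conflicting_fields(profiles):
    """Map each fact field to its distinct values when sources disagree."""
    fields = {}
    for source, facts in profiles.items():
        for field, value in facts.items():
            fields.setdefault(field, set()).add(value)
    return {f: vals for f, vals in fields.items() if len(vals) > 1}

profiles = {
    "linkedin":   {"founded": "2014", "category": "SEO suite"},
    "crunchbase": {"founded": "2014", "category": "SEO tool"},
    "wikidata":   {"founded": "2014", "category": "SEO suite"},
}
print(conflicting_fields(profiles))  # fields to reconcile before exclusion
```

Every field this returns is a contradiction an engine may also find, and per the finding above, contradictions tend toward exclusion.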
Part 11: Finding #9 — Agentic Reasoning Boosted Some Brands and Hurt Others
Agentic queries are multi-step instructions:
“Research X, compare providers, summarize options, and recommend the best one.”
We observed:
Brands with strong structured comparisons were chosen more often.
Engines wanted:
- pros & cons
- transparent pricing
- clear positioning
- use-case lists
- feature breakdowns
Brands that hid weaknesses or obscured features lost inclusion.
Part 12: Finding #10 — SEO Strength Did Not Predict Generative Visibility
This is the clearest finding of all:
High-ranking SEO brands often performed poorly in generative answers.
Why?
Because generative visibility depends on:
- clarity
- consistency
- authority
- recency
- originality
- trustworthiness
- structured data
—not on keyword rankings.
We saw brands with:
- DR 20 sites outperform DR 80 sites
- 100-page sites outperform 10,000-page sites
- focused domains outperform broad ones
Generative engines reward coherence, not volume.
Part 13: Secondary Findings Worth Noting
Beyond the top 10 insights, we found several additional patterns:
1. Engines penalize ambiguous product ecosystems
If you have too many overlapping products, clarity collapses.
2. Long paragraphs performed poorly
Structured content was consistently preferred.
3. Models reward “definition-first” content
Start with the answer → then expand.
4. Models dislike outdated screenshots
Old UI confused multi-modal recognition.
5. Engines prefer distinct brands over brand families
Parent/child relationships often got blurred or merged.
6. Engines heavily downranked affiliate sites
Lack of originality = exclusion.
7. Domain authority only mattered for trust, not inclusion
It was one signal, not the determining one.
Part 14: Industry-Level Insights From 10,000 Answers
Strongest generative visibility
- SaaS
- finance
- health information
- cybersecurity
- analytics
- developer tools
These industries had clear definitions and structured documentation.
Weakest generative visibility
- hospitality
- travel
- home services
- creative agencies
- local service providers
These industries suffered from vagueness and inconsistent naming.
Part 15: What Brands Can Do With These Insights (Action-Oriented Summary)
1. Strengthen your canonical definitions
This shapes how AI describes you.
2. Publish original research
This multiplies generative visibility.
3. Maintain strict cross-web consistency
This boosts trust and inclusion.
4. Update core pages monthly
Recency is not optional.
5. Create comparison-friendly content
Agents love structured breakdowns.
6. Maintain multi-modal alignment
Your images, screenshots, and UI matter now.
7. Eliminate contradictions
AI punishes ambiguity more than search engines do.
8. Prioritize entity clarity above all
This is the foundation of GEO.
Conclusion: Generative Answers Reveal a New Information Economy
The data across 10,000 generative answers confirms one thing:
We are entering an answer economy — not a link economy.
Visibility no longer depends on:
- rankings
- backlinks
- keyword volume
- SERP surfaces
It depends on:
- clarity
- facts
- structure
- recency
- originality
- entity coherence
- multi-modal understanding
- consistent cross-web identities
Generative engines don’t reward the biggest sites. They reward the clearest, most trustworthy, and most structured.
What we learned from 10,000 generative answers in 2025 is simple:
If you want visibility in the age of AI, you must optimize for how AI thinks, not how humans used to click.

