Beyond SparkToro: What Actually Predicts LLM Visibility
In January 2026, Rand Fishkin and Patrick O'Donnell ran 2,961 prompts across ChatGPT, Claude, and Google AI Overviews using hundreds of volunteers. They asked the same questions repeatedly and measured how often the same list of brands reappeared. The probability: less than one in a hundred. For ordering specifically, closer to one in a thousand. Fishkin's conclusion was blunt: "any tool that gives a 'ranking position in AI' is full of baloney." He is right about rank. He is also telling only half the story. The same research that demolished rank-based measurement validated something else entirely: frequency of appearance is statistically consistent and measurable. That frequency signal is what the industry should have been building around all along. It maps to a channel converting at 3-11x traditional search rates, growing 155% year over year, with 80% of its traffic invisible to standard analytics.
TL;DR
- SparkToro's January 2026 research proved that AI brand recommendation lists repeat less than 1% of the time, but frequency of brand appearance IS consistent and statistically measurable. This maps to AI Share of Voice: the right metric for LLM visibility.
- Microsoft Clarity, Adobe, and Search Engine Land independently confirm AI-referred traffic converts at 3-11x traditional search rates.
- SparkToro's own Q4 2025 data shows Google desktop searches fell 20% YoY in the US while AI tool usage tripled.
- AirOps found 85% of AI brand discovery comes from third-party content.
- Loamly's 2026 benchmark reveals GA4 misses over 80% of AI traffic. The 0.13-1% AI traffic figures in most analytics dashboards are the visible fraction of a channel that is growing, converting, and largely invisible to standard measurement.

What SparkToro Actually Tested
SparkToro ran 2,961 prompts across ChatGPT, Claude, and Google AI Overviews; the same brand list repeated less than 1 in 100 times.
The study used hundreds of volunteers running identical prompts in November and December 2025. The methodology matters: these were real users on real accounts with personal browsing histories and geographic variation, not sterile API calls. SparkToro measured both the composition of brand lists (which brands appeared) and the ordering (the sequence in which they appeared). Composition repeatability was under 1%. Ordering repeatability was closer to 0.1%.
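The composition-versus-ordering distinction can be made concrete. Below is a minimal sketch (using hypothetical brand lists, not SparkToro's data) that scores both kinds of repeatability across repeated runs of one prompt:

```python
from itertools import combinations

def repeatability(runs: list[list[str]]) -> tuple[float, float]:
    """Return (composition, ordering) repeatability: the fraction of
    run pairs with identical brand sets, and with identical ordered lists."""
    pairs = list(combinations(runs, 2))
    same_set = sum(set(a) == set(b) for a, b in pairs)
    same_order = sum(a == b for a, b in pairs)
    return same_set / len(pairs), same_order / len(pairs)

# Hypothetical runs of one prompt; a real study would have hundreds.
runs = [
    ["BrandA", "BrandB", "BrandC"],
    ["BrandB", "BrandA", "BrandC"],  # same set, different order
    ["BrandA", "BrandD", "BrandC"],  # different set
]
comp, order = repeatability(runs)
```

Ordering repeatability can never exceed composition repeatability, which is why SparkToro's two figures (under 1% and roughly 0.1%) sit in that relationship.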
AirOps' 2026 State of AI Search report reached a convergent finding through different methodology: only 30% of brands stay visible from one AI answer to the next, and just 20% remain present across five consecutive runs of the same prompt. Brands earning both citations and mentions were 40% more likely to resurface across multiple runs than citation-only brands.
The instability is real and measurable. SparkToro framed it as a warning against rank-based tools. The more consequential insight is what remained stable when individual responses did not.
Rank Is Noise. Frequency Is the Signal.
AI brand rank changes with nearly every query, but aggregate frequency of appearance is consistent and statistically measurable across repeated runs.
Fishkin's own data supports this distinction. While the specific list of brands changed between runs, the brands that appeared most frequently were consistent across the full sample. A brand showing up in 70% of runs for a given category held roughly that rate across measurement windows. The variance was in composition and order; the statistical distribution of appearances was stable.
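The stable quantity is per-brand appearance frequency, not list membership. A sketch (hypothetical runs, not Fishkin's sample) showing how frequency holds steady even when no two lists match:

```python
from collections import Counter

def appearance_frequency(runs: list[list[str]]) -> dict[str, float]:
    """Share of runs in which each brand appears at least once."""
    counts = Counter(brand for run in runs for brand in set(run))
    return {brand: n / len(runs) for brand, n in counts.items()}

# Hypothetical: composition varies run to run, frequency stays stable.
runs = [
    ["BrandA", "BrandB", "BrandC"],
    ["BrandA", "BrandC", "BrandD"],
    ["BrandB", "BrandA", "BrandE"],
    ["BrandA", "BrandC", "BrandB"],
]
freq = appearance_frequency(runs)
# BrandA appears in 4/4 runs even though no two lists are identical.
```

Aggregating this frequency over a category's total mentions is what yields a Share of Voice figure.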
This is the same principle behind media measurement in every other channel. Television advertisers do not track their exact position in an ad break. PR professionals do not measure their rank in a news cycle. Both track share of voice: the proportion of total category mentions their brand captures over a sustained period. That aggregate frequency metric smooths the noise that makes any individual instance unreliable. PR's measurement evolution from clipping services to layered evidence frameworks took 15 years. LLM visibility is compressing that timeline.
Sill's own data across 7,442 AI responses, 139 brands, and four platforms confirms this pattern. The median AI Share of Voice is 15 out of 100. The metric is not volatile at the aggregate level; individual responses are. The distinction matters: 55% of brands have a 10+ point SOV spread between their best and worst platform, but those spreads are stable over multi-week measurement windows. What SparkToro proved about instability at the instance level reinforces the case for frequency-based measurement at the aggregate level.
The Conversion Evidence Is No Longer Circumstantial
AI-referred traffic converts at 3-11x the rate of traditional search: sign-up conversion 1.66% vs. 0.15% for search across 1,200+ sites (Microsoft Clarity, 2025).
Three independent studies published between November 2025 and February 2026 converge on the same finding from different methodologies, sample sizes, and verticals. Microsoft Clarity analyzed over 1,200 publisher sites and found LLM traffic grew 155.6% over eight months, with sign-up conversion at 1.66% versus 0.15% for search traffic: an 11x difference. Subscription conversion followed the same pattern at 1.34% for LLM versus 0.55% for search.
Adobe's analysis of one trillion retail visits during the 2025 holiday season measured AI referral traffic surging 693% year over year. AI-referred shoppers converted 31% higher than other traffic sources, with revenue per visit up 254% and bounce rates 33% lower. These are not users casually browsing; they arrive with intent shaped by the AI recommendation.
| Source | AI Conversion | Comparison Baseline | Advantage |
|---|---|---|---|
| Microsoft Clarity (sign-up) | 1.66% | 0.15% (search) | 11x |
| Microsoft Clarity (subscription) | 1.34% | 0.55% (search) | 2.4x |
| Search Engine Land (13 months) | ~18% | Paid shopping, SEO, PPC | Highest of any source |
| Adobe (holiday retail) | +31% vs. other traffic | All other referrals | +254% revenue per visit |
Search Engine Land's 13-month analysis from January 2025 through February 2026 measured LLM referral conversion at approximately 18%: higher than paid shopping, SEO, or PPC. Per-platform conversion data shows the advantage is not uniform: Claude leads at 16.8%, ChatGPT at 14.2%, Perplexity at 12.4%. The platform variation is another reason single-platform LLM visibility measurement produces structurally misleading conclusions.
80% of AI Traffic Is Invisible in GA4
GA4 misses over 80% of AI-referred traffic because mobile apps strip referrer headers and users copy-paste URLs into browsers (Loamly, 2026).
Loamly's 2026 State of AI Traffic benchmark quantified what SparkToro's dark traffic research had theorized: standard analytics tools miss the vast majority of AI-referred visits. The mechanism is straightforward. Mobile AI apps do not pass referrer headers. Users who receive a recommendation in ChatGPT or Claude frequently copy the URL and paste it into a browser, stripping the referral chain entirely. The visit registers as direct traffic in GA4.
The 0.13-1% AI traffic figures that appear in most analytics dashboards represent only the visits with intact attribution. The actual volume is five to ten times higher. Previsible's State of AI Discovery report, analyzing 1.96 million LLM sessions, found AI traffic concentrates on high-intent pages: industry pages at 1.14% visible share, tool pages at 0.95%, pricing pages at 0.46%. These are not casual browsing sessions; they are visitors arriving with a specific recommendation and comparing options.
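The gross-up arithmetic is simple enough to sketch. This is a back-of-envelope estimate, not Loamly's methodology; the 20% default visible fraction is an assumption derived from the "misses over 80%" figure:

```python
def estimate_total_ai_sessions(visible_sessions: int,
                               visible_fraction: float = 0.20) -> int:
    """Gross up GA4-visible AI referrals by the assumed visible fraction.

    visible_fraction is an assumption: the Loamly benchmark implies GA4
    captures under 20% of AI-referred visits, so 0.20 is an upper bound.
    """
    return round(visible_sessions / visible_fraction)

# If GA4 shows 500 AI-referral sessions and only ~20% of AI traffic
# carries an intact referrer, true volume is closer to 2,500.
estimated = estimate_total_ai_sessions(500)
```

At the 10% end of the visible range, the same 500 visible sessions imply roughly 5,000 actual AI-referred visits.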
The dark traffic problem compounds every other measurement challenge. A marketing team reviewing GA4 sees AI referral traffic as a rounding error. The same team's branded search trend, product demo requests, and direct traffic spikes may all contain AI-influenced visits that are structurally invisible to standard attribution. This is why the attribution gap is a measurement infrastructure problem, not a channel performance problem. The channel is performing; the analytics stack cannot see it.
The Volume Shift SparkToro's Own Data Confirms
Google desktop searches per US user fell 20% year over year in Q4 2025 while AI tool usage nearly tripled (SparkToro/Datos, Q4 2025).
SparkToro's partnership with Datos produced the Q4 2025 State of Search report, and the findings sharpen the urgency beyond what the brand recommendation study alone conveyed. Google desktop searches per US user declined nearly 20% year over year. AI tools nearly tripled their usage share over the same period. Google still accounts for 73.7% of desktop searches across the 41 domains analyzed, but the trajectory is directional and accelerating.
Pew Research confirms the behavioral shift from the consumer side: 39% of US adults now use ChatGPT weekly for decision-making tasks. Similarweb's global tracker shows Gemini growing 643% year over year while ChatGPT has reached 883 million monthly users generating two billion queries daily. Conductor's November 2025 benchmarks found AI Overviews now appear in 25.11% of Google searches, up from 13.14% in March 2025; 87.4% of identifiable AI referral traffic comes from ChatGPT alone.
| Metric | Finding | Source |
|---|---|---|
| Google desktop searches (US) | -20% YoY per user | SparkToro/Datos Q4 2025 |
| AI tool usage share | Nearly tripled YoY | SparkToro/Datos Q4 2025 |
| Weekly ChatGPT usage (US adults) | 39% | Pew Research, March 2026 |
| Gemini traffic growth | +643% YoY | Similarweb, Feb 2026 |
| AI Overviews in Google searches | 25.11% (up from 13.14%) | Conductor, Nov 2025 |
| ChatGPT monthly users | 883M | Similarweb, Feb 2026 |
The volume is there. The growth is there. The conversion advantage is documented across independent studies. The measurement infrastructure is what has not kept pace. Fishkin himself noted on the Near Media podcast in February 2026 that while AI responses are inconsistent at the individual level, aggregate brand presence can be measured statistically. The qualifier matters: it is the difference between a broken metric and a misapplied one.
85% of Discovery Happens on Content You Do Not Control
85% of AI brand discovery comes from third-party content: listicles, comparison guides, community discussions, and niche publisher reviews (AirOps, 2026).
AirOps' 2026 report found that the overwhelming majority of AI brand visibility originates from content the brand did not create. Listicles, comparison guides, community discussions on Reddit and Quora, and niche publisher reviews drive 85% of brand discovery in AI search. This finding reframes LLM visibility strategy: optimizing your own site is necessary but nowhere near sufficient.
BrightEdge's citation analysis reinforces this from a different angle: only 17% of sources cited in AI Overviews also rank in the organic top ten. Five out of six AI citations pull from pages not on the first page of traditional search results. SE Ranking's study of 2.3 million pages found that domains with Quora and Reddit mentions have approximately four times higher chances of being cited by AI platforms. Citation volume can differ by 615x between platforms for the same domain, with Grok and Claude at opposite ends of the range.
AirOps also found that pages not updated quarterly are three times more likely to lose their AI citations, adding a temporal dimension that static SEO content does not face. Sequential headings and rich schema correlate with 2.8x higher citation rates. The brands with the highest LLM visibility have not just optimized their own properties; they have built a presence across the third-party ecosystem that AI models draw from. That ecosystem is dynamic, platform-specific, and requires continuous measurement to track.
What LLM Visibility Measurement Should Look Like Now
Effective LLM visibility measurement requires frequency-based SOV across all major platforms, branded search correlation, and content change attribution with within-brand controls.
SparkToro established what not to measure: point-in-time rank on any single platform. The research published in the three months since points toward what to measure instead. Frequency-based Share of Voice, tracked daily across every major AI platform, captures the stable signal within the noise that Fishkin's own data validated. Branded search trend in Google Search Console provides a downstream proxy for AI influence on discovery behavior. Content change attribution, using a within-brand control group of queries unaffected by the change, isolates whether specific interventions moved the needle or whether the shift was platform noise.
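The within-brand control logic is a difference-in-differences calculation. A minimal sketch with hypothetical SOV numbers (the function name and figures are illustrative, not a Sill API):

```python
def lift_vs_control(treated_before: float, treated_after: float,
                    control_before: float, control_after: float) -> float:
    """Difference-in-differences: the SOV change on queries touched by
    the content change, minus the change on untouched control queries.
    The control group absorbs platform-wide noise."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical SOV points: treated queries rose 22 -> 31 after a content
# update, while control queries drifted 18 -> 20 from platform noise.
lift = lift_vs_control(22, 31, 18, 20)  # 9 - 2 = 7 points attributable
```

Without the control subtraction, the full nine-point move would be credited to the intervention, even though two points of it appeared on queries the change never touched.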
This is the same multi-layered evidence structure that PR, television, and podcast advertising have used for decades. None of those channels have perfect attribution. All of them justify billions in annual spend through frameworks that are rigorous without being complete. The four-metric framework for AI search ROI operationalizes this approach for marketing teams facing Q2 budget reviews.
| Metric | What It Captures | Minimum Window |
|---|---|---|
| AI Share of Voice (frequency) | Stable brand presence across platforms | 4 weeks baseline |
| Branded search trend (GSC) | Downstream discovery influence | 8-12 weeks |
| GA4 AI referral conversion | Visible portion of AI traffic quality | Ongoing (visible fraction only) |
| Content change attribution | Causal intervention effect | 4+ weeks post-change |
The brands that assemble this evidence infrastructure first will have the measurement foundation their competitors are still deferring. Forrester projects 25% of planned AI search spend pushed to 2027 for lack of proven ROI. That deferral is not a channel judgment; it is a measurement gap. The data to close it exists today. The question is whether the tools and frameworks are in place to collect it.
Measure What SparkToro Proved Matters: Frequency, Not Rank
Sill tracks AI Share of Voice daily across ChatGPT, Gemini, Perplexity, Claude, Grok, and Google AI Overviews. Frequency-based measurement across all six platforms, starting at $0/month.
References
- Fishkin, R. and O'Donnell, P. "New Research: AIs Are Highly Inconsistent When Recommending Brands or Products." SparkToro Blog, January 2026. sparktoro.com
- AirOps. "The 2026 State of AI Search." AirOps Report, 2026. airops.com
- Microsoft Clarity. "AI Traffic Converts at 3x the Rate of Other Channels." Microsoft Clarity Blog, November 2025. clarity.microsoft.com
- Adobe. "AI-Driven Traffic Surges Across Industries." Adobe Business Blog, January 2026. business.adobe.com
- Search Engine Land. "What 13 Months of Data Reveals About LLM Traffic Growth and Conversions." Search Engine Land, February 2026. searchengineland.com
- Loamly. "State of AI Traffic 2026 Benchmark Report." Loamly Blog, 2026. loamly.ai
- SparkToro and Datos. "State of Search Q4 2025." Datos Report, Q4 2025. datos.live
- Pew Research Center. "Key Findings About How Americans View Artificial Intelligence." Pew Research, March 2026. pewresearch.org
- Similarweb. "Gen AI Stats 2026." Similarweb Marketing Blog, February 2026. similarweb.com
- Conductor. "AEO/GEO Benchmarks Report." Conductor Academy, November 2025. conductor.com
- BrightEdge. "AI Search Visits Surging 2025." BrightEdge Research, September 2025. brightedge.com
- SE Ranking. "ChatGPT vs Perplexity vs Google vs Bing: Comparison Research." SE Ranking Blog, 2025. seranking.com
- Previsible. "AI SEO Study 2025: State of AI Discovery." Previsible, 2025. previsible.io
- Near Media. "EP 244: AI Visibility vs Google Rankings." Near Media Podcast, February 2026. nearmedia.co
- Am I Cited. "Conversion Rate of AI Traffic." Am I Cited FAQ, 2026. amicited.com
Get Your Report
Request your first analysis today to see where you stand.
