Six weeks ago we said Domain Authority was the strongest predictor of AI visibility. Today we have proof a signal beats it.
In the largest cross-market off-page GEO study published anywhere — 2,729 businesses, 14 industries, Los Angeles + Sydney + New York + Chicago + national, 278,000+ individual mention checks — directory presence (r=0.391) outranks Domain Authority (r=0.338).
That's the headline. But it's not the contrarian finding. The contrarian finding is what happened when we stress-tested the most-repeated piece of GEO advice on the internet: “You need to be on Reddit.”
Reddit's raw correlation with AI visibility is real — r=0.333. Strong, statistically significant, the kind of number every GEO consultant is quoting at a conference right now.
Then we added controls. Domain Authority. Google reviews. Then every other off-page signal we measured.
By the time we'd controlled for whether the brand also has Wikipedia, LinkedIn, BBB, Yelp, Crunchbase, Trustpilot, Google Business, YouTube, Quora and the rest, Reddit's independent effect collapsed to zero. r = 0.000. n = 2,545.
What Study 1 left unanswered
Six weeks ago we published the AI Visibility Index — 1,004 businesses, 10 industries, 5 AI models, 95,392 mention checks. Headline: 65.9% of businesses are completely invisible to AI. Top two predictors: Domain Authority (r=0.337) and Google review count (r=0.333).
That study answered the question “who's visible.” It did not answer the question we actually needed for our clients: what do we do about it?
DA and review count are lagging indicators. They take 12–18 months to move. If a business is invisible today and you tell them “build authority,” you're telling them to come back in a year and a half. That's not advice. That's an obituary.
So we asked the harder question:
What's inside Domain Authority? Can we decompose it into measurable, actionable, off-page signals — and prove that those signals predict AI visibility independently of DA itself?
The expansion
| Dimension | Study 1 | Study 2 |
|---|---|---|
| Businesses | 1,004 | 2,729 |
| Industries | 10 | 14 |
| Markets | 1 (LA + national) | 4 — LA + Sydney + NYC + Chicago + national |
| AI models tested | 5 | 5 |
| Prompts per slot | 20 | 20 |
| Total mention checks | 95,392 | 278,000+ |
| Industry-city slots | 10 | 32 |
For each of the 2,729 businesses we collected the Study 1 layer (Moz DA, Google reviews, schema, robots, citability, blogs, llms.txt, FAQ content) plus two new layers built specifically for this study:
- Layer 2a: SpyFu enrichment. Monthly organic clicks, organic keyword count, domain strength, 12-month organic growth rate.
- Layer 2b: Off-page presence. Twelve canonical platforms parsed via Apify SERP queries (`site:platform "brand"`): Reddit, Quora, Wikipedia, LinkedIn, Crunchbase, GBP, BBB, Yelp, Trustpilot, G2, Capterra, YouTube. Plus 12-month press mentions via editorial-domain classifier.
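As a rough illustration of the `site:platform "brand"` pattern, the sketch below builds one query string per platform. The domain map and the `build_queries` helper are assumptions made for this example; the study does not publish its exact query strings, and the Google Business Profile entry in particular is a guess.

```python
# Hypothetical platform -> domain map; the study's exact domains are not published.
PLATFORM_DOMAINS = {
    "reddit": "reddit.com",
    "quora": "quora.com",
    "wikipedia": "wikipedia.org",
    "linkedin": "linkedin.com",
    "crunchbase": "crunchbase.com",
    "gbp": "google.com/maps",  # assumption: how GBP was matched is not stated
    "bbb": "bbb.org",
    "yelp": "yelp.com",
    "trustpilot": "trustpilot.com",
    "g2": "g2.com",
    "capterra": "capterra.com",
    "youtube": "youtube.com",
}


def build_queries(brand: str) -> dict[str, str]:
    """Return one `site:domain "brand"` query per platform."""
    cleaned = brand.replace('"', "").strip()  # avoid breaking the quoted phrase
    return {
        name: f'site:{domain} "{cleaned}"'
        for name, domain in PLATFORM_DOMAINS.items()
    }


queries = build_queries("Acme Plumbing")  # "Acme Plumbing" is a made-up brand
print(queries["yelp"])
```

Each query would then be sent to a SERP scraper and the results parsed for URLs on the matching platform.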
Then we did one thing nobody else has done: we reverse-classified every URL every AI model cited in Study 1's 95,000+ responses into 16 source-type categories. So when we ask “where does Perplexity actually pull answers from?” we have receipts.
Off-page signals dethrone Domain Authority
Raw Pearson correlations vs. visibility_score across 2,648 businesses. Four off-page signals now sit ahead of Domain Authority. Six are ahead of Google review count.
Finding 1 — Off-page signals just dethroned DA
In Study 1, DA and Google review count occupied the top two slots. In Study 2 they're rank 5 and rank 14. Four off-page signals are now ahead of DA. Six are ahead of Google review count.
The strongest is directory_count — a simple sum of how many of twelve canonical platforms a business shows up on. It outranks every authority metric Moz, Ahrefs, or SpyFu sells.
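To make the definition concrete, here is a minimal sketch of `directory_count` and a raw Pearson correlation against a visibility score. The presence flags, the five toy businesses and their scores are invented for illustration; only the "count the platforms, then correlate" logic comes from the text above.

```python
from math import sqrt


def directory_count(presence: dict[str, bool]) -> int:
    """Count the platforms a business was found on."""
    return sum(1 for found in presence.values() if found)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Toy inputs (illustrative numbers, not study data).
example = {"yelp": True, "bbb": True, "wikipedia": False, "linkedin": True}
counts = [1, 3, 5, 8, 10]        # directory_count per business
visibility = [0, 5, 10, 20, 30]  # hypothetical visibility_score per business

print(directory_count(example))
print(round(pearson(counts, visibility), 3))
```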
Finding 2 — “You need to be on Reddit” doesn't survive scrutiny
This is the contrarian centerpiece. Read it carefully.
The single most-repeated piece of GEO advice in 2026 is some version of “AI models pull from Reddit, so you need to be on Reddit.” It's the marketing pitch behind dozens of new GEO tools. Half the SEO conferences had a panel about it. We tested it directly.
Reddit's predictive power collapses under controls
Reddit's raw correlation with AI visibility is real (r=0.333). Add controls and it falls off a cliff. By the time we control for whether the same brand also shows up on Wikipedia, LinkedIn, BBB, Yelp, Trustpilot, GBP, YouTube, Crunchbase, and Quora — Reddit's independent contribution is statistically zero.
Read the chart left to right. Reddit's predictive power doesn't degrade gracefully — it falls off a cliff.
The raw correlation of +0.333 looks like a Reddit effect. It is not. Once you control for whether the same brand also shows up on Wikipedia, LinkedIn, BBB, Yelp, Trustpilot, GBP, YouTube, Crunchbase, and Quora, Reddit's independent contribution to AI visibility is statistically zero. r = 0.000 across 2,545 businesses.
The honest read: Reddit mention count is a proxy for general multi-platform brand visibility. Brands that get mentioned on Reddit also tend to get mentioned everywhere else. The Reddit number was measuring the everywhere-else effect the whole time.
The same decomposition applied to directory_count tells the same story:
| Control level | directory_count r | n |
|---|---|---|
| Raw (no controls) | +0.391 | 2,648 |
| Controlled for DA only | +0.186 | 998 |
| +DA + Reviews | +0.154 | 899 |
| +Study 1 controls | +0.132 | 847 |
| Controlled for ALL OTHER off-page signals | +0.000 | 2,545 |
| Strictest (everything measured) | +0.000 | 795 |
No single off-page channel has independent predictive power once everything else is controlled for. AI visibility is cumulative. It's a stack, not a switch.
Finding 3 — What does survive: be the URL the AI cites
There's a related claim adjacent to the Reddit hype: “Be in the threads ChatGPT pulls up as a reference.” That one we can test, but only for the models that actually return source URLs.
5.5x lift when Perplexity cites you as a source
Probability that a brand is mentioned in a response, conditional on whether its URL was cited as a source. Only Perplexity and Google AI Overview return source URLs in their APIs. ChatGPT, Claude, and Gemini do not — making this finding untestable for those models.
For the models that do cite — Perplexity and Google AI Overview — being the cited URL within an AI response is a strong signal. 5.5x lift on Perplexity. Per-business citation count vs visibility: Perplexity r=+0.194 (n=2,729). Google AIO r=+0.086 (n=2,729). ChatGPT, Claude, Gemini: zero source URL data, zero correlation possible.
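The lift figure is a ratio of two conditional probabilities: the mention rate when the brand's URL is cited, divided by the mention rate when it is not. A minimal sketch follows, assuming one `(cited, mentioned)` pair per brand-response observation; that unit of analysis and the toy counts are assumptions, not the study's data.

```python
def citation_lift(rows: list[tuple[bool, bool]]) -> float:
    """Lift = P(mentioned | cited) / P(mentioned | not cited).

    rows: (cited, mentioned) pairs, one per brand-response observation.
    """
    cited = [mentioned for was_cited, mentioned in rows if was_cited]
    uncited = [mentioned for was_cited, mentioned in rows if not was_cited]
    if not cited or not uncited:
        raise ValueError("need both cited and uncited observations")
    p_cited = sum(cited) / len(cited)
    p_uncited = sum(uncited) / len(uncited)
    if p_uncited == 0:
        raise ValueError("lift is undefined when the uncited mention rate is zero")
    return p_cited / p_uncited


# Toy data: 8 cited observations (4 mentioned), 32 uncited (4 mentioned).
toy = (
    [(True, True)] * 4 + [(True, False)] * 4
    + [(False, True)] * 4 + [(False, False)] * 28
)
print(citation_lift(toy))
```

On the toy data the mention rate is 0.5 when cited and 0.125 when not, which gives a lift of 4.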
The actionable read is not “post on Reddit.” It is: be the canonical URL the AI considers the right answer for your category. Sometimes that's a Reddit thread. More often — and this is the part that matters — it's a directory page, an industry list, a review hub, a YouTube video, your own site. Where you need to be is wherever AI is already pulling from in your specific vertical.
Where AI actually pulls from (the receipts)
The discourse and the data have been disagreeing for a year. Reverse-classification of the actual URLs Perplexity and Google AIO cited in our 95,000+ Study 1 responses, by source type:
AI cites editorial blogs first, Reddit zero
Reverse-classification of every URL Perplexity and Google AIO cited in our 95,000+ Study 1 responses, into 16 source-type categories. Reddit's share of citations: 0.0%. Quora: 0.02%. The discourse and the data have been disagreeing for a year.
Editorial blogs and publications carry 69% of citations. Brand-owned websites carry another 18%. Industry directories: 6%. YouTube: 4%. News media: 2.5%. Reddit: zero. Quora: 0.02%. Two Wikipedia citations across 95,000 responses.
The top industry directories AI actually cites are vertical-specific gatekeepers — and these are the URLs to fight for:
| Domain | Citations | Vertical |
|---|---|---|
| zillow.com | 59 | Real Estate |
| justia.com | 56 | Personal Injury Law |
| angi.com | 56 | Home Services |
| zocdoc.com | 55 | Dental / Medical |
| thumbtack.com | 27 | Home Services |
| healthgrades.com | 15 | Dental / Medical |
| realtor.com | 7 | Real Estate |
| lawyers.com | 6 | Personal Injury Law |
| homeadvisor.com | 6 | Home Services |
| avvo.com | 1 | Personal Injury Law |
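A table like the one above can be produced by normalising each cited URL to its domain and counting. The sketch below shows that step only; the `domain_of` and `top_cited_domains` helpers and the sample URLs are hypothetical, and the study's 16-category source-type classifier is not reproduced here.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Lower-cased host with a leading 'www.' stripped."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def top_cited_domains(urls: list[str], n: int = 10) -> list[tuple[str, int]]:
    """Most frequently cited domains, highest count first."""
    return Counter(domain_of(u) for u in urls).most_common(n)


# Made-up citation URLs for illustration.
cited_urls = [
    "https://www.zillow.com/homes/los-angeles",
    "https://zillow.com/agents",
    "https://www.justia.com/lawyers/personal-injury",
]
print(top_cited_domains(cited_urls))
```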
Finding 4 — Directories as a force multiplier
We split the sample into Domain Authority quartiles and asked: within each quartile, what's the visibility difference between businesses with above-median directory presence versus below-median?
+16 visibility points in the top DA quartile
Within each Domain Authority quartile, businesses with above-median directory presence vs. below-median. The compounding kicks in only when underlying authority is already there.
The Q4 row is the punchline. For high-authority brands, the multi-platform presence stack is a force multiplier — +16 visibility points between low- and high-directory peers in the same DA quartile.
For low-authority brands (Q1, Q2, Q3), directory presence helps a little or not at all on its own. The compounding only kicks in when you also have the underlying authority. If you're a DA 12 startup, fixing your BBB profile won't put you on ChatGPT. If you're a DA 65 brand without it, you're leaving 16 points of visibility on the table.
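The quartile comparison can be sketched as follows. Two details are assumptions on our part for this example: the median is taken over the whole sample rather than within each quartile, and the quartile cut points use the default method of `statistics.quantiles`. The eight rows are toy data.

```python
from statistics import mean, median, quantiles


def directory_lift_by_da_quartile(rows: list[dict]) -> dict:
    """Mean visibility gap (above-median minus at-or-below-median directory
    presence) inside each Domain Authority quartile.

    rows: dicts with keys 'da', 'directory_count', 'visibility'.
    """
    cuts = quantiles([r["da"] for r in rows], n=4)  # three cut points
    med = median([r["directory_count"] for r in rows])  # sample-wide median
    result = {}
    for q in range(4):
        lo = cuts[q - 1] if q > 0 else float("-inf")
        hi = cuts[q] if q < 3 else float("inf")
        bucket = [r for r in rows if lo < r["da"] <= hi]
        high = [r["visibility"] for r in bucket if r["directory_count"] > med]
        low = [r["visibility"] for r in bucket if r["directory_count"] <= med]
        # None when one side of the comparison is empty.
        result[f"Q{q + 1}"] = mean(high) - mean(low) if high and low else None
    return result


# Toy rows (illustrative only): two businesses per DA quartile.
rows = [
    {"da": 10, "directory_count": 2, "visibility": 1},
    {"da": 20, "directory_count": 8, "visibility": 3},
    {"da": 30, "directory_count": 2, "visibility": 2},
    {"da": 40, "directory_count": 8, "visibility": 3},
    {"da": 50, "directory_count": 2, "visibility": 5},
    {"da": 60, "directory_count": 8, "visibility": 6},
    {"da": 70, "directory_count": 2, "visibility": 10},
    {"da": 80, "directory_count": 8, "visibility": 26},
]
result = directory_lift_by_da_quartile(rows)
print(result)
```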
Finding 5 — Industry physics: 0% to 30%
The variance in off-page leverage between verticals is enormous.
Off-page leverage by industry × market
Percentage of AI citations sourced from MentionLayer-addressable surfaces (forums + directories + YouTube + review sites) per industry-city slot. Off-page intervention has roughly 26x more leverage in SaaS CRM than in Med Spa.
Off-page intervention has roughly 30x more leverage in NYC plumbing than in Med Spa. In Sydney professional services and Chicago accounting it has no leverage: for those slots the models cite editorial blogs that an off-page program can't move.
For MentionLayer specifically, this is a sales-targeting matrix: SaaS, home services and personal-finance apps are high-leverage. Med spa and Sydney professional services are low-leverage. The product's value depends entirely on the vertical. That's an honest admission, and it makes the pitch sharper, not weaker.
Finding 6 — The dominant signal changes per market
Some of the most striking findings only show up when you stratify by industry × city.
The dominant signal changes per industry × market
Top single-signal correlations within specific industry-city slots. NYC plumbing has the strongest off-page correlation in the entire study (r=0.683). NYC accounting is Quora-dominated. LA real estate is the one Reddit-led market.
- NYC home services: off_page_composite r=0.683, directory_count r=0.679. The strongest off-page correlation in the entire study.
- NYC accounting: quora_mention_count r=0.674 — Quora is the dominant signal, not Reddit. A Quora-led playbook would arguably outperform anything else.
- LA real estate: reddit_mention_count r=0.589 — the one industry-city slot where the “post on Reddit” advice plausibly survives strict scrutiny.
- Sydney slots collectively: 0% addressable citation share. AI cites Sydney editorial blogs, not the directories or forums an off-page program touches. A Sydney dentist's GEO strategy should not look anything like an NYC dentist's.
Finding 7 — Strict isolation: what survives full controls
For every off-page signal, we ran an OLS-residual partial correlation controlling for all other measured features simultaneously. This is the cleanest per-signal effect size we can compute on an observational dataset.
No single off-page signal exceeds r=0.10
OLS-residual partial correlation for each off-page signal, controlling for every other off-page feature simultaneously (n=2,545). Not Reddit, not BBB, not Wikipedia, not LinkedIn — no individual platform is the secret. The system is the secret.
No single off-page signal exceeds r=0.10 in strict isolation. Not Reddit. Not BBB. Not Wikipedia. Not LinkedIn. Not YouTube.
The honest interpretation: AI visibility is not driven by any one channel. It is driven by cumulative multi-platform presence. Each platform contributes a small lift; together they create the visibility outcome. This is why the composite (off_page_composite_score, r=0.384) outperforms the strongest individual platform.
The self-audit: visible vs invisible profile
What does a multi-model-visible business actually look like? We profiled the 401 businesses mentioned by ≥2 AI models and the 1,841 businesses mentioned by zero. The gap is striking.
Visible vs invisible — where do you sit?
Average profile of multi-model-visible businesses (n=401) vs. invisible businesses (n=1,841). Run your own brand against these benchmarks. Visible brands carry roughly 2.3x more directories, 2.3x more Reddit mentions, and 2.5x more Wikipedia/Crunchbase coverage.
Run your own brand against these benchmarks. If you're below the visible profile on directory count, Wikipedia, Crunchbase and review platforms — that is your work order for Q3. If you're above on most metrics and still invisible, you have a Layer 1 (DA / authority) problem, not a Layer 2 problem.
The unifying thesis
Three findings combine into one defensible claim.
1. No single off-page channel, including Reddit, the most-hyped one, has more than a small independent effect once everything else is controlled for. Reddit's strict isolated r is zero.
2. The cumulative multi-platform presence stack drives visibility. BBB, Yelp, GBP, Wikipedia, LinkedIn, Crunchbase, Trustpilot, YouTube, Reddit, Quora: each adds a small individual lift. Together they produce the outcome.
3. Where AI models do cite specific sources (Perplexity especially, in vertical-specific addressable categories), being THAT cited URL is a strong signal: 5.5x lift on Perplexity.
The operational consequence: any GEO program priced around a single channel is, on this evidence, mispriced. A program that builds presence across 8–12 platforms simultaneously, weighted to your vertical's actual addressable surfaces, is what the data supports. That's the version we built.
What this means for your business
Look up your vertical and your DA quartile. The right strategy is specific to both.
If your DA is in the top quartile (54+)
You are leaving roughly +16 visibility points on the table if your directory and off-page presence stack is below median. This is the highest-ROI work you can do this quarter. Audit the 12 platforms. Fix gaps in order of vertical relevance.
If your DA is mid-tier (Q2–Q3)
The directory lift is modest (~+1.4 in Q3). Your bigger lever is whichever specific signals dominate your vertical-market combination. NYC accounting? Quora. LA real estate? Reddit. NYC plumbing? Directories. Look up your slot in the per-industry data.
If your DA is low (Q1)
The off-page stack helps a little (+2.1 in Q1) but it does not substitute for the underlying authority. You need both. Be patient on DA, work the stack in parallel, and don't pay anyone telling you off-page-alone fixes invisibility at low DA. It does not.
Vertical in the 0–5% addressable bracket
Med spa, Sydney professional services, accounting outside NYC, ecommerce DTC, boutique hospitality, insurance, digital marketing. Off-page seeding is low-leverage. Your dollars belong in editorial PR, owned content, and brand search behaviour. Diagnose first.
Vertical in the 20%+ addressable bracket
SaaS CRM/PM, home services, personal-finance apps, real estate. Off-page seeding is the single highest-leverage channel available to you. Build the stack across 8–12 platforms simultaneously. This is where MentionLayer was designed to operate.
“You're wrong about Reddit” and other expected attacks
Publishing “Reddit doesn't do what you think it does” will earn pushback. Here are the strongest available rebuttals to the study, with our responses. We have skin in the game on this — we'd rather be corrected than wrong.
Q1. Your Reddit measurement is too crude — you used SERP results, not actual Reddit data.
Q2. Consumer Perplexity surfaces Reddit far more than the API. Your study is API-only.
Q3. ChatGPT's API doesn't return sources, so how can you say anything about ChatGPT?
Q4. Correlation is not causation. You haven't proven anything.
Q5. Your sample skews toward US large markets — LA, NYC, Chicago. The 'national' slots are also US-centric.
Q6. L2 regularization on correlated features creates noise. Your logistic regression is unreliable.
Q7. Some Sydney slots had only 80% enrichment coverage due to a batch failure. Did you cherry-pick?
Q8. What about Reddit-specific tools like Hyros, Reddit Pro Search, etc. that use proprietary signals?
Q9. Why didn't you measure paid placement, sponsored content, or influencer mentions?
Methodology + reproducibility
For the people who care about how it was built. Skim if not.
- Sample. 2,729 businesses · 14 industries · 32 industry-city slots — six local categories replicated in LA + Sydney + NYC + Chicago, plus four national SaaS / professional categories carried from Study 1, plus four new national verticals.
- Models. ChatGPT (gpt-4o), Perplexity (sonar-pro), Gemini (2.5-flash), Claude (Sonnet), Google AI Overview (via SerpApi).
- Prompts. 20 unique buying-intent prompts per industry-city, six categories — direct recommendation, comparison, specific need, conversational, authority-seeking, decision. Identical to Study 1.
- Mention detection. Heuristic string matching (exact name, partial name, domain) plus AI-enhanced verification (Claude Sonnet at temperature 0).
- Off-page collection. 12 SERP queries per business via Apify `google-search-scraper`, parsed for canonical-platform URLs.
- Citation classifier. Every URL Perplexity and Google AIO cited in Study 1 was classified into 16 source-type categories.
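The heuristic half of the mention-detection step listed above (exact name, partial name, domain) could look roughly like the sketch below. The normalisation rules, the `GENERIC` stop-word set and the definition of a "partial" match as the first distinctive token of the name are all assumptions for illustration; the AI-enhanced verification pass is not shown.

```python
import re

# Hypothetical list of tokens too generic to count as a partial match.
GENERIC = {"the", "inc", "llc", "co", "company", "group", "and"}


def _norm(text: str) -> str:
    """Lower-case and collapse anything non-alphanumeric to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def detect_mention(response: str, name: str, domain: str):
    """Return 'exact', 'domain', 'partial', or None."""
    padded = f" {_norm(response)} "  # padding gives whole-word matching
    if f" {_norm(name)} " in padded:
        return "exact"
    if domain.lower() in response.lower():
        return "domain"
    tokens = [t for t in _norm(name).split() if t not in GENERIC]
    if tokens and f" {tokens[0]} " in padded:
        return "partial"
    return None


# Made-up brand and responses.
print(detect_mention("I'd recommend Acme Plumbing Co. for this.",
                     "Acme Plumbing", "acmeplumbing.com"))
print(detect_mention("See acmeplumbing.com for quotes.",
                     "Acme Plumbing", "acmeplumbing.com"))
```

A partial hit is the weakest of the three, which is presumably where a second verification pass earns its keep.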
The strict-isolation methodology
For each test feature X we compute an OLS-residual partial correlation against visibility_score, controlling for all other measured features simultaneously. Standardised features, residualised on full controls, Pearson on the residuals. Restricted to rows where every feature is non-null.
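For readers who want to see the mechanics, here is a minimal, dependency-free sketch of an OLS-residual partial correlation, run on synthetic data in which a hidden "general brand presence" variable drives both the Reddit signal and visibility. The data generator, the variable names and the use of Gram-Schmidt to solve the regression are our choices for this example, not the study's code. Standardisation is skipped because Pearson on the residuals is unchanged by rescaling any column.

```python
import random
from math import sqrt


def _center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residualise(target, controls):
    """OLS residuals of `target` regressed on `controls` plus an intercept.

    Centering handles the intercept; Gram-Schmidt orthogonalises the
    control columns so each projection can be subtracted in turn.
    """
    basis = []
    for col in controls:
        c = _center(col)
        for b in basis:
            k = _dot(c, b) / _dot(b, b)
            c = [ci - k * bi for ci, bi in zip(c, b)]
        if _dot(c, c) > 1e-12:  # drop columns collinear with earlier ones
            basis.append(c)
    resid = _center(target)
    for b in basis:
        k = _dot(resid, b) / _dot(b, b)
        resid = [ri - k * bi for ri, bi in zip(resid, b)]
    return resid


def pearson(a, b):
    a, b = _center(a), _center(b)
    return _dot(a, b) / sqrt(_dot(a, a) * _dot(b, b))


def partial_correlation(x, y, controls):
    """Pearson correlation of x and y after removing the controls from both."""
    return pearson(residualise(x, controls), residualise(y, controls))


# Synthetic data: both signals are noisy copies of one hidden driver.
rng = random.Random(0)
n = 500
brand_presence = [rng.gauss(0, 1) for _ in range(n)]
reddit = [z + rng.gauss(0, 0.5) for z in brand_presence]
visibility = [z + rng.gauss(0, 0.5) for z in brand_presence]

raw = pearson(reddit, visibility)
isolated = partial_correlation(reddit, visibility, [brand_presence])
print(f"raw r = {raw:.3f}, partial r = {isolated:.3f}")
```

In this toy setup the raw correlation is large while the partial correlation sits near zero, which is the same pattern the Reddit decomposition shows.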
We've run six different model specifications — different control sets, restricted to different subsamples, with and without SpyFu/Layer 1 features. The directional findings are unchanged. Reddit's independent effect never exceeds r=0.05 in any specification we've tried.
Limitations — every one we know about
- LA bias in carry-over Study 1 sample. Sydney/NYC/Chicago expansion only added local-services industries, not national SaaS or professional verticals.
- Perplexity API skews toward editorial and own-site citations — 0% Reddit citation rate. Consumer Perplexity surfaces Reddit far more visibly. Our findings apply to API behaviour, which is what 99% of GEO measurement actually uses.
- ChatGPT API doesn't return source URLs. The “be in the threads ChatGPT cites” claim is untestable from API data.
- Reddit measurement is a SERP-based volume proxy. Doesn't capture quality (subreddit authority, upvote weight, recency). A higher-quality measurement could find a stronger isolated effect.
- Press mention count's negative coefficient is suspicious — likely a measurement artifact from the press classifier's ~10–15% false-positive rate. Treat as inconclusive.
- Correlation, not causation. All findings are observational. Layer 3 (controlled intervention, May–July 2026) will provide causal evidence.
- Some Sydney slots had ~80% enrichment coverage due to one Apify batch failure mid-run. Affected slots are flagged in the dataset and disclosed to research-access partners.
- Visibility scores are a point-in-time measurement — April 2026, five specific AI models. Re-running this study quarterly is the plan.
What's next: Layer 3 — the controlled intervention
Layer 2 (this study) is observational. Layer 3 is causal.
- 25–30 businesses · 60-day controlled intervention · launches May–July 2026.
- Two dose groups. Partial dose (15–20 client-level treatments): directory build-out, Reddit/Quora seeding (vertical-relevant), press distribution, review campaigns. Full dose (5–7 Joel-portfolio businesses): everything in the partial dose plus YouTube and on-site GEO build-out.
- Untouched control: 1,004-business Layer 1 sample (natural-drift comparison).
- Pre-registered success threshold: at least 4 of 6 metric deltas hit. The six metric deltas are defined upfront, so the bar is set before the experiment runs.
- Result published regardless of direction — including null results. The first controlled before-and-after experiment in GEO.
Run it on your brand · explore the data · cite the study
The findings, methodology, per-slot statistics, and individual brand lookup are public. The 2,729-row dataset is licensed to research partners under NDA — same posture as Pew, MIT Tech Review, Backlinko, GitHub Octoverse, and every other large-investment industry research program.
Run your brand against the benchmarks
Enter your domain. We measure your directory count, off-page composite, top vertical signal, and visible-vs-invisible position — live, against the 2,729-business benchmark.
Explore the data — by industry × market
Pick your slot. See the top 5 predictors, addressable share of citations, the top cited domains, and the visible-vs-invisible profile for that exact slot. Browse-only — no bulk export.
Apply for full dataset access
Academic researchers, press / analysts, Layer 3 trial participants, and paying MentionLayer customers can request access to the underlying 2,729-row dataset under NDA. We review every request.
Cite as: House, J. (April 2026). The Off-Page AI Visibility Index: A Q2 2026 Decomposition. MentionLayer Research. mentionlayer.com/research/q2-2026-off-page-decomposition · Press / analyst inquiries: [email protected].
Quotable summary
- Directory count just dethroned Domain Authority as the #1 predictor of AI visibility — r=0.391 vs r=0.338, n=2,648.
- We tested 'you need to be on Reddit' across 2,729 businesses. Reddit's predictive power collapses from r=0.333 to r=0.000 once you control for general multi-platform presence.
- When a brand IS the URL Perplexity cites, it's 5.5x more likely to be mentioned in the response text.
- The ChatGPT API doesn't return source URLs. Anyone confidently telling you 'this is what ChatGPT cites' is using a different data source — or making it up.
- 26x more addressable: 26.6% of SaaS CRM AI citations are MentionLayer-actionable vs 0% of Med Spa citations.
- +16 visibility points: that's the lift directory presence delivers within the top Domain Authority quartile.
- AI visibility is a SYSTEM, not a SIGNAL.
- The largest cross-market off-page GEO study published anywhere: 2,729 businesses, 14 industries, 4 markets, 278,000+ mention checks, 32 industry-city slots.
This is the second study in the AI Visibility Index research series. Study 1: AI Visibility Index, April 2026. Study 3 (Layer 3 controlled intervention) ships Q3 2026.
— Joel House, founder of MentionLayer + Joel House Search Media · Forbes Agency Council · Sydney + Los Angeles, April 2026.