Technical GEO: Is Your Website Ready for AI Crawlers?
Technical GEO audits your robots.txt for AI bot access, measures content freshness and citation risk, calculates a citability score, and verifies schema markup and SSR.
Robots.txt Auditing for AI Bots
The first thing Technical GEO checks is your robots.txt file. This file controls which bots can crawl your website — and many sites are unknowingly blocking AI crawlers.
AI crawlers MentionLayer checks for:
| Bot | Company | Used By |
|---|---|---|
| GPTBot | OpenAI | ChatGPT, Microsoft Copilot |
| ClaudeBot | Anthropic | Claude |
| PerplexityBot | Perplexity | Perplexity.ai |
| Google-Extended | Google | Gemini |
| Bytespider | ByteDance | TikTok AI |
| CCBot | Common Crawl | Training data for many models |
| Amazonbot | Amazon | Alexa AI |
Common issues found:
- Blanket disallow: a "Disallow: /" rule for all bots blocks AI crawlers too
- Overly restrictive rules: blocking entire directories that contain valuable content
- Stale rules: an old robots.txt that doesn't account for new AI crawlers
- No explicit allow: robots.txt permits crawling by default, but when a broad Disallow is in place, an AI bot needs its own group or an explicit "Allow" rule to get through
MentionLayer shows you exactly which bots are allowed and which are blocked, with specific line numbers in your robots.txt. It generates a recommended robots.txt that allows AI crawlers while maintaining any legitimate blocks you have.
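The allowed/blocked check can be approximated locally. The following is a minimal sketch using Python's standard `urllib.robotparser`, not MentionLayer's implementation; the `audit_robots` helper, the sample robots.txt, and the `example.com` URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the table above.
AI_BOTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "Bytespider", "CCBot", "Amazonbot",
]


def audit_robots(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return {bot: allowed} for one robots.txt body and one URL to test.

    Hypothetical helper: it checks a single URL, so a real audit would
    repeat this for each key page.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}


# Illustrative robots.txt: GPTBot is blocked site-wide, everyone else
# is only kept out of /admin/.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

report = audit_robots(SAMPLE)
for bot, allowed in report.items():
    print(f"{bot:16} {'allowed' if allowed else 'BLOCKED'}")
```

Note that `urllib.robotparser` applies rules in file order, so its verdict on files with overlapping Allow and Disallow rules may differ from how a particular crawler interprets them. Treat the result as a first pass.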
Content Freshness & Citation Risk
AI models prefer fresh, authoritative content. Technical GEO analyzes your key pages for freshness signals.
What gets checked:
- Last-modified dates on key pages (homepage, product pages, about page)
- Publish dates on blog posts and articles
- Content staleness indicators: references to past years, outdated statistics, deprecated features
- Update velocity: how often you publish or update content
Citation risk assessment: Pages with outdated content are a citation risk. If an AI model cites a page of yours that still says "in 2023, the market is expected to grow...", your brand looks stale. Technical GEO flags the pages with the highest citation risk so you can prioritize updates.
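One staleness indicator, references to past years, is easy to sketch. The helper below is hypothetical and deliberately naive: it treats any four-digit number starting with 19 or 20 as a year, and the one-year cutoff is an assumed default rather than a documented MentionLayer setting.

```python
import re

# Matches standalone four-digit years from 1900 to 2099.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def stale_year_mentions(text: str, current_year: int, max_age_years: int = 1) -> list[int]:
    """Return the distinct years mentioned in text that fall before the cutoff."""
    cutoff = current_year - max_age_years
    years = {int(match.group(0)) for match in YEAR_RE.finditer(text)}
    return sorted(year for year in years if year < cutoff)


page = "In 2023, the market is expected to grow. Updated pricing for 2025."
print(stale_year_mentions(page, current_year=2025))  # prints [2023]
```

A page that legitimately discusses history will trip this check, so the output is better read as a list of sentences to review than as a verdict.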
Content freshness score factors:
- Average page age across key pages
- Percentage of pages updated in the last 90 days
- Blog publishing frequency
- Presence of "last updated" dates visible on pages
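The first two factors can be computed from a list of last-modified dates. This is a sketch under stated assumptions: `freshness_metrics` is a hypothetical helper, the page paths and ages are invented, and how MentionLayer combines these factors into one freshness score is not described here, so the sketch stops at the raw metrics.

```python
from datetime import date, timedelta


def freshness_metrics(last_modified: dict[str, date], today: date,
                      window_days: int = 90) -> dict[str, float]:
    """Average page age in days and percentage of pages updated within the window."""
    ages = [(today - modified).days for modified in last_modified.values()]
    if not ages:
        return {"average_age_days": 0.0, "pct_updated_in_window": 0.0}
    recent = sum(1 for age in ages if age <= window_days)
    return {
        "average_age_days": sum(ages) / len(ages),
        "pct_updated_in_window": 100.0 * recent / len(ages),
    }


# Illustrative data: four key pages last updated 10, 40, 100 and 250 days ago.
today = date(2025, 6, 30)
pages = {
    "/": today - timedelta(days=10),
    "/pricing": today - timedelta(days=40),
    "/blog/launch": today - timedelta(days=100),
    "/about": today - timedelta(days=250),
}

metrics = freshness_metrics(pages, today)
print(metrics)
```

With these inputs the average age is 100 days and half of the pages fall inside the 90-day window.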
Citability Scoring Explained
Citability measures how easy it is for AI models to extract specific, citable information from your website.
High-citability content has:
- Clear H2/H3 heading structure (AI models use headings to navigate)
- Specific data points and statistics (numbers are more citable than generalities)
- FAQ sections with concise answers (AI models love Q&A format)
- Lists and comparison tables (structured data is easier to parse)
- Author attribution and publication dates (trust signals)
- Proper schema markup (helps AI understand content structure)
Low-citability content has:
- Wall-of-text paragraphs with no headings
- Vague marketing copy with no specifics
- Content behind JavaScript rendering that AI crawlers can't access
- Missing meta descriptions and structured data
- No FAQ or knowledge-base style content
The citability score (0-100) is based on:
- Heading structure quality (20%)
- Specific, quotable statements per page (25%)
- FAQ/structured content presence (20%)
- Schema markup completeness (15%)
- SSR/crawlability (10%)
- Content depth and originality (10%)
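To see how the weights interact, here is a sketch that treats the score as a plain weighted sum. That is an assumption: the weights come from the list above, but the linear combination, the 0-100 scale for each sub-score, and the factor names are illustrative rather than MentionLayer's documented formula.

```python
# Weights from the list above, expressed as fractions of the total score.
WEIGHTS = {
    "heading_structure": 0.20,
    "quotable_statements": 0.25,
    "faq_structured_content": 0.20,
    "schema_completeness": 0.15,
    "ssr_crawlability": 0.10,
    "depth_originality": 0.10,
}


def citability_score(subscores: dict[str, float]) -> float:
    """Weighted sum of 0-100 sub-scores; a missing factor counts as 0."""
    total = 0.0
    for factor, weight in WEIGHTS.items():
        value = subscores.get(factor, 0.0)
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{factor} must be between 0 and 100, got {value}")
        total += weight * value
    return round(total, 1)


# Hypothetical page: strong schema and SSR, weaker FAQ coverage.
example = {
    "heading_structure": 80,
    "quotable_statements": 60,
    "faq_structured_content": 50,
    "schema_completeness": 100,
    "ssr_crawlability": 100,
    "depth_originality": 70,
}
print(citability_score(example))  # prints 73.0
```

Because quotable statements carry the largest weight, raising that sub-score from 60 to 100 would add 10 points here, more than perfecting any other single factor.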
MentionLayer provides page-by-page citability scores with specific recommendations for improvement.
Schema & SSR Verification
Technical GEO verifies that your website's technical foundation supports AI crawling and understanding.
Schema markup check: MentionLayer scans your website for JSON-LD structured data and reports which types are present and which are missing. Key schema types for AI visibility:
- Organization: your company identity
- Product / Service: what you sell
- FAQ (`FAQPage` in schema.org): common questions (highly cited by AI models)
- Article: blog posts and content pieces
- BreadcrumbList: site navigation structure
- Review: customer reviews on your site
- HowTo: step-by-step guides
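A present-versus-missing report like this can be sketched with the Python standard library. The code below is an illustration, not MentionLayer's scanner: `JsonLdCollector`, `schema_types`, and the sample HTML are assumptions, and the sketch only reads top-level `@type` values and `@graph` nodes, so nested types are not reported. The FAQ entry is checked under its schema.org name, `FAQPage`.

```python
import json
from html.parser import HTMLParser

# schema.org names for the key types listed above.
KEY_TYPES = {"Organization", "Product", "Service", "FAQPage",
             "Article", "BreadcrumbList", "Review", "HowTo"}


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def schema_types(html: str) -> set[str]:
    """Return the @type values found in a page's JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    found = set()
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the scan
        nodes = data if isinstance(data, list) else [data]
        for node in list(nodes):
            graph = node.get("@graph") if isinstance(node, dict) else None
            if isinstance(graph, list):
                nodes = nodes + graph
        for node in nodes:
            if not isinstance(node, dict):
                continue
            node_type = node.get("@type")
            if isinstance(node_type, str):
                found.add(node_type)
            elif isinstance(node_type, list):
                found.update(t for t in node_type if isinstance(t, str))
    return found


SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}</script>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}</script>
</head><body></body></html>
"""

present = schema_types(SAMPLE_HTML)
print("present:", sorted(present & KEY_TYPES))
print("missing:", sorted(KEY_TYPES - present))
```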
Server-Side Rendering (SSR) check: AI crawlers often can't execute JavaScript. If your content is client-side rendered (React SPA without SSR), AI bots may see a blank page. Technical GEO tests your key pages from a bot's perspective and flags any pages that require JavaScript to render content.
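A rough version of the bot's-eye view is to look at the raw HTML, ignore everything inside script and style tags, and count the words that remain. The sketch below assumes that heuristic; the `VisibleText` class, the helper names, and the 50-word threshold are illustrative choices, not MentionLayer's actual test. It works on an HTML string, so fetching the page is left out.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text outside of tags whose content a non-JS crawler cannot use."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_word_count(html: str) -> int:
    """Count words in the raw HTML that are readable without running JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def looks_client_rendered(html: str, min_words: int = 50) -> bool:
    """Flag pages whose raw HTML carries almost no text (assumed threshold)."""
    return visible_word_count(html) < min_words


# A typical single-page-app shell: the content only appears after bundle.js runs.
SPA_SHELL = ('<html><head><title>App</title></head><body>'
             '<div id="root"></div><script src="/bundle.js"></script>'
             '</body></html>')

# A server-rendered page: the text is already in the HTML response.
SSR_PAGE = ('<html><head><title>Docs</title></head><body><h1>Guide</h1><p>'
            + 'word ' * 60 + '</p></body></html>')

print(looks_client_rendered(SPA_SHELL), looks_client_rendered(SSR_PAGE))
```

The threshold needs tuning per site: a short but fully server-rendered landing page can fall under 50 words, so a flagged page is a prompt to inspect the raw response, not proof of a rendering problem.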
Recommendations generated:
- Missing schema types with ready-to-use JSON-LD code
- SSR configuration suggestions for your framework
- Meta tag improvements for better AI understanding
- Canonical URL and sitemap optimization tips
Frequently Asked Questions
What does the Technical GEO scan check?
Four things: (1) Your robots.txt file: which AI crawlers are allowed and which are blocked, (2) Content freshness: how recently your key pages were updated and whether stale content is hurting citability, (3) Citability score: a composite measure of how easy it is for AI models to extract and cite information from your website, (4) Schema and SSR: which JSON-LD types are present or missing and whether key pages render without JavaScript.
Why should I unblock AI crawlers in robots.txt?
If GPTBot, ClaudeBot, or PerplexityBot are blocked in your robots.txt, those AI models literally cannot read your website content. They'll rely entirely on third-party sources (forums, press, reviews) to learn about your brand — which may be incomplete or inaccurate. Unblocking AI crawlers lets them access your authoritative content directly.
What AI crawlers exist?
The main ones are GPTBot (OpenAI/ChatGPT), ClaudeBot (Anthropic/Claude), PerplexityBot (Perplexity), Google-Extended (Gemini), Bytespider (TikTok), CCBot (Common Crawl), and Amazonbot. MentionLayer checks for all of these in your robots.txt.
What is a citability score?
Citability measures how easy it is for AI models to extract useful, citable facts from your website. High citability means your content has clear headings, specific data points, FAQ sections, and structured markup. Low citability means AI models struggle to pull concrete information from your pages.
Next Steps
Entity Sync
Entity Sync scans your brand's presence across platforms AI models reference, identifies inconsistencies, and helps you build a canonical brand identity that AI can trust.
The AI Visibility Audit
The AI Visibility Audit scans 5 pillars — Citations, AI Presence, Entities, Reviews, and Press — to produce a composite score and prioritized action plan.
Mention Gap Analysis
The Mention Gap Analyzer scans Reddit, Quora, YouTube, LinkedIn, G2, and more to find every place competitors are mentioned and you're not — then tells you how to close the gaps.