Why your site ranks on Google but disappears in AI search
Your Google rankings are solid. Page one for several competitive terms. Organic traffic is growing month over month. Then someone on your team asks ChatGPT about your product category and your brand is nowhere in the answer.
This is not a content quality problem. It is a technical readiness problem.
AI search engines — ChatGPT, Claude, Gemini, Perplexity, and Grok — evaluate content differently from Google. They are not primarily ranking pages. They are selecting content to synthesise into answers. The technical signals they use to determine whether your content is trustworthy, crawlable, citable, and structurally coherent enough to inform an AI-generated response overlap significantly with traditional SEO signals — but they include a distinct set of additional requirements that most technical SEO checklists have never addressed.
This checklist covers all 47 of those signals. It is organised into six categories — crawlability and indexation, structured data and schema, content structure and formatting, E-E-A-T and trust signals, page experience and Core Web Vitals, and internal architecture. Work through each category in sequence. Every item you check off is a signal that makes your content more likely to be selected by AI engines when your buyers ask the questions your product answers.
Iriscale’s AI Optimization Q&A and Search Ranking Intelligence automate the tracking and optimisation of many of these signals — which is noted where relevant throughout the checklist.
Category 1: Crawlability and Indexation
AI search engines cannot cite content they cannot access. The foundation of AI search readiness is ensuring your content is completely and correctly crawlable.
1. Robots.txt is not blocking AI crawlers
Several AI platforms use specific crawl bots that are distinct from Googlebot. Review your robots.txt file to ensure it is not blocking:
- GPTBot (OpenAI / ChatGPT)
- ClaudeBot (Anthropic / Claude)
- Google-Extended (Google Gemini)
- PerplexityBot (Perplexity)
- Meta-ExternalAgent (Meta AI)
A robots.txt directive that blocks all bots except Googlebot will prevent AI engines from crawling your content entirely — which means you are invisible in AI search regardless of your content quality.
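As a quick sanity check, a robots.txt that explicitly allows these crawlers while keeping an existing disallow rule might look like the sketch below. The /admin/ path is purely illustrative, and user-agent tokens can change over time, so verify them against each platform's current documentation:

```text
# Explicitly allow AI search crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

# Default rule for all other bots
User-agent: *
Allow: /
Disallow: /admin/
```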
2. XML sitemap is current, valid, and submitted
Your XML sitemap should include every page you want indexed and cited — and exclude pages you do not want indexed. Verify it is submitted to Google Search Console, that it validates without errors, and that it reflects your current site architecture including all recently published content.
3. Sitemap last-modified dates are accurate
AI engines use last-modified timestamps in sitemaps to prioritise recrawling. Inaccurate or missing last-modified dates reduce recrawl frequency — which means content updates are slower to be reflected in AI search answers.
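A single sitemap entry with an accurate last-modified value looks like this. The URL is a placeholder, and lastmod should change only when the page content genuinely changes, not on every site rebuild:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/learn/ai-search-readiness/</loc>
    <!-- lastmod: W3C date format, reflecting the last substantive content change -->
    <lastmod>2026-01-15</lastmod>
  </url>
</urlset>
```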
4. All high-value pages return a 200 status code
Audit your highest-priority pages — pillar content, comparison pages, BOFU content — for correct 200 status responses. Pages that return 404s, sit behind 301 redirect chains, or serve 5xx errors are either not indexed or are losing link equity through redirect dilution.
5. Redirect chains are eliminated
Every redirect chain — where URL A redirects to URL B which redirects to URL C — loses authority at each hop and slows crawl efficiency. Audit for chains and replace with direct redirects. Maximum one hop between any two URLs.
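If you can export your redirect rules, chain detection and flattening are straightforward to script. A minimal Python sketch, assuming the rules are available as a simple source-to-destination dict (the URL paths in the usage example are illustrative):

```python
def find_redirect_chains(redirects):
    """Given a {source: destination} redirect map, return every chain
    longer than one hop as a list of URL paths."""
    chains = []
    for start in redirects:
        path = [start]
        current = start
        seen = {start}
        while current in redirects:
            current = redirects[current]
            if current in seen:  # redirect loop: record it and stop
                path.append(current)
                break
            path.append(current)
            seen.add(current)
        if len(path) > 2:  # more than one hop, e.g. A -> B -> C
            chains.append(path)
    return chains


def flatten_redirects(redirects):
    """Collapse every chain so each source points directly at its final target."""
    flat = {}
    for start in redirects:
        current = start
        seen = {start}
        while redirects.get(current) and redirects[current] not in seen:
            current = redirects[current]
            seen.add(current)
        flat[start] = current
    return flat
```

For example, `{"/old": "/interim", "/interim": "/new"}` is reported as one chain, and flattening rewrites both sources to point directly at `/new`.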
6. Canonical tags are correctly implemented
Every page should have a self-referencing canonical tag — or a canonical pointing to the preferred version if duplicate or near-duplicate content exists. Incorrect canonicals cause AI engines to attribute content signals to the wrong URL, diluting the authority of your preferred pages.
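For reference, a self-referencing canonical is a single link element in the head of the page (the URL is a placeholder):

```html
<!-- On the preferred URL itself: canonical points at this same page -->
<link rel="canonical" href="https://www.example.com/learn/ai-search-readiness/" />
```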
7. Hreflang is correctly implemented for multilingual sites
If your site serves multiple languages or regions, hreflang tags must correctly specify language and region for every page variant. Incorrect hreflang implementation causes AI engines to serve the wrong language version in AI search answers — or to treat multilingual variants as duplicate content.
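A correct hreflang set is reciprocal: every language variant lists every variant, including itself, ideally with an x-default fallback. A sketch with placeholder URLs:

```html
<!-- The same block appears on every variant of the page -->
<link rel="alternate" hreflang="en-gb" href="https://www.example.com/en-gb/pricing/" />
<link rel="alternate" hreflang="en-us" href="https://www.example.com/en-us/pricing/" />
<link rel="alternate" hreflang="de-de" href="https://www.example.com/de-de/pricing/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/pricing/" />
```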
8. Crawl budget is not being wasted on low-value pages
Faceted navigation, parameter-based URLs, session IDs, and infinite scroll implementations can generate thousands of crawlable URLs that dilute crawl budget away from your high-value content. Audit for crawl budget waste, block low-value URL patterns in robots.txt, and consolidate near-duplicate URLs with canonical tags.
9. JavaScript rendering is not blocking content
Content rendered entirely through client-side JavaScript may not be accessible to all AI crawlers — particularly those that do not execute JavaScript during crawling. Audit your highest-priority pages to ensure core content is present in the HTML source, not dependent on JavaScript execution.
10. HTTPS is implemented across all pages without mixed content errors
Every page must be served over HTTPS. Mixed content errors — where HTTPS pages load HTTP resources — generate browser security warnings and reduce the trust signals that AI engines use to evaluate site credibility.
Category 2: Structured Data and Schema
Structured data is the technical vocabulary that tells AI engines — and Google — precisely what your content is about, who created it, and how it relates to the entities it describes. For AI search specifically, structured data is a critical trust and citability signal.
11. Organisation schema is implemented on the homepage
Organisation schema communicates your brand identity, contact information, social profiles, and logo to AI engines. It establishes the entity that all content on your site is attributed to — which directly affects how AI engines reference your brand in answers.
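A minimal Organization JSON-LD block might look like the following. All names, URLs, and contact details are placeholders to adapt:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example SaaS Co",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/assets/logo.png",
  "sameAs": [
    "https://www.linkedin.com/company/example-saas-co",
    "https://x.com/examplesaasco"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "sales",
    "email": "hello@example.com"
  }
}
</script>
```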
12. Article schema is implemented on all blog and Learn section posts
Article schema communicates the headline, author, date published, date modified, and publisher of every piece of content — which directly affects how AI engines attribute citations and evaluate content freshness.
13. Author schema links to author entity pages
Every article should include author schema that links to a dedicated author entity page. Author entity pages communicate the author’s credentials, expertise, and professional history — which is a primary E-E-A-T signal for AI engines evaluating whether a piece of content is authoritative enough to cite.
14. FAQ schema is implemented on all FAQ sections
FAQ schema marks up question-and-answer content in a format that AI engines can directly extract and use to answer user queries. Every article with a FAQ section — which is the default in all Iriscale Learn content — should have FAQ schema implemented.
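A minimal FAQPage JSON-LD sketch, with one placeholder question (the mainEntity array takes one Question object per FAQ item):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is AI search readiness?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "AI search readiness is the degree to which a site's technical foundation meets the criteria AI engines use when selecting content to cite."
    }
  }]
}
</script>
```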
15. HowTo schema is implemented on process and tutorial content
How-to content with structured steps is a high-value AI citation format. HowTo schema communicates the step sequence, required tools, and expected outcome — making the content machine-readable in the format AI engines prefer for instructional answers.
16. Product schema is implemented on product and feature pages
Product schema communicates your product name, description, pricing, and review data — establishing your product as a named entity that AI engines can reference in comparative and recommendation answers.
17. BreadcrumbList schema is implemented sitewide
Breadcrumb schema communicates your site’s navigational hierarchy to AI engines — helping them understand the topical context of each page within your content architecture. It also improves the visual presentation of your results in traditional search.
18. SoftwareApplication schema is implemented for SaaS product pages
SoftwareApplication schema specifically communicates that your product is a software application — including its category, operating system compatibility, and pricing — which places it in the correct product category for AI search recommendation answers.
19. Schema markup validates without errors
Validate all schema implementations using Google’s Rich Results Test and Schema.org’s validator. Schema with syntax errors is ignored by AI engines — which means the structured data investment produces no citability benefit.
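Beyond the official validators, a lightweight script can catch raw JSON syntax errors across many pages at once. A Python sketch follows; note that a regex extractor is a rough audit tool rather than a full HTML parser, so treat its output as a first pass before the official validators:

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks
JSONLD_PATTERN = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def check_jsonld_syntax(html):
    """Extract every JSON-LD block from an HTML document and report
    which ones parse cleanly. Returns (valid_types, errors)."""
    valid_types, errors = [], []
    for match in JSONLD_PATTERN.finditer(html):
        raw = match.group(1).strip()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
            continue
        if isinstance(data, dict):
            valid_types.append(data.get("@type", "unknown"))
        else:  # JSON-LD can also be a top-level array of objects
            valid_types.append("array")
    return valid_types, errors
```

Running this over a page with one valid Article block and one malformed block returns `(["Article"], [<one error message>])`, flagging the broken markup that AI engines would silently ignore.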
20. No conflicting schema types on the same page
Multiple conflicting schema types on the same page — for example, an Article and a Product schema that describe different entities — create ambiguity that reduces the effectiveness of both. Each page should have a primary schema type that reflects its primary content purpose.
Category 3: Content Structure and Formatting
AI engines are parsing machines. They extract answers from content by identifying structural patterns — headings, paragraphs, lists, tables, and direct Q&A formatting. Content that is structurally clear is significantly more likely to be selected as a citation source than content that buries answers in long, unstructured prose.
21. Every article has a clear, specific H1 that matches the primary query intent
The H1 is the first signal an AI engine reads to determine what question a piece of content answers. It should be a direct, specific statement of the question or topic the article addresses — not a creative headline that requires interpretation.
22. H2 and H3 headings are formatted as questions or direct topic statements
AI engines extract section-level answers by matching heading text to user query patterns. Headings formatted as questions — “What is topical authority?” — or as direct topic statements — “How to build topical authority in B2B SaaS” — are significantly more likely to be matched to user queries than headings formatted as creative titles.
23. Key answers appear in the first 100 words after each heading
AI engines evaluate content by looking for the answer immediately following the relevant heading. Content that buries the answer in three paragraphs of context before stating it directly is less likely to be selected than content that answers the question in the first one to two sentences after the heading.
24. Numbered and bulleted lists are used for enumerable content
Lists are the AI engine’s preferred format for answers to “what are the,” “how many,” and “which” questions. Content that presents enumerable items in paragraph form is harder for AI engines to parse and less likely to be selected than the same content presented in a structured list.
25. Tables are used for comparison and specification content
Comparison content — tool features, pricing tiers, before-and-after scenarios — presented in properly formatted HTML tables is directly extractable by AI engines for comparison answer formats. The same content presented in prose requires AI engines to do additional parsing work and reduces citation likelihood.
26. Content contains direct answer statements, not hedged generalities
AI engines favour content that makes direct, specific, verifiable statements — not content that hedges every claim with qualifiers. “Iriscale tracks brand visibility across ChatGPT, Claude, Gemini, Perplexity, and Grok” is citable. “Iriscale may help with some aspects of AI search visibility depending on your configuration” is not.
27. Word count is appropriate for query complexity
AI engines do not favour longer content categorically. They favour content where depth is proportional to query complexity. A simple definitional question is best answered in 150 to 300 words. A complex process question warrants 1,500 to 3,000 words. Content that is under-developed for a complex query or padded out for a simple one reduces citation likelihood either way.
28. Images have descriptive, keyword-relevant alt text
Alt text communicates image content to AI engines that do not process images directly. Every content image — particularly diagrams, charts, and screenshots — should have alt text that describes the content and its relevance to the surrounding topic.
29. Content is free of factual inaccuracies and outdated statistics
AI engines that cite your content are putting their credibility behind your accuracy. Content with outdated statistics, incorrect claims, or factual inaccuracies is actively deprioritised as a citation source. Audit high-performing content quarterly for factual currency.
30. Code blocks are properly formatted for technical content
Technical content that includes code, command-line instructions, or configuration examples should use properly formatted code blocks — not inline code or plain text formatting. Correctly formatted code blocks are extractable by AI engines for technical answer formats.
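In rendered HTML, that means a pre element wrapping a code element with a language class, rather than code pasted into a styled paragraph. A sketch (the command inside is illustrative):

```html
<!-- Machine-readable: <pre><code> with a language hint -->
<pre><code class="language-bash">curl -I https://www.example.com/learn/ai-search-readiness/
</code></pre>

<!-- Harder to extract: the same command as a styled paragraph -->
<p class="code-style">curl -I https://www.example.com/learn/ai-search-readiness/</p>
```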
Category 4: E-E-A-T and Trust Signals
Experience, Expertise, Authoritativeness, and Trustworthiness are the quality dimensions that both Google and AI engines use to evaluate whether a piece of content is credible enough to surface to users. For AI search specifically, E-E-A-T signals determine whether your content is trustworthy enough to cite.
31. Every author has a dedicated author entity page
Author entity pages communicate the author’s credentials, professional history, relevant expertise, and external profile links — LinkedIn, publication credits, speaking history. They are the primary mechanism for establishing that content is produced by a qualified human expert rather than an anonymous AI generator.
32. Author bios on articles link to author entity pages
The link between an article’s author attribution and the author entity page creates an explicit entity relationship that AI engines use to evaluate content credibility. Missing this link breaks the trust chain.
33. About page communicates company expertise and history
A well-developed About page that communicates your company’s founding story, domain expertise, team credentials, and mission establishes organisational E-E-A-T that supports every piece of content on your domain.
34. External links point to authoritative, relevant sources
Outbound links to authoritative sources — academic research, industry reports, reputable publications — signal that your content is part of a credible information ecosystem. Content that never cites external sources reads as self-referential and reduces trust signals.
35. Content cites specific data with sources
Specific, sourced data — statistics, research findings, case study results — is a primary credibility signal for AI engines evaluating whether content is authoritative enough to cite. Unsourced claims are treated as opinion rather than fact.
36. Review and testimonial content is present and marked up
User-generated social proof — reviews, testimonials, case study quotes — communicates that real users have validated your product’s claims. Review schema markup makes this social proof machine-readable and reinforces product credibility signals.
37. Legal pages are complete and accessible
Privacy policy, terms of service, and cookie policy pages communicate that your site operates within established legal and ethical frameworks — a basic trust signal that AI engines use to distinguish legitimate sites from low-quality or spammy ones.
38. Contact information is complete and consistent
Consistent NAP information — name, address, phone number — across your site and across external citations is a trust signal that AI engines use to verify entity legitimacy. Inconsistent or missing contact information reduces trust scores.
Category 5: Page Experience and Core Web Vitals
Page experience signals — loading speed, interactivity, visual stability, and mobile usability — affect both Google rankings and AI search crawl priority. AI engines that encounter slow or broken page loads during crawling deprioritise the affected pages as citation sources.
39. Largest Contentful Paint (LCP) is under 2.5 seconds
LCP measures how quickly the main content of a page loads. Pages with LCP above 2.5 seconds are penalised in Google’s page experience ranking signals and are crawled less efficiently by AI bots.
40. Cumulative Layout Shift (CLS) is under 0.1
CLS measures visual stability — how much the page layout shifts during loading. High CLS scores indicate a poor user experience and reduce page experience signals that both Google and AI engines evaluate.
41. Interaction to Next Paint (INP) is under 200 milliseconds
INP measures interactivity — how quickly the page responds to user input. Pages with poor INP scores have degraded page experience signals that reduce ranking and crawl priority.
42. Mobile usability has no errors in Google Search Console
AI engines that use Google’s crawl data inherit its mobile usability assessments. Pages with mobile usability errors — text too small, clickable elements too close together, content wider than the screen — are flagged as poor-quality experiences and deprioritised.
43. Page speed is optimised for server response time
Server response time — the time to first byte (TTFB) — directly affects crawl efficiency. AI bots that encounter slow server responses may time out and fail to crawl content entirely. Target a TTFB under 800 milliseconds.
Category 6: Internal Architecture
Internal architecture signals communicate the topical relationships between pages — telling AI engines which content is most authoritative on each topic and how individual pieces of content relate to the broader topic clusters they belong to.
44. Pillar pages have the highest internal link volume pointing to them
Internal links signal content authority. Your pillar pages — the most comprehensive, authoritative pieces on each core topic — should receive the highest volume of internal links from cluster articles. This concentrates authority signals on the pages you most want AI engines to cite.
45. Cluster articles link back to their pillar page
Every cluster article should include at least one internal link pointing to the pillar page it supports. This bidirectional linking structure explicitly communicates the topical hierarchy to AI engines.
46. No orphan pages exist in the content architecture
Orphan pages — published content with no internal links pointing to it — receive no internal authority signals and are rarely discovered or recrawled by AI engines. Audit for orphan pages monthly and add relevant internal links from related content.
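Once you can export each page's outbound internal links, orphan detection is a set difference. A minimal Python sketch with illustrative paths (the homepage is excluded, since it needs no inbound links):

```python
def find_orphans(pages, links):
    """Return published pages that receive no internal links.

    pages -- iterable of all published URL paths
    links -- dict mapping each page to the list of pages it links to
    """
    linked_to = {target for targets in links.values() for target in targets}
    return sorted(p for p in pages if p not in linked_to and p != "/")
```

Given `pages = ["/", "/pillar", "/cluster-a", "/orphan"]` and a link map where only the first three reference each other, the function reports `["/orphan"]` as needing inbound links.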
47. Site depth does not exceed three clicks from the homepage
Content that requires more than three clicks to reach from the homepage is treated as lower priority by both Google and AI engine crawlers. High-value content that is buried deep in site architecture should be surfaced through navigational restructuring, internal linking, or sitemap prioritisation.
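Click depth is the shortest path from the homepage over the internal link graph, which a breadth-first search computes directly. A Python sketch with an illustrative link map:

```python
from collections import deque


def click_depths(links, homepage="/"):
    """Breadth-first search over the internal link graph, returning the
    minimum number of clicks from the homepage to each reachable page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, max_depth=3, homepage="/"):
    """Pages reachable only in more than max_depth clicks from the homepage."""
    return sorted(
        page for page, depth in click_depths(links, homepage).items()
        if depth > max_depth
    )
```

In a chain like `/` to `/a` to `/b` to `/c` to `/d`, the last page sits four clicks deep and is flagged, pointing to a navigation or internal-linking fix.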
How Iriscale monitors and maintains AI search readiness
Working through a 47-item technical checklist is a one-time audit. Maintaining AI search readiness as your site grows — new content published, competitor landscape shifting, AI engine evaluation criteria evolving — requires a continuous monitoring system.
Iriscale’s AI Optimization Q&A reviews every piece of content before publishing against the content structure, formatting, and E-E-A-T criteria in Categories 3 and 4 — ensuring that every article published meets the structural requirements for AI search citation readiness without a manual pre-publication audit.
Iriscale’s Search Ranking Intelligence tracks whether your content is appearing in ChatGPT, Claude, Gemini, Perplexity, and Grok answers after publication — closing the feedback loop between technical optimisation and actual AI search visibility. When a piece of content that passes the technical checklist is still not appearing in AI search answers, Search Ranking Intelligence surfaces the gap so it can be investigated and addressed.
Iriscale’s Content Architecture ensures that every new piece of content published maintains the internal architecture requirements in Category 6 — mapping internal linking relationships between pillar and cluster content automatically and flagging orphan pages before they accumulate.
Is Iriscale right for your team?
Iriscale is built for B2B SaaS marketing teams at the 50–500 employee stage who need to build AI search visibility alongside traditional organic search — without managing a separate technical audit process for each channel.
If your content ranks on Google but is invisible in AI search answers, if your technical SEO process was built for Google and has never been audited for AI search readiness, if you have no visibility into whether your structured data is correctly communicating your brand entity to AI engines, or if you are publishing content without a systematic AI search optimisation step — Iriscale was built for exactly this.
Book a 30-minute walkthrough and see Iriscale’s AI search readiness tools working on your actual site, your actual content architecture, and your actual competitive landscape.
Frequently Asked Questions
What is AI search readiness and why does it matter in 2026?
AI search readiness is the degree to which your site’s technical foundation, content structure, and trust signals meet the evaluation criteria that AI search engines — ChatGPT, Claude, Gemini, Perplexity, and Grok — use when selecting content to cite in AI-generated answers. As B2B buyers increasingly use AI engines to research software purchases, visibility in AI-generated answers has become a meaningful discovery channel. A site that ranks on Google but fails AI search readiness checks is invisible to a growing segment of its addressable buyer audience.
What is the most important technical SEO change for AI search readiness?
Ensuring AI crawler bots are not blocked in your robots.txt file is the most critical and most commonly overlooked AI search readiness issue. Many B2B SaaS sites with strong Google rankings are blocking GPTBot, ClaudeBot, and other AI crawlers — which means their content cannot be indexed by AI engines regardless of its quality. Audit your robots.txt file for AI crawler blocks before any other technical change.
How does schema markup affect AI search visibility?
Schema markup communicates your content’s structure, authorship, and entity relationships to AI engines in machine-readable format. FAQ schema makes question-and-answer content directly extractable for AI-generated answers. Article schema establishes content authorship and freshness signals. Organisation schema establishes your brand as a verified entity that AI engines can reference in answers. Sites with comprehensive, error-free schema implementation are cited more frequently in AI search answers than sites with no or incorrect schema.
What is the difference between traditional technical SEO and AI search technical SEO?
Traditional technical SEO focuses primarily on crawlability, indexation, page speed, and link signals — the factors that determine how Google discovers, evaluates, and ranks pages. AI search technical SEO encompasses all of these factors plus additional dimensions: AI crawler bot permissions in robots.txt, structured data that enables machine-readable content extraction, content formatting that supports direct answer synthesis, E-E-A-T signals that establish content trustworthiness for citation, and internal architecture that communicates topical authority hierarchies. A site that passes traditional technical SEO checks may still fail multiple AI search readiness criteria.
How does content structure affect AI search citation likelihood?
AI engines select content to cite by matching user query patterns to content structures that contain direct, extractable answers. Content with question-formatted H2 and H3 headings, direct answer statements in the first one to two sentences after each heading, and enumerable content in structured list or table format is significantly more likely to be selected as a citation source than content with the same information presented in unstructured prose. The structural formatting signals that make content easy for a human to skim are the same signals that make content easy for an AI engine to parse.
What is E-E-A-T and why do AI engines use it?
E-E-A-T — Experience, Expertise, Authoritativeness, and Trustworthiness — is the quality framework Google uses to evaluate whether content is credible enough to surface to users. AI engines use similar quality signals because they need to determine whether content is trustworthy enough to cite in answers that carry their own credibility. Content from authors with demonstrated expertise, published on domains with established authority, citing specific sourced data, and supported by user-generated social proof scores higher on E-E-A-T dimensions — and is correspondingly more likely to be selected as an AI search citation source.
How does Iriscale’s AI Optimization Q&A feature work?
Iriscale’s AI Optimization Q&A reviews every piece of content before publishing against the content structure, formatting, and E-E-A-T criteria that AI engines use when selecting citation sources. It identifies specific structural changes — heading reformatting, answer placement, list conversion, schema requirements — that would improve the content’s AI search citation likelihood. This pre-publication review step ensures that every article published through Iriscale’s Articles Hub meets AI search readiness standards without a separate manual audit process.
How do I know if my content is appearing in AI search answers?
Iriscale’s Search Ranking Intelligence tracks whether your brand and your published content are appearing in answers generated by ChatGPT, Claude, Gemini, Perplexity, and Grok for your target queries. It provides a continuous visibility signal across all five major AI engines — showing which content is being cited, which queries are triggering brand mentions, and where visibility gaps exist. This tracking is not available through any traditional SEO platform — it requires a purpose-built AI search monitoring layer.
Related reading
- Stop Buying SEO Tools, Build Marketing Intelligence
- Keyword Research for B2B SaaS: High-Intent Keywords
- 10 Proven Strategies to Boost Organic Traffic in 2026
- High Impressions, Low Clicks in Google Search Console: Why It Happens and How to Fix It
- AI Content Optimization vs. Traditional Methods: Which Is Better?
© 2026 Iriscale · iriscale.com · AI-Powered Growth Marketing for B2B SaaS