Why your site ranks on Google but disappears in AI search
Your Google rankings are solid. Page one for several competitive terms. Organic traffic is growing month over month. Then someone on your team asks ChatGPT about your product category and your brand is nowhere in the answer.
This is not a content quality problem. It is a technical readiness problem.
AI search engines — ChatGPT, Claude, Gemini, Perplexity, and Grok — evaluate content differently from Google. They are not primarily ranking pages. They are selecting content to synthesise into answers. The technical signals they use to determine whether your content is trustworthy, crawlable, citable, and structurally coherent enough to inform an AI-generated response overlap significantly with traditional SEO signals — but they include a distinct set of additional requirements that most technical SEO checklists have never addressed.
This checklist covers all 47 of those signals. It is organised into six categories — crawlability and indexation, structured data and schema, content structure and formatting, E-E-A-T and trust signals, page experience and Core Web Vitals, and internal architecture. Work through each category in sequence. Every item you check off is a signal that makes your content more likely to be selected by AI engines when your buyers ask the questions your product answers.
Iriscale’s AI Optimization Q&A and Search Ranking Intelligence automate the tracking and optimisation of many of these signals — which is noted where relevant throughout the checklist.
Category 1: Crawlability and Indexation
AI search engines cannot cite content they cannot access. The foundation of AI search readiness is ensuring your content is completely and correctly crawlable.
1. Robots.txt is not blocking AI crawlers
Several AI platforms use specific crawl bots that are distinct from Googlebot. Review your robots.txt file to ensure it is not blocking:
- GPTBot (OpenAI / ChatGPT)
- ClaudeBot (Anthropic / Claude)
- Google-Extended (Google Gemini)
- PerplexityBot (Perplexity)
- Meta-ExternalAgent (Meta AI)
A robots.txt directive that blocks all bots except Googlebot will prevent AI engines from crawling your content entirely — which means you are invisible in AI search regardless of your content quality.
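As a quick sanity check, a robots.txt that explicitly allows these crawlers while keeping an existing disallow rule might look like the sketch below. The /admin/ path is purely illustrative, and user-agent tokens can change over time, so verify them against each platform's current documentation:

```text
# Explicitly allow AI search crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

# Default rule for all other bots
User-agent: *
Allow: /
Disallow: /admin/
```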
2. XML sitemap is current, valid, and submitted
Your XML sitemap should include every page you want indexed and cited — and exclude pages you do not want indexed. Verify it is submitted to Google Search Console, that it validates without errors, and that it reflects your current site architecture including all recently published content.
3. Sitemap last-modified dates are accurate
AI engines use last-modified timestamps in sitemaps to prioritise recrawling. Inaccurate or missing last-modified dates reduce recrawl frequency — which means content updates are slower to be reflected in AI search answers.
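A single sitemap entry with an accurate last-modified value looks like this. The URL is a placeholder, and lastmod should change only when the page content genuinely changes, not on every site rebuild:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/learn/ai-search-readiness/</loc>
    <!-- lastmod: W3C date format, reflecting the last substantive content change -->
    <lastmod>2026-01-15</lastmod>
  </url>
</urlset>
```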
4. All high-value pages return a 200 status code
Audit your highest-priority pages — pillar content, comparison pages, BOFU content — for correct 200 status responses. Pages that return 404s, sit behind 301 redirect chains, or serve 5xx errors are either not indexed or are losing link equity through redirect dilution.
5. Redirect chains are eliminated
Every redirect chain — where URL A redirects to URL B which redirects to URL C — loses authority at each hop and slows crawl efficiency. Audit for chains and replace with direct redirects. Maximum one hop between any two URLs.
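If you can export your redirect rules, chain detection and flattening are straightforward to script. A minimal Python sketch, assuming the rules are available as a simple source-to-destination dict (the URL paths in the usage example are illustrative):

```python
def find_redirect_chains(redirects):
    """Given a {source: destination} redirect map, return every chain
    longer than one hop as a list of URL paths."""
    chains = []
    for start in redirects:
        path = [start]
        current = start
        seen = {start}
        while current in redirects:
            current = redirects[current]
            if current in seen:  # redirect loop: record it and stop
                path.append(current)
                break
            path.append(current)
            seen.add(current)
        if len(path) > 2:  # more than one hop, e.g. A -> B -> C
            chains.append(path)
    return chains


def flatten_redirects(redirects):
    """Collapse every chain so each source points directly at its final target."""
    flat = {}
    for start in redirects:
        current = start
        seen = {start}
        while redirects.get(current) and redirects[current] not in seen:
            current = redirects[current]
            seen.add(current)
        flat[start] = current
    return flat
```

For example, `{"/old": "/interim", "/interim": "/new"}` is reported as one chain, and flattening rewrites both sources to point directly at `/new`.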
6. Canonical tags are correctly implemented
Every page should have a self-referencing canonical tag — or a canonical pointing to the preferred version if duplicate or near-duplicate content exists. Incorrect canonicals cause AI engines to attribute content signals to the wrong URL, diluting the authority of your preferred pages.
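For reference, a self-referencing canonical is a single link element in the head of the page (the URL is a placeholder):

```html
<!-- On the preferred URL itself: canonical points at this same page -->
<link rel="canonical" href="https://www.example.com/learn/ai-search-readiness/" />
```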
7. Hreflang is correctly implemented for multilingual sites
If your site serves multiple languages or regions, hreflang tags must correctly specify language and region for every page variant. Incorrect hreflang implementation causes AI engines to serve the wrong language version in AI search answers — or to treat multilingual variants as duplicate content.
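A correct hreflang set is reciprocal: every language variant lists every variant, including itself, ideally with an x-default fallback. A sketch with placeholder URLs:

```html
<!-- The same block appears on every variant of the page -->
<link rel="alternate" hreflang="en-gb" href="https://www.example.com/en-gb/pricing/" />
<link rel="alternate" hreflang="en-us" href="https://www.example.com/en-us/pricing/" />
<link rel="alternate" hreflang="de-de" href="https://www.example.com/de-de/pricing/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/pricing/" />
```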
8. Crawl budget is not being wasted on low-value pages
Faceted navigation, parameter-based URLs, session IDs, and infinite scroll implementations can generate thousands of crawlable URLs that dilute crawl budget away from your high-value content. Audit for crawl budget waste, block low-value URL patterns in robots.txt, and consolidate near-duplicate URLs with canonical tags.
9. JavaScript rendering is not blocking content
Content rendered entirely through client-side JavaScript may not be accessible to all AI crawlers — particularly those that do not execute JavaScript during crawling. Audit your highest-priority pages to ensure core content is present in the HTML source, not dependent on JavaScript execution.
10. HTTPS is implemented across all pages without mixed content errors
Every page must be served over HTTPS. Mixed content errors — where HTTPS pages load HTTP resources — generate browser security warnings and reduce the trust signals that AI engines use to evaluate site credibility.
Category 2: Structured Data and Schema
Structured data is the technical vocabulary that tells AI engines — and Google — precisely what your content is about, who created it, and how it relates to the entities it describes. For AI search specifically, structured data is a critical trust and citability signal.
11. Organisation schema is implemented on the homepage
Organisation schema communicates your brand identity, contact information, social profiles, and logo to AI engines. It establishes the entity that all content on your site is attributed to — which directly affects how AI engines reference your brand in answers.
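A minimal Organization JSON-LD block might look like the following. All names, URLs, and contact details are placeholders to adapt:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example SaaS Co",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/assets/logo.png",
  "sameAs": [
    "https://www.linkedin.com/company/example-saas-co",
    "https://x.com/examplesaasco"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "sales",
    "email": "hello@example.com"
  }
}
</script>
```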
12. Article schema is implemented on all blog and Learn section posts
Article schema communicates the headline, author, date published, date modified, and publisher of every piece of content — which directly affects how AI engines attribute citations and evaluate content freshness.
13. Author schema links to author entity pages
Every article should include author schema that links to a dedicated author entity page. Author entity pages communicate the author’s credentials, expertise, and professional history — which is a primary E-E-A-T signal for AI engines evaluating whether a piece of content is authoritative enough to cite.
14. FAQ schema is implemented on all FAQ sections
FAQ schema marks up question-and-answer content in a format that AI engines can directly extract and use to answer user queries. Every article with a FAQ section — which is the default in all Iriscale Learn content — should have FAQ schema implemented.
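A minimal FAQPage JSON-LD sketch, with one placeholder question (the mainEntity array takes one Question object per FAQ item):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is AI search readiness?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "AI search readiness is the degree to which a site's technical foundation meets the criteria AI engines use when selecting content to cite."
    }
  }]
}
</script>
```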
15. HowTo schema is implemented on process and tutorial content
How-to content with structured steps is a high-value AI citation format. HowTo schema communicates the step sequence, required tools, and expected outcome — making the content machine-readable in the format AI engines prefer for instructional answers.
16. Product schema is implemented on product and feature pages
Product schema communicates your product name, description, pricing, and review data — establishing your product as a named entity that AI engines can reference in comparative and recommendation answers.
17. BreadcrumbList schema is implemented sitewide
Breadcrumb schema communicates your site’s navigational hierarchy to AI engines — helping them understand the topical context of each page within your content architecture. It also improves the visual presentation of your results in traditional search.
18. SoftwareApplication schema is implemented for SaaS product pages
SoftwareApplication schema specifically communicates that your product is a software application — including its category, operating system compatibility, and pricing — which places it in the correct product category for AI search recommendation answers.
19. Schema markup validates without errors
Validate all schema implementations using Google’s Rich Results Test and Schema.org’s validator. Schema with syntax errors is ignored by AI engines — which means the structured data investment produces no citability benefit.
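Beyond the official validators, a lightweight script can catch raw JSON syntax errors across many pages at once. A Python sketch follows; note that a regex extractor is a rough audit tool rather than a full HTML parser, so treat its output as a first pass before the official validators:

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks
JSONLD_PATTERN = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def check_jsonld_syntax(html):
    """Extract every JSON-LD block from an HTML document and report
    which ones parse cleanly. Returns (valid_types, errors)."""
    valid_types, errors = [], []
    for match in JSONLD_PATTERN.finditer(html):
        raw = match.group(1).strip()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
            continue
        if isinstance(data, dict):
            valid_types.append(data.get("@type", "unknown"))
        else:  # JSON-LD can also be a top-level array of objects
            valid_types.append("array")
    return valid_types, errors
```

Running this over a page with one valid Article block and one malformed block returns `(["Article"], [<one error message>])`, flagging the broken markup that AI engines would silently ignore.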
20. No conflicting schema types on the same page
Multiple conflicting schema types on the same page — for example, an Article and a Product schema that describe different entities — create ambiguity that reduces the effectiveness of both. Each page should have a primary schema type that reflects its primary content purpose.
Category 3: Content Structure and Formatting
AI engines are parsing machines. They extract answers from content by identifying structural patterns — headings, paragraphs, lists, tables, and direct Q&A formatting. Content that is structurally clear is significantly more likely to be selected as a citation source than content that buries answers in long, unstructured prose.
21. Every article has a clear, specific H1 that matches the primary query intent
The H1 is the first signal an AI engine reads to determine what question a piece of content answers. It should be a direct, specific statement of the question or topic the article addresses — not a creative headline that requires interpretation.
22. H2 and H3 headings are formatted as questions or direct topic statements
AI engines extract section-level answers by matching heading text to user query patterns. Headings formatted as questions — “What is topical authority?” — or as direct topic statements — “How to build topical authority in B2B SaaS” — are significantly more likely to be matched to user queries than headings formatted as creative titles.
23. Key answers appear in the first 100 words after each heading
AI engines evaluate content by looking for the answer immediately following the relevant heading. Content that buries the answer in three paragraphs of context before stating it directly is less likely to be selected than content that answers the question in the first one to two sentences after the heading.
24. Numbered and bulleted lists are used for enumerable content
Lists are the AI engine’s preferred format for answers to “what are the,” “how many,” and “which” questions. Content that presents enumerable items in paragraph form is harder for AI engines to parse and less likely to be selected than the same content presented in a structured list.
25. Tables are used for comparison and specification content
Comparison content — tool features, pricing tiers, before-and-after scenarios — presented in properly formatted HTML tables is directly extractable by AI engines for comparison answer formats. The same content presented in prose requires AI engines to do additional parsing work and reduces citation likelihood.
26. Content contains direct answer statements, not hedged generalities
AI engines favour content that makes direct, specific, verifiable statements — not content that hedges every claim with qualifiers. “Iriscale tracks brand visibility across ChatGPT, Claude, Gemini, Perplexity, and Grok” is citable. “Iriscale may help with some aspects of AI search visibility depending on your configuration” is not.
27. Word count is appropriate for query complexity
AI engines do not favour longer content categorically. They favour content where depth is proportional to query complexity. A simple definitional question is best answered in 150 to 300 words. A complex process question warrants 1,500 to 3,000 words. Content that is under-developed for a complex query or padded out for a simple one reduces citation likelihood either way.
28. Images have descriptive, keyword-relevant alt text
Alt text communicates image content to AI engines that do not process images directly. Every content image — particularly diagrams, charts, and screenshots — should have alt text that describes the content and its relevance to the surrounding topic.
29. Content is free of factual inaccuracies and outdated statistics
AI engines that cite your content are putting their credibility behind your accuracy. Content with outdated statistics, incorrect claims, or factual inaccuracies is actively deprioritised as a citation source. Audit high-performing content quarterly for factual currency.
30. Code blocks are properly formatted for technical content
Technical content that includes code, command-line instructions, or configuration examples should use properly formatted code blocks — not inline code or plain text formatting. Correctly formatted code blocks are extractable by AI engines for technical answer formats.
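In rendered HTML, that means a pre element wrapping a code element with a language class, rather than code pasted into a styled paragraph. A sketch (the command inside is illustrative):

```html
<!-- Machine-readable: <pre><code> with a language hint -->
<pre><code class="language-bash">curl -I https://www.example.com/learn/ai-search-readiness/
</code></pre>

<!-- Harder to extract: the same command as a styled paragraph -->
<p class="code-style">curl -I https://www.example.com/learn/ai-search-readiness/</p>
```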
Category 4: E-E-A-T and Trust Signals
Experience, Expertise, Authoritativeness, and Trustworthiness are the quality dimensions that both Google and AI engines use to evaluate whether a piece of content is credible enough to surface to users. For AI search specifically, E-E-A-T signals determine whether your content is trustworthy enough to cite.
31. Every author has a dedicated author entity page
Author entity pages communicate the author’s credentials, professional history, relevant expertise, and external profile links — LinkedIn, publication credits, speaking history. They are the primary mechanism for establishing that content is produced by a qualified human expert rather than an anonymous AI generator.
32. Author bios on articles link to author entity pages
The link between an article’s author attribution and the author entity page creates an explicit entity relationship that AI engines use to evaluate content credibility. Missing this link breaks the trust chain.
33. About page communicates company expertise and history
A well-developed About page that communicates your company’s founding story, domain expertise, team credentials, and mission establishes organisational E-E-A-T that supports every piece of content on your domain.
34. External links point to authoritative, relevant sources
Outbound links to authoritative sources — academic research, industry reports, reputable publications — signal that your content is part of a credible information ecosystem. Content that never cites external sources reads as self-referential and reduces trust signals.
35. Content cites specific data with sources
Specific, sourced data — statistics, research findings, case study results — is a primary credibility signal for AI engines evaluating whether content is authoritative enough to cite. Unsourced claims are treated as opinion rather than fact.
36. Review and testimonial content is present and marked up
User-generated social proof — reviews, testimonials, case study quotes — communicates that real users have validated your product’s claims. Review schema markup makes this social proof machine-readable and reinforces product credibility signals.
37. Legal pages are complete and accessible
Privacy policy, terms of service, and cookie policy pages communicate that your site operates within established legal and ethical frameworks — a basic trust signal that AI engines use to distinguish legitimate sites from low-quality or spammy ones.
38. Contact information is complete and consistent
Consistent NAP information — name, address, phone number — across your site and across external citations is a trust signal that AI engines use to verify entity legitimacy. Inconsistent or missing contact information reduces trust scores.
Category 5: Page Experience and Core Web Vitals
Page experience signals — loading speed, interactivity, visual stability, and mobile usability — affect both Google rankings and AI search crawl priority. AI engines that encounter slow or broken page loads during crawling deprioritise the affected pages as citation sources.
39. Largest Contentful Paint (LCP) is under 2.5 seconds
LCP measures how quickly the main content of a page loads. Pages with LCP above 2.5 seconds are penalised in Google’s page experience ranking signals and are crawled less efficiently by AI bots.
40. Cumulative Layout Shift (CLS) is under 0.1
CLS measures visual stability — how much the page layout shifts during loading. High CLS scores indicate a poor user experience and reduce page experience signals that both Google and AI engines evaluate.
41. Interaction to Next Paint (INP) is under 200 milliseconds
INP measures interactivity — how quickly the page responds to user input. Pages with poor INP scores have degraded page experience signals that reduce ranking and crawl priority.
42. Mobile usability has no errors in Google Search Console
AI engines that use Google’s crawl data inherit its mobile usability assessments. Pages with mobile usability errors — text too small, clickable elements too close together, content wider than the screen — are flagged as poor-quality experiences and deprioritised.
43. Page speed is optimised for server response time
Server response time — the time to first byte (TTFB) — directly affects crawl efficiency. AI bots that encounter slow server responses may time out and fail to crawl content entirely. Target a TTFB under 800 milliseconds.
Category 6: Internal Architecture
Internal architecture signals communicate the topical relationships between pages — telling AI engines which content is most authoritative on each topic and how individual pieces of content relate to the broader topic clusters they belong to.
44. Pillar pages have the highest internal link volume pointing to them
Internal links signal content authority. Your pillar pages — the most comprehensive, authoritative pieces on each core topic — should receive the highest volume of internal links from cluster articles. This concentrates authority signals on the pages you most want AI engines to cite.
45. Cluster articles link back to their pillar page
Every cluster article should include at least one internal link pointing to the pillar page it supports. This bidirectional linking structure explicitly communicates the topical hierarchy to AI engines.
46. No orphan pages exist in the content architecture
Orphan pages — published content with no internal links pointing to it — receive no internal authority signals and are rarely discovered or recrawled by AI engines. Audit for orphan pages monthly and add relevant internal links from related content.
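Once you can export each page's outbound internal links, orphan detection is a set difference. A minimal Python sketch with illustrative paths (the homepage is excluded, since it needs no inbound links):

```python
def find_orphans(pages, links):
    """Return published pages that receive no internal links.

    pages -- iterable of all published URL paths
    links -- dict mapping each page to the list of pages it links to
    """
    linked_to = {target for targets in links.values() for target in targets}
    return sorted(p for p in pages if p not in linked_to and p != "/")
```

Given `pages = ["/", "/pillar", "/cluster-a", "/orphan"]` and a link map where only the first three reference each other, the function reports `["/orphan"]` as needing inbound links.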
47. Site depth does not exceed three clicks from the homepage
Content that requires more than three clicks to reach from the homepage is treated as lower priority by both Google and AI engine crawlers. High-value content that is buried deep in site architecture should be surfaced through navigational restructuring, internal linking, or sitemap prioritisation.
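Click depth is the shortest path from the homepage over the internal link graph, which a breadth-first search computes directly. A Python sketch with an illustrative link map:

```python
from collections import deque


def click_depths(links, homepage="/"):
    """Breadth-first search over the internal link graph, returning the
    minimum number of clicks from the homepage to each reachable page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_too_deep(links, max_depth=3, homepage="/"):
    """Pages reachable only in more than max_depth clicks from the homepage."""
    return sorted(
        page for page, depth in click_depths(links, homepage).items()
        if depth > max_depth
    )
```

In a chain like `/` to `/a` to `/b` to `/c` to `/d`, the last page sits four clicks deep and is flagged, pointing to a navigation or internal-linking fix.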
How Iriscale monitors and maintains AI search readiness
Working through a 47-item technical checklist is a one-time audit. Maintaining AI search readiness as your site grows — new content published, competitor landscape shifting, AI engine evaluation criteria evolving — requires a continuous monitoring system.
Iriscale’s AI Optimization Q&A reviews every piece of content before publishing against the content structure, formatting, and E-E-A-T criteria in Categories 3 and 4 — ensuring that every article published meets the structural requirements for AI search citation readiness without a manual pre-publication audit.
Iriscale’s Search Ranking Intelligence tracks whether your content is appearing in ChatGPT, Claude, Gemini, Perplexity, and Grok answers after publication — closing the feedback loop between technical optimisation and actual AI search visibility. When a piece of content that passes the technical checklist is still not appearing in AI search answers, Search Ranking Intelligence surfaces the gap so it can be investigated and addressed.
Iriscale’s Content Architecture ensures that every new piece of content published maintains the internal architecture requirements in Category 6 — mapping internal linking relationships between pillar and cluster content automatically and flagging orphan pages before they accumulate.
Is Iriscale right for your team?
Iriscale is built for B2B SaaS marketing teams at the 50–500 employee stage who need to build AI search visibility alongside traditional organic search — without managing a separate technical audit process for each channel.
If your content ranks on Google but is invisible in AI search answers, if your technical SEO process was built for Google and has never been audited for AI search readiness, if you have no visibility into whether your structured data is correctly communicating your brand entity to AI engines, or if you are publishing content without a systematic AI search optimisation step — Iriscale was built for exactly this.
Book a 30-minute walkthrough and see Iriscale’s AI search readiness tools working on your actual site, your actual content architecture, and your actual competitive landscape.
Frequently Asked Questions
What is AI search readiness and why does it matter in 2026?
AI search readiness is the degree to which your site’s technical foundation, content structure, and trust signals meet the evaluation criteria that AI search engines — ChatGPT, Claude, Gemini, Perplexity, and Grok — use when selecting content to cite in AI-generated answers. As B2B buyers increasingly use AI engines to research software purchases, visibility in AI-generated answers has become a meaningful discovery channel. A site that ranks on Google but fails AI search readiness checks is invisible to a growing segment of its addressable buyer audience.
What is the most important technical SEO change for AI search readiness?
Ensuring AI crawler bots are not blocked in your robots.txt file is the most critical and most commonly overlooked AI search readiness issue. Many B2B SaaS sites with strong Google rankings are blocking GPTBot, ClaudeBot, and other AI crawlers — which means their content cannot be indexed by AI engines regardless of its quality. Audit your robots.txt file for AI crawler blocks before any other technical change.
How does schema markup affect AI search visibility?
Schema markup communicates your content’s structure, authorship, and entity relationships to AI engines in machine-readable format. FAQ schema makes question-and-answer content directly extractable for AI-generated answers. Article schema establishes content authorship and freshness signals. Organisation schema establishes your brand as a verified entity that AI engines can reference in answers. Sites with comprehensive, error-free schema implementation are cited more frequently in AI search answers than sites with no or incorrect schema.
What is the difference between traditional technical SEO and AI search technical SEO?
Traditional technical SEO focuses primarily on crawlability, indexation, page speed, and link signals — the factors that determine how Google discovers, evaluates, and ranks pages. AI search technical SEO encompasses all of these factors plus additional dimensions: AI crawler bot permissions in robots.txt, structured data that enables machine-readable content extraction, content formatting that supports direct answer synthesis, E-E-A-T signals that establish content trustworthiness for citation, and internal architecture that communicates topical authority hierarchies. A site that passes traditional technical SEO checks may still fail multiple AI search readiness criteria.
How does content structure affect AI search citation likelihood?
AI engines select content to cite by matching user query patterns to content structures that contain direct, extractable answers. Content with question-formatted H2 and H3 headings, direct answer statements in the first one to two sentences after each heading, and enumerable content in structured list or table format is significantly more likely to be selected as a citation source than content with the same information presented in unstructured prose. The structural formatting signals that make content easy for a human to skim are the same signals that make content easy for an AI engine to parse.
What is E-E-A-T and why do AI engines use it?
E-E-A-T — Experience, Expertise, Authoritativeness, and Trustworthiness — is the quality framework Google uses to evaluate whether content is credible enough to surface to users. AI engines use similar quality signals because they need to determine whether content is trustworthy enough to cite in answers that carry their own credibility. Content from authors with demonstrated expertise, published on domains with established authority, citing specific sourced data, and supported by user-generated social proof scores higher on E-E-A-T dimensions — and is correspondingly more likely to be selected as an AI search citation source.
How does Iriscale’s AI Optimization Q&A feature work?
Iriscale’s AI Optimization Q&A reviews every piece of content before publishing against the content structure, formatting, and E-E-A-T criteria that AI engines use when selecting citation sources. It identifies specific structural changes — heading reformatting, answer placement, list conversion, schema requirements — that would improve the content’s AI search citation likelihood. This pre-publication review step ensures that every article published through Iriscale’s Articles Hub meets AI search readiness standards without a separate manual audit process.
How do I know if my content is appearing in AI search answers?
Iriscale’s Search Ranking Intelligence tracks whether your brand and your published content are appearing in answers generated by ChatGPT, Claude, Gemini, Perplexity, and Grok for your target queries. It provides a continuous visibility signal across all five major AI engines — showing which content is being cited, which queries are triggering brand mentions, and where visibility gaps exist. This tracking is not available through any traditional SEO platform — it requires a purpose-built AI search monitoring layer.
Related reading
- Stop Buying SEO Tools, Build Marketing Intelligence
- Keyword Research for B2B SaaS: High-Intent Keywords
- 10 Proven Strategies to Boost Organic Traffic in 2026
- High Impressions, Low Clicks in Google Search Console: Why It Happens and How to Fix It
- AI Content Optimization vs. Traditional Methods: Which Is Better?
© 2026 Iriscale · iriscale.com · AI-Powered Growth Marketing for B2B SaaS