robots.txt

Also: robots file

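robots.txt is a plain-text file at the root of a site that tells crawlers, per user agent, which paths they may or may not fetch. For AI search it is the primary control for allowing or blocking crawlers and crawler tokens such as GPTBot, ClaudeBot, PerplexityBot, and Google-Extended (the last is a control token honored by Google's existing crawlers, not a separate crawler). Most well-behaved AI crawlers respect it, but the file is advisory: it states a request and does not technically enforce access.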
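To make the per-user-agent model concrete, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content, the example.com URLs, and the helper name is_allowed are all assumptions made up for illustration. Python's parser also only approximates how any given vendor's crawler matches rules, so treat the results as a first check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the paths and rules are assumptions.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("GPTBot", "https://example.com/blog/post"),
        ("GPTBot", "https://example.com/private/report"),
        ("ClaudeBot", "https://example.com/blog/post"),
        ("PerplexityBot", "https://example.com/blog/post"),
    ]
    for agent, url in checks:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent:15} {url:40} {verdict}")
```

Each group applies only to the user agent it names. PerplexityBot has no group of its own in this sample, so it is governed by the User-agent: * group.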

Updated June 11, 2026

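A single overly broad rule can silently block the AI crawlers you actually want. The usual culprit is a catch-all User-agent: * group with Disallow: /, added to stop scrapers: it applies to every crawler that has no group of its own, including AI crawlers you never named. Check each AI user agent explicitly.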
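One way to run that check is sketched below with Python's standard-library urllib.robotparser. The sample rules, the agent list, the example.com URL, and the function name blocked_agents are hypothetical. The standard-library parser is also only an approximation of how each vendor's crawler interprets the file, so confirm anything important against the vendor's own documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a catch-all block with two explicit exceptions.
CATCH_ALL_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

# User-agent tokens named in this entry; extend the list as needed.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_agents(robots_txt: str, agents: list[str], url: str) -> list[str]:
    """Return the agents that the given rules stop from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Agents without a group of their own fall back to the "*" group.
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    blocked = blocked_agents(
        CATCH_ALL_ROBOTS_TXT, AI_USER_AGENTS, "https://example.com/"
    )
    print("Blocked by these rules:", ", ".join(blocked) or "none")
```

In this sample only GPTBot has its own group, so the other three tokens inherit the catch-all Disallow even though nothing in the file mentions them by name.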

User-initiated fetchers, which retrieve a page on demand because a user asked for it rather than crawling automatically, are the exception, and behavior varies by vendor. OpenAI says robots.txt rules may not apply to ChatGPT-User; Perplexity says Perplexity-User generally ignores them; Anthropic says that blocking Claude-User in robots.txt does stop its fetches.

Related terms

llms.txt, GPTBot, Google-Extended
