Bytespider

Also: ByteDance crawler

Bytespider is ByteDance's web crawler, used to collect data for training the company's large language models. It has a reputation for crawling aggressively and reportedly does not always respect robots.txt, so blocking it often takes a server or web application firewall (WAF) rule rather than robots.txt alone.
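For reference, a robots.txt rule aimed at Bytespider names the user agent and disallows every path. The following is a minimal sketch that uses Python's standard urllib.robotparser to show how a compliant crawler would read such a rule. The helper name allowed, the example.com URL, and the SomeOtherBot agent are placeholders chosen for illustration. The rule is advisory only, so it has no effect on a crawler that chooses to ignore it.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks Bytespider to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This only models what a well-behaved crawler would do; it cannot
    enforce anything against a crawler that skips robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL and agent names, for illustration only.
    print(allowed("Bytespider", "https://example.com/page"))
    print(allowed("SomeOtherBot", "https://example.com/page"))
```

Because no other user agent is listed, the rule leaves every other crawler unaffected.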

Bytespider identifies itself with the user-agent token Bytespider. Unlike a search crawler, its purpose is to gather training data for ByteDance's AI models, so allowing it does little to improve a site's visibility in Western AI search.

Because it reportedly ignores robots.txt in some cases, site owners who want to block it usually do so at the server or firewall level by matching the user agent, rather than relying on robots.txt alone.
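Server-level blocking by user agent can be sketched as a small piece of WSGI middleware in Python. This is an illustrative example, not a recommended production setup: the names BLOCKED_TOKENS, is_blocked, and block_crawlers are hypothetical, and the case-insensitive substring match is an assumption about how the token appears in the request header. A user agent can be spoofed, so this only stops requests that identify themselves as Bytespider.

```python
# Hypothetical list of user-agent tokens to refuse (lowercase).
BLOCKED_TOKENS = ("bytespider",)


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring match against the blocked tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


def block_crawlers(app):
    """Wrap a WSGI app so that blocked user agents receive 403 Forbidden."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if is_blocked(user_agent):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Any other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing application is then a one-line change, for example application = block_crawlers(application). The same match can be expressed as a web server or WAF rule, which rejects the request before it reaches the application.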