AI crawlers do one of two broad jobs. Training crawlers like GPTBot or Bytespider gather data to build models. Search and retrieval crawlers like OAI-SearchBot or PerplexityBot fetch live pages that get cited in answers. Blocking the first keeps your content out of model training sets; blocking the second removes you from AI-generated answers.
Most of these crawlers respect robots.txt, so you can allow or disallow each one by its user-agent name. The most common visibility failure is not a deliberate block but a web application firewall (WAF) or bot-management rule that catches AI crawlers as collateral, leaving pages reachable to browsers but invisible to the engines writing answers.
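
To make the per-name robots.txt control concrete, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The robots.txt content, the `example.com` URL, and the `crawler_access` helper are all illustrative assumptions, not a recommended policy: the sample blocks the two training crawlers named above and allows the two search and retrieval crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption for illustration):
# disallow training crawlers, allow search/retrieval crawlers.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
"""


def crawler_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user-agent name, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["GPTBot", "Bytespider", "OAI-SearchBot", "PerplexityBot"]
    # example.com/pricing is a placeholder URL.
    for agent, allowed in crawler_access(
        ROBOTS_TXT, "https://example.com/pricing", agents
    ).items():
        print(f"{agent}: {'allowed' if allowed else 'disallowed'}")
```

A check like this only reports what robots.txt says. It cannot reveal a WAF or bot-management block, which happens at the network edge regardless of robots.txt, so that failure has to be confirmed separately, for example by looking at how the server actually responds to requests from each crawler.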