Reference
GEO Glossary.
Short, plain-English definitions of the terms behind generative engine optimization and AI search. Each one links to the full guide.
AI Crawlers
CCBot
CCBot is the crawler operated by Common Crawl, a nonprofit that publishes a free, open dataset of web pages. Because that dataset is widely used to train large language models, CCBot is one of the most common indirect routes your content takes into AI systems. It respects robots.txt.
ChatGPT-User
ChatGPT-User is the OpenAI agent that fetches a specific web page in real time when a user's ChatGPT prompt requires it, such as following a link or answering a question about a page. It is distinct from GPTBot (training) and OAI-SearchBot (search indexing).
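To make the three-way split concrete, here is a minimal sketch that labels a request by which OpenAI token appears in its User-Agent header. The purposes are the ones described in this glossary. The function name, the substring matching, and the sample header value are assumptions made for illustration, not OpenAI's published strings.

```python
from typing import Optional

# Purposes as described in this glossary. Tokens are matched as
# case-insensitive substrings of the User-Agent header (an assumption).
OPENAI_AGENTS = {
    "gptbot": "training",
    "oai-searchbot": "search indexing",
    "chatgpt-user": "user-initiated fetch",
}

def classify_openai_agent(user_agent: str) -> Optional[str]:
    """Return the purpose of an OpenAI agent, or None if no token matches."""
    ua = user_agent.lower()
    for token, purpose in OPENAI_AGENTS.items():
        if token in ua:
            return purpose
    return None

# Simplified, hypothetical header value.
sample = "Mozilla/5.0 (compatible; GPTBot/1.0)"
print(classify_openai_agent(sample))
```

A check like this is only useful for reading your own logs; a User-Agent header can be spoofed, so it does not prove who sent the request.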
ClaudeBot
ClaudeBot is Anthropic's web crawler, used to gather publicly available content for Claude. It identifies with the user agent ClaudeBot and respects robots.txt, so you allow or block it the same way you would any other crawler.
Google-Extended
Google-Extended is a robots.txt token that controls whether your content can be used to train Google's AI models, including Gemini, and to ground AI answers. Crucially, it does not affect your inclusion or ranking in Google Search: blocking it opts you out of those AI uses, not out of search.
GPTBot
GPTBot is OpenAI's web crawler that gathers publicly available content which may be used to train its models. You control it through robots.txt. It is separate from OAI-SearchBot, the crawler that surfaces pages in ChatGPT's search answers, so blocking GPTBot opts you out of training without removing you from ChatGPT search.
llms.txt
llms.txt is a markdown file at a site's root that gives AI systems a curated map of its most important content. As of 2026 it is not a Google Search or AI Overviews ranking signal, but Google's Chrome Lighthouse now audits for it as an agentic-browsing best practice, so it is becoming low-cost infrastructure for helping AI agents navigate your site.
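As a rough sketch of what such a file can look like, the function below renders a title, a one-line summary, and grouped link lists as markdown, following the layout commonly proposed for llms.txt (H1 title, blockquote summary, H2 sections of links). The company name, URL, and descriptions are hypothetical.

```python
def build_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, then one
    H2 section per group containing a markdown link list."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single newline.
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site content, for illustration only.
llms_txt = build_llms_txt(
    "Example Co",
    "Hypothetical company used to illustrate the file layout.",
    {
        "Guides": [
            ("GEO Glossary", "https://example.com/glossary", "Definitions of AI search terms"),
        ],
    },
)
print(llms_txt)
```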
OAI-SearchBot
OAI-SearchBot is OpenAI's crawler that surfaces and links websites in ChatGPT's search answers. It respects robots.txt and is separate from GPTBot (training): if you block OAI-SearchBot you can disappear from ChatGPT search results, even though GPTBot can still collect your content for training.
Perplexity-User
Perplexity-User is the agent Perplexity uses to fetch a specific page in real time when a user's question requires it. Because the request is user-initiated, it generally ignores robots.txt, so a robots.txt block will not stop a direct user-driven fetch.
PerplexityBot
PerplexityBot is Perplexity's crawler, designed to surface and link websites in Perplexity's search results. Perplexity states it is not used to train foundation models and recommends allowing it in robots.txt. Blocking it removes you from the index Perplexity builds answers from.
robots.txt
robots.txt is a plain-text file at the root of a site that tells crawlers which paths they may or may not fetch, by user agent. For AI search it is the primary control for allowing or blocking crawlers like GPTBot, ClaudeBot, PerplexityBot, and Google-Extended. Most well-behaved AI crawlers respect it.
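To see how per-agent rules play out, the sketch below feeds a hypothetical robots.txt to Python's standard-library urllib.robotparser and asks which crawlers may fetch a page. The rules, the example.com URL, and the choice to block GPTBot while allowing OAI-SearchBot are assumptions made for the example, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of OpenAI training (GPTBot) while
# staying reachable for ChatGPT search (OAI-SearchBot) and other agents.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /private/
"""

def allowed_agents(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, for each user agent, whether the rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

result = allowed_agents(
    ROBOTS_TXT,
    "https://example.com/guide",
    ["GPTBot", "OAI-SearchBot", "ClaudeBot"],
)
print(result)  # {'GPTBot': False, 'OAI-SearchBot': True, 'ClaudeBot': True}
```

ClaudeBot has no group of its own here, so it falls under the wildcard group and is only kept out of /private/.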
AI Search Surfaces
AI Overviews
AI Overviews are Google's AI-generated summaries that appear at the top of some search results. Powered by Gemini, they synthesize an answer from multiple web sources and link to a few of them, so users often get the answer without clicking. Being one of the cited sources is the goal of optimizing for AI Overviews.
Answer Engine
An answer engine is a search tool that returns a single synthesized answer to a question rather than a list of links to evaluate. Perplexity, Google's AI Overviews, and ChatGPT search are answer engines. Optimizing to be cited in them is the focus of answer engine optimization (AEO).
Featured Snippet
A featured snippet is the boxed answer Google pulls from a ranking page and shows at the very top of results, above the blue links. It predates AI search but rewards the same thing AI engines do: a concise, directly stated answer that can be lifted out of the page.
Google AI Mode
Google AI Mode is Google's conversational, AI-first search experience, powered by Gemini, that returns a generated answer you can follow up on rather than a traditional list of ten links. It sits alongside AI Overviews as a surface where being a cited source, not a ranked link, is the goal.
Zero-Click Search
A zero-click search is a query that ends without the user clicking through to any website, because the answer is shown directly on the results page, in a featured snippet, knowledge panel, or AI Overview. It is the core reason ranking can hold steady while traffic falls.
GEO Concepts
AI Citation
An AI citation is when an AI search engine references and links to your page as a source in its answer. Citations are the core unit of AI visibility: in a zero-click world, being one of the cited sources, not ranking as a blue link, is what gets your brand seen.
AI Search Engine Optimization
AI search engine optimization is the umbrella term for improving how your brand appears in AI-powered search, across ChatGPT, Perplexity, Gemini, and Google AI Overviews. It is the plain-English name for the same discipline also called generative engine optimization (GEO) and answer engine optimization (AEO).
Answer Engine Optimization (AEO)
Answer engine optimization (AEO) is the practice of structuring content to win the direct answer in answer engines, including featured snippets, voice results, and AI answers. It overlaps almost entirely with generative engine optimization (GEO); the two are the same job described from different angles.
Entity SEO
Entity SEO is optimizing around clearly defined entities (your brand, people, products, and concepts) and the relationships between them, rather than around keyword strings. It helps search and AI engines recognize who you are and connect facts to you, which makes you safer to cite.
LLM SEO (LLMO)
LLM SEO, sometimes called LLMO, is optimizing content to be surfaced and cited by large language model tools like ChatGPT and Claude. It frames the work around the models specifically, but in practice it is the same discipline as generative engine optimization (GEO) and answer engine optimization (AEO).
Schema Markup (Structured Data)
Schema markup is code, usually JSON-LD using the schema.org vocabulary, that labels what the content on a page means so machines can parse it. It does not force AI citations, and Google says no special schema is required for AI features, but it helps engines understand your entities and content.
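For a sense of what that code looks like, here is a minimal sketch that builds an Organization description in JSON-LD and wraps it in the script tag a page would carry. The @context, @type, name, url, and sameAs properties come from the schema.org vocabulary. The company name and URLs are hypothetical.

```python
import json

def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Serialize a schema.org Organization as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # Profiles on other sites that refer to the same entity.
        "sameAs": same_as,
    }
    return json.dumps(data, indent=2)

# Hypothetical brand details, for illustration only.
markup = organization_jsonld(
    "Example Co",
    "https://example.com",
    ["https://www.linkedin.com/company/example-co"],
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```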
Share of Voice (AI)
AI share of voice is the percentage of AI answers, across a defined set of prompts, in which your brand is mentioned or cited, compared with competitors. It is the closest thing to a ranking metric in AI search: not a position, but how often you are part of the answer.
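The arithmetic is simple enough to sketch. Assuming you have already collected one AI answer per prompt, the function below counts the answers that mention each brand and divides by the total. The brand names and answers are made up, and plain substring matching is a simplification: a real tracker would also need to handle name variants and cited links.

```python
def share_of_voice(answers: list[str], brands: list[str]) -> dict[str, float]:
    """Percentage of answers that mention each brand (case-insensitive)."""
    if not answers:
        return {brand: 0.0 for brand in brands}
    total = len(answers)
    return {
        brand: 100.0 * sum(brand.lower() in answer.lower() for answer in answers) / total
        for brand in brands
    }

# Made-up answers to four prompts.
answers = [
    "Acme and Globex both offer this.",
    "Most reviewers recommend Globex.",
    "Acme is a common choice.",
    "There is no clear leader.",
]
sov = share_of_voice(answers, ["Acme", "Globex", "Initech"])
print(sov)  # {'Acme': 50.0, 'Globex': 50.0, 'Initech': 0.0}
```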
How AI Search Works
AI Hallucination
An AI hallucination is when a model generates false or fabricated information and presents it as fact. In AI search it shows up as wrong claims, invented sources, or incorrect brand details. Grounding answers in retrieved, citable sources is the main defense, which is why clear, sourced content matters.
Content Chunking
Content chunking is structuring a page into self-contained units, each making sense on its own, so AI systems can retrieve and cite a single passage cleanly. Engines pull chunks, not whole pages, so a section that stands alone without 'as mentioned above' is far easier to quote.
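One way to check your own pages is sketched below: split the text at headings, then flag any chunk that leans on a back-reference. The "## " heading marker, the phrase list, and the sample page are assumptions made for the example. Real engines chunk in their own undisclosed ways.

```python
import re

# Phrases that suggest a passage depends on earlier text (illustrative list).
BACK_REFERENCES = re.compile(
    r"\b(as mentioned above|as noted earlier|see above)\b", re.IGNORECASE
)

def chunk_by_heading(page: str) -> list[dict]:
    """Split a page into heading-delimited chunks and mark whether each
    one reads on its own (no back-reference phrases)."""
    sections = []
    current = None
    for line in page.splitlines():
        if line.startswith("## "):
            current = {"heading": line[3:].strip(), "lines": []}
            sections.append(current)
        elif current is not None and line.strip():
            current["lines"].append(line.strip())
    chunks = []
    for section in sections:
        text = " ".join(section["lines"])
        chunks.append({
            "heading": section["heading"],
            "text": text,
            "standalone": BACK_REFERENCES.search(text) is None,
        })
    return chunks

# Hypothetical page content.
PAGE = """\
## What is GPTBot?
GPTBot is OpenAI's web crawler.

## How do I block it?
As mentioned above, it is a crawler, so use robots.txt.
"""
chunks = chunk_by_heading(PAGE)
for chunk in chunks:
    print(chunk["heading"], "->", chunk["standalone"])
```

The second chunk is flagged because it cannot be quoted cleanly without the section before it.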
Grounding
Grounding is the practice of tying an AI model's answer to verifiable external sources retrieved at query time, rather than relying on the model's internal memory alone. Grounded answers cite where each claim came from, which reduces hallucination and makes your content quotable.
Knowledge Graph
A knowledge graph is a structured network of entities (people, places, brands, concepts) and the relationships between them, often modeled as triples (subject, relation, object). Google's Knowledge Graph powers knowledge panels and helps engines understand who you are. A clear, consistent entity presence makes you easier to recognize and cite.
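The triple model is easy to picture as data. The sketch below stores a few (subject, relation, object) facts about a hypothetical brand and filters them by any combination of the three positions. The entity names, relation labels, and the match helper are invented for illustration and do not reflect how Google stores its graph.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical facts, each stored as (subject, relation, object).
TRIPLES: list[Triple] = [
    ("Example Co", "is_a", "Organization"),
    ("Example Co", "founded_by", "Jane Doe"),
    ("Jane Doe", "is_a", "Person"),
    ("Example Co", "makes", "Example Widget"),
]

def match(
    triples: list[Triple],
    subject: Optional[str] = None,
    relation: Optional[str] = None,
    obj: Optional[str] = None,
) -> list[Triple]:
    """Return the triples that agree with every position that was given."""
    return [
        (s, r, o)
        for s, r, o in triples
        if (subject is None or s == subject)
        and (relation is None or r == relation)
        and (obj is None or o == obj)
    ]

facts_about_brand = match(TRIPLES, subject="Example Co")
print(facts_about_brand)
```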
Query Fan-Out
Query fan-out is the technique where an AI search engine expands a single user question into many parallel sub-queries, retrieves results for each, and synthesizes one answer. Ask for a 5-day trip to Japan and it quietly searches hotels, weather, train passes, and more at once.
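The shape of the technique can be sketched in a few lines: expand the question, run the sub-queries concurrently, and collect the results for synthesis. Both expand and search below are hypothetical stand-ins. A real engine would use a model to generate sub-queries and a retrieval system to answer them.

```python
from concurrent.futures import ThreadPoolExecutor

def expand(question: str) -> list[str]:
    """Hypothetical expansion step with a fixed list of facets."""
    facets = ["hotels", "weather", "train passes"]
    return [f"{question} {facet}" for facet in facets]

def search(sub_query: str) -> str:
    """Stand-in for a retrieval call; returns a placeholder result."""
    return f"result for: {sub_query}"

def fan_out(question: str) -> list[str]:
    """Run every sub-query in parallel and return results in order."""
    sub_queries = expand(question)
    with ThreadPoolExecutor(max_workers=len(sub_queries)) as pool:
        # map() preserves input order even though calls run concurrently.
        return list(pool.map(search, sub_queries))

results = fan_out("5-day trip to Japan")
for line in results:
    print(line)
```

The synthesis step, where one answer is written from the collected results, is left out of the sketch.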
Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation (RAG) is the technique behind most AI search: instead of answering only from memory, the model retrieves relevant documents at query time and grounds its generated answer in them, then cites the sources it used. It is why fresh, reachable, extractable pages can be quoted right away.
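A toy version of the retrieve-then-ground loop is sketched below. It scores documents by word overlap with the question, keeps the top matches, and builds a prompt that asks for numbered citations. The documents, URLs, and prompt wording are hypothetical. Word overlap stands in for the embedding-based retrieval real systems use, and the model call itself is omitted.

```python
import re

# Hypothetical mini-corpus keyed by URL.
DOCS = {
    "https://example.com/gptbot": "GPTBot is OpenAI's web crawler for training data.",
    "https://example.com/ccbot": "CCBot is the crawler operated by Common Crawl.",
    "https://example.com/snippets": "A featured snippet is a boxed answer at the top of results.",
}

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, docs: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return the k documents sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(
        docs.items(),
        key=lambda item: len(q & tokenize(item[1])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question: str, sources: list[tuple[str, str]]) -> str:
    """Assemble a grounded prompt with numbered, citable sources."""
    numbered = "\n".join(
        f"[{i}] {url}: {text}" for i, (url, text) in enumerate(sources, start=1)
    )
    return (
        "Answer using only these sources and cite them by number.\n"
        f"{numbered}\n\nQuestion: {question}"
    )

question = "Who operates CCBot?"
sources = retrieve(question, DOCS, k=1)
print(build_prompt(question, sources))
```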
Semantic Search
Semantic search matches content to a query by meaning and intent rather than exact keywords, using vector embeddings to compare concepts. It is why AI engines can pull your page for a question that does not contain your exact phrasing, and why writing for intent beats writing for keyword strings.
Vector Embeddings
A vector embedding is a numerical representation of text (or images, audio) that captures its meaning as a point in high-dimensional space. Pieces of content with similar meaning sit close together, which lets AI systems retrieve relevant passages by similarity. Embeddings power semantic search and RAG.
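The idea of "close together" is usually measured with cosine similarity. The sketch below compares three texts using hand-made three-dimensional vectors. Real embeddings come from a model and have hundreds or thousands of dimensions, so the numbers here are invented purely to show the mechanics.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented vectors: the first two texts share a meaning, the third does not.
EMBEDDINGS = {
    "how to block AI crawlers": [0.9, 0.1, 0.0],
    "stop bots from scraping my site": [0.8, 0.2, 0.1],
    "best ramen in Tokyo": [0.0, 0.1, 0.9],
}

def most_similar(text: str, embeddings: dict[str, list[float]]) -> str:
    """Return the other text whose vector is closest to the given one."""
    target = embeddings[text]
    others = [candidate for candidate in embeddings if candidate != text]
    return max(others, key=lambda c: cosine_similarity(target, embeddings[c]))

print(most_similar("how to block AI crawlers", EMBEDDINGS))
```

The two crawler-related texts share no keywords, yet their vectors point in nearly the same direction, which is the property semantic search relies on.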