What is Retrieval-Augmented Generation (RAG)?

Also: RAG, retrieval augmented generation

Retrieval-augmented generation (RAG) is the technique behind most AI search engines: instead of answering only from what it memorized during training, the model retrieves relevant documents at query time and grounds its generated answer in them. Engines built this way typically cite the sources they used. It is why fresh, reachable, extractable pages can be quoted soon after they are indexed.

Updated May 31, 2026

RAG separates what the model knows from what it can look up. The retrieval step is where your content enters the answer, which is why being in the index and easy to extract matters more than waiting for a model to be retrained on your content.

The name describes the sequence: retrieve relevant documents, augment the prompt with them, then generate the answer. The technique was introduced in a 2020 paper led by researchers at Facebook AI Research. In practice, AI search engines retrieve a candidate pool, evaluate the candidates for authority and agreement, then generate an answer citing the strongest sources.
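
The retrieve, augment, generate sequence can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine works: the `DOCS` index, the example URLs, the term-overlap scoring, and the `llm` callable are all assumptions made for the example. Production systems typically use a search index or embedding similarity for retrieval and a hosted language model for generation.

```python
import re
from collections import Counter

# Hypothetical in-memory index (URL -> page text); stands in for a real search index.
DOCS = {
    "https://example.com/rag": "RAG retrieves documents at query time and grounds the answer in them.",
    "https://example.com/training": "Model retraining happens rarely and can take months.",
    "https://example.com/recipes": "A sourdough starter needs flour, water, and patience.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, docs, k=2):
    """Step 1: return up to k URLs ranked by query-term overlap (toy scoring)."""
    query_terms = set(tokenize(query))
    scored = []
    for url, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:  # drop documents that share no terms with the query
            scored.append((score, url))
    # Highest score first; ties broken by URL so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored[:k]]


def augment(query, urls, docs):
    """Step 2: build a prompt that places the retrieved text ahead of the question."""
    sources = "\n".join(
        f"[{i}] {url}: {docs[url]}" for i, url in enumerate(urls, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{sources}\n\nQuestion: {query}"
    )


def answer(query, docs, llm):
    """Step 3: pass the augmented prompt to `llm`, any callable from prompt to text."""
    urls = retrieve(query, docs)
    prompt = augment(query, urls, docs)
    return llm(prompt), urls


# Stand-in for a real model call: it only reports how much context it received.
def fake_llm(prompt):
    return f"(model output based on a {len(prompt)}-character prompt)"


text, cited = answer("How does RAG ground the answer at query time?", DOCS, fake_llm)
```

Note that a page absent from `DOCS`, or one whose text shares no terms with the query, never reaches the prompt in this sketch. That mirrors the point above: content enters the answer at the retrieval step, not through retraining.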