Content Chunking: What It Means and What Google Says to Skip

Content chunking is having a strange year. A wave of SEO advice sells it as the key to AI visibility. Google's official guidance now lists it among the tactics to ignore. Both camps use the same word for different things, which is how a real engineering concept became a shaky content tactic.

This page sorts it out: what chunking actually is inside AI retrieval, what Google said and meant, which claims survive an evidence check, and what to do with your pages instead.

Three meanings of content chunking compared: RAG pipeline step, UX principle, and the SEO hack. — Same word, three jobs — and only one of them is something a page author actually controls.

What Content Chunking Actually Means (Three Different Things)

Content chunking means three different things depending on who is talking, and most of the bad advice comes from mixing them up.

In AI retrieval engineering, chunking is a pipeline step: a retrieval-augmented generation (RAG) system splits documents into segments, converts each segment into an embedding, and pulls back the most relevant segments when a query comes in. The system does the splitting. The document's author is not in the room.

In UX and psychology, chunking is about human working memory. Nielsen Norman Group defines it as breaking content into small, distinct units of information so people can process it, building on George Miller's 1956 finding that most people hold roughly seven chunks in short-term memory. This sense owns most of the Google results for the term, and it has nothing to do with AI search.

Then there is the third sense: content chunking for AEO and GEO purposes, the tactic version. "Chunk your content into AI-sized pieces and engines will pick you." This version borrows the vocabulary of the first sense and the credibility of the second, and it is the one Google now explicitly tells publishers to ignore.

Sense	Who uses it	What it means	Who controls it
RAG pipeline step	AI engineers	Splitting documents into segments for embedding and retrieval	The engine, not you
UX writing principle	Designers, instructional writers	Breaking information into digestible units for human memory	You
AEO/GEO hack	Parts of the SEO industry	Pre-fragmenting pages so AI systems "extract" you	Nobody, which is the problem

This article is about the first and third senses: what chunking really does inside AI search, and why the hack version does not survive contact with the evidence.

How Chunking Really Works Inside AI Retrieval

When an AI search engine processes your page, the chunking happens on its side of the fence. This engine-side splitting is what chunking in AI originally means: the pipeline fetches your content, splits it into chunks, converts each chunk into vector embeddings, and stores them in an index. At query time, the system retrieves the chunks that best match the question, often after fanning the query out into several sub-queries. We cover the full retrieval flow in our breakdown of how AI search retrieval works.

The Engineer's Dials, Not Yours

The part that matters for this debate: chunking strategy is an engineering setting, not a property of your page. Amazon Bedrock's documentation shows what this looks like in practice. A developer building a knowledge base picks fixed-size chunking, semantic chunking, hierarchical chunking, or no chunking at all, and tunes token counts and overlap percentages. Bedrock's default splits text into chunks of roughly 300 tokens while honoring sentence boundaries. Those are the engineer's dials. You never touch them.

Research on RAG chunking treats it the same way: as a system trade-off for the pipeline owner. A systematic analysis of chunking strategies (Bennani and Moslonka, preprint, January 2026) varied chunking method, chunk size, and overlap across a standard RAG setup and found that chunk overlap provided no measurable benefit while increasing indexing cost. The interesting part is the framing. The paper's audience is engineers deploying retrieval systems. Nothing in it suggests document authors should write differently.

There Is No Universal Chunk

Here's the nuance worth keeping. Engineers choose among many splitting strategies, including fixed-size, sliding-window, recursive, semantic, and late chunking, and some of them do read document structure: structure-aware and HTML-aware chunking split at headings and sections, and semantic chunking looks for topic shifts. So a well-organized page can interact better with some pipelines. But which strategy any given engine uses, at what size, with what overlap, is invisible to you and different across engines. There is no universal chunk, so there is nothing stable to optimize against.

The Chunking Hack Google Just Told You to Ignore

Google has now said it directly. Its guide to optimizing for generative AI features, last updated July 2026, lists "'Chunking' content" among the tactics publishers can skip: "There's no requirement to break your content into tiny pieces for AI to better understand it. Google systems are able to understand the nuance of multiple topics on a page and show the relevant piece to users." If your pages run long and cover several topics, that is not a defect to fix.

Chunking sits in that ignore list next to llms.txt files, AI-specific rewriting, and overfocusing on structured data. The same guide we examined when testing schema markup for AI search applies here: Google says tactics like chunking content, unnecessary AI text files, and special generative-AI schema are not required; ordinary SEO foundations still matter.

This was not a one-off documentation edit. On the Search Off the Record podcast in January 2026, Google's Danny Sullivan addressed the bite-sized-chunks advice in unusually plain terms: "So we don't want you to do that. I was talking to some engineers about that. We don't want you to do that. We really don't." Per Search Engine Land's report, he added that Google does not want publishers producing one version of their content for the LLM and another for the web, and that ranking systems keep improving toward rewarding content written for humans.

So where did the tactic come from? The likeliest origin story runs through RAG tutorials. As chunking became a visible concept in AI engineering, its vocabulary leaked into SEO advice, where "the pipeline splits documents into chunks" mutated into "you should split your page into chunks." Productized restructuring retainers did the rest, and a pipeline internal became a billable deliverable.

To be clear about where we stand, since geotoolbox sells AI-visibility measurement: auditing how AI systems actually treat your pages is real work. Restructuring pages to hit chunk specs no engine publishes is the part we would call a tax on confusion.

The Chunk-Size Numbers Nobody Can Tie to AI

Chunking advice comes wrapped in oddly specific numbers. From one widely shared SEO guide and its peers:

Keep passages to 100-300 words
Aim for 500-token chunks
Build macro chunks of 300-800 words and micro chunks of 100-200
Place a heading every 200-300 words
Engines extract 40-60 word excerpts

Try to trace any of these to a study of AI retrieval and you come up short. I found no published evidence tying these specific numbers to AI citation gains. What you find instead is recycling: real numbers from the wrong context. The token counts are defaults from RAG configuration docs, settings for a system the publisher does not run. The 40-60 word figure descends from featured-snippet research that measured what Google displays in a snippet box, not what AI systems prefer to retrieve. The heading-frequency rule matches the readability threshold a popular SEO plugin has flagged for years.

Each number is real somewhere. None is a measured property of AI retrieval, and quoting them as AI-citation targets is like telling drivers to inflate their tires to the PSI spec of someone else's truck.

Even the one well-documented benchmark in this space points the other way. NVIDIA tested chunking strategies across five test datasets and found page-level chunking, treating the whole page as the retrieval unit, delivered the highest average answer accuracy (0.648) of everything tested. That is a benchmark for engineers building internal RAG systems; web publishers are not its audience, and NVIDIA's own refinements vary by corpus and query type. But notice what it implies: in the best-documented test available, bigger chunks won. The micro-fragmentation advice points in the opposite direction from its own favorite evidence genre.

And beneath all of it sits tokenization. Mark Williams-Cook's invalid-schema experiment showed AI assistants happily reading deliberately broken JSON-LD because LLM assistants may still use visible text from broken JSON-LD, which is a weak basis for claiming schema directly drives AI citations. That does not mean search systems never parse structured data. His broader point is the one this whole debate needs: "Be ruthless about the bar of evidence."

In our experience tracking which pages get cited across AI engines at geotoolbox (an observation from our tracking data, not a controlled study), the winners are not the pages hitting word-count targets. They are the ones where a retrieved passage happens to contain a complete, specific, sourced answer. The numbers game optimizes the wrong layer.

What the Evidence Actually Supports

The honest reading is not "chunking is fake." Retrieval is real, and AI answers are often grounded in specific information drawn from retrieved pages, sections, or passages. The question is which claims about that hold up, so here is the chunking debate sorted by evidence tier.

Claim	Verdict	Evidence
AI systems may retrieve pages, sections, or passages, then use specific information from them	Confirmed mechanism	Documented across RAG systems; Google's own guide describes reviewing "specific information from those retrieved pages"
Q&A and structured formats match queries better than dense prose	Tested, small scale	Chris Green's embedding test: Q&A format had the highest semantic match in every scenario, dense prose the worst. His own caveat: small controlled test, and vector similarity is not ranking
Self-contained sections survive extraction better	Confirmed mechanism	A retrieved passage is read without its surroundings; a passage that depends on "as mentioned above" arrives broken. This is retrieval logic, not a study result
Specific chunk-size and heading-frequency targets	Claimed, unsourced	No study connects any circulating number to AI citations; each traces to a non-AI context (RAG config defaults, featured-snippet display stats, readability-tool thresholds)
You can pre-chunk your page to control engine cut points	Rejected	Ahrefs' analysis: chunking happens inside model pipelines, every engine slices differently, and you cannot know which strategy any engine runs; in a fixed-size pipeline your section can land mid-chunk or split across two no matter how it is formatted
Good chunking trains AI or prevents hallucinations about your brand	Rejected	Conflates retrieval-time splitting with model training; no mechanism connects your fragment sizes to what a model learns in training

Notice that the two "confirmed" rows describe the engine's behavior and the writing's properties, while everything rejected describes attempts to control the engine. That is the line through this whole topic.

It also resolves the apparent contradiction that confuses people: Google saying "don't chunk" and AI systems demonstrably retrieving passages are both true. The engine chunks. You cannot chunk for it. What you can do is write passages that hold up no matter where the knife falls.

One more disambiguation, because it trips up even careful readers: enterprise teams really do tune chunking for their own internal knowledge bases, and vendors publish guidance for that. That advice is for people who own the pipeline. The public web is somebody else's pipeline, every engine's settings differ, and none of them publish the dials.

What Survives: Structure That Helps Without the Hack

Strip away the chunk-size folklore and a short list of structural habits remains standing:

Answer-first openings, where the first sentence under a heading answers the heading
Self-contained sections that make sense lifted out of the page
Headings that describe what the section actually says
One idea per passage
Tables and numbered lists where the facts are genuinely list-shaped, because comparisons and sequences survive extraction better as structure than as prose

Here is the difference in one before-and-after. Before: "As we covered earlier, the same problem applies to longer documents." After: "Fixed-size chunking also fails for long contracts, because clause boundaries rarely align with token windows." The first sentence dies when a pipeline lifts it away from the page; the second survives anywhere it lands.

These habits survive for a reason worth being precise about: they are properties of the writing, not settings aimed at any engine's chunker. A self-contained passage works whether the pipeline slices at 300 tokens or takes the whole page, because there is no cut point that leaves the reader, human or model, holding a fragment that depends on text it cannot see.

Which is also why they were good advice before AI search existed. Editors have pushed answer-first structure for decades because readers scan.

The result? You do not need a chunking workflow. You need the writing craft, and we have already covered it in depth: our guide to writing passages LLMs cite handles the sentence-level work, and the answer-first restructuring playbook walks through retrofitting existing pages section by section.

The honest caveat: nobody has published before-and-after data on restructuring projects, so anyone who claims a guaranteed citation lift from reformatting is ahead of the evidence.

Measure Instead of Guessing

The chunking debate is really a proxy for a habit: adopting tactics whose effect you cannot observe. You will never see where an engine cut your page. You can see whether engines cite it.

Track citations on a set of pages before and after a rewrite, against pages you left alone. Keep the caveat in view: this is observational data and engines change underneath you, so treat movement as signal rather than proof. It still beats optimizing a layer you cannot observe at all. And being cited without earning clicks is its own measurement problem, one we cover in our guide to tracking AI visibility.

That is the check worth running before any restructuring project. geotoolbox's Content Analyzer gives your pages a Citability and an AI Readability grade, built on the signals that survive this article's evidence table: answer-first capsules, self-contained sections, sourced data, real tables. Fix the pages with a measurable defect instead of reformatting everything to folklore specs. If the chunking argument lands in your team's Slack again, an audit of your own cited and uncited pages ends it faster than another opinion.

FAQ

Does Google's advice apply to ChatGPT and Perplexity too?

Google's guide speaks for Google, including AI Overviews and AI Mode; no other engine has published equivalent guidance. The mechanism argument still generalizes: ChatGPT and Perplexity run their own retrieval pipelines with their own unpublished chunking settings, so there is still no spec to write toward. What differs between engines is which pages they retrieve and cite, not your ability to control their cut points.

Does Google penalize chunked content?

No. Google's guide calls chunking unnecessary, not harmful, and short sections are not a ranking problem. The damage shows up elsewhere: pages diced into fragments to satisfy an imagined extractor read worse for humans, and Danny Sullivan's warning was aimed exactly at publishing machine-bait instead of writing for people.

Should I rewrite my existing pages to fix their chunking?

Not as a chunking project. No before-and-after data exists on re-slicing pages; the closest published evidence, the GEO benchmark study, tested adding material like statistics, quotations, and citations rather than reformatting what was already there. Start from measurement: protect pages that already earn AI citations, then fix pages where the answer to the title question is genuinely buried. A buried answer is a real defect; a 400-word section is not.

What is semantic chunking?

A pipeline technique that splits documents at topic boundaries detected with embeddings, instead of at fixed token counts. Amazon Bedrock offers it as a configurable option, with buffer sizes and breakpoint thresholds set by whoever builds the knowledge base. Like every chunking strategy, it runs engine-side; you cannot opt your pages into it.

What is context chunking?

A family of retrieval-pipeline techniques that keep chunks from arriving orphaned: semantic chunkers can embed each sentence with a buffer of its neighbors, and hierarchical setups retrieve a small child chunk but hand the model its larger parent section. Amazon Bedrock documents both. Either way, it is configured by pipeline builders; a writer never touches it.

Sources

Google: Optimizing your website for generative AI features on Google Search - developers.google.com/search/docs/fundamentals/ai-optimization-guide
Search Engine Land: Google doesn't want you to create bite-sized chunks of your content - searchengineland.com/google-doesnt-want-you-to-create-bite-sized-chunks-of-your-content-467269
Ahrefs: SEO "Chunk Optimization" is Overrated - ahrefs.com/blog/seo-chunk-optimization
Chris Green: How Content Structure Matters for AI Search - chris-green.net/post/content-structure-for-ai-search
NVIDIA: Finding the Best Chunking Strategy for Accurate AI Responses - developer.nvidia.com/blog/finding-the-best-chunking-strategy-for-accurate-ai-responses
AWS Bedrock documentation: How content chunking works for knowledge bases - docs.aws.amazon.com/bedrock/latest/userguide/kb-chunking.html
Mark Williams-Cook: Schema, LLMs and the Low Bar for Evidence in GEO - markwilliamscook.substack.com/p/schema-llms-and-the-low-bar-for-evidence
Bennani & Moslonka: A Systematic Analysis of Chunking Strategies for Reliable Question Answering (arXiv preprint) - arxiv.org/abs/2601.14123
Nielsen Norman Group: How Chunking Helps Content Processing - nngroup.com/articles/chunking

Content Chunking: What It Means and What Google Says to Skip

What Content Chunking Actually Means (Three Different Things)

How Chunking Really Works Inside AI Retrieval

The Engineer's Dials, Not Yours

There Is No Universal Chunk

The Chunking Hack Google Just Told You to Ignore

The Chunk-Size Numbers Nobody Can Tie to AI

What the Evidence Actually Supports

What Survives: Structure That Helps Without the Hack

Measure Instead of Guessing

FAQ

Does Google's advice apply to ChatGPT and Perplexity too?

Does Google penalize chunked content?

Should I rewrite my existing pages to fix their chunking?

What is semantic chunking?

What is context chunking?

Sources

Get GEO insights in your inbox

What Is RAG (Retrieval-Augmented Generation)?

E-E-A-T for AI Search: What Actually Makes Content Citable

AI Content Optimization: Writing Pages LLMs Cite (2026)

Put this into practice.

GEO Scan

Content Analyzer

Domain Overview

What Content Chunking Actually Means (Three Different Things)

How Chunking Really Works Inside AI Retrieval

The Engineer's Dials, Not Yours

There Is No Universal Chunk

The Chunking Hack Google Just Told You to Ignore

The Chunk-Size Numbers Nobody Can Tie to AI

What the Evidence Actually Supports

What Survives: Structure That Helps Without the Hack

Measure Instead of Guessing

FAQ

Does Google's advice apply to ChatGPT and Perplexity too?

Does Google penalize chunked content?

Should I rewrite my existing pages to fix their chunking?

What is semantic chunking?

What is context chunking?

Sources

Get GEO insights in your inbox

More on this topic

What Is RAG (Retrieval-Augmented Generation)?

E-E-A-T for AI Search: What Actually Makes Content Citable

AI Content Optimization: Writing Pages LLMs Cite (2026)

Put this into practice.

GEO Scan

Content Analyzer

Domain Overview