Tool: AI Chunking Simulator

Simulates how RAG (Retrieval-Augmented Generation) systems split your content into chunks for vector search.

What it does

RAG systems power AI search products such as Perplexity, ChatGPT web search, and Google AI Overviews. They split web pages into chunks, embed each chunk as a vector, and retrieve the most relevant chunks to answer a question. This tool shows how your page is likely to be chunked and whether the resulting chunks still make sense on their own.
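To make the chunk, embed, retrieve loop concrete, here is a toy sketch. It assumes a bag-of-words count vector as a stand-in for a learned embedding model, which real RAG systems would use instead; the function names `embed`, `cosine`, and `retrieve` are hypothetical and not part of this tool.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts. A real system would call
    # a learned embedding model here (assumption for illustration only).
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank chunks by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

A chunk that was cut mid-thought shares fewer meaningful terms with the questions it should answer, which is why chunk boundaries affect what gets retrieved.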

How it works

Step 1: Extract & tokenize

Fetches your page HTML, extracts the main content text, and tokenizes it using a word-based tokenizer (approximating GPT-style token counts).
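A minimal sketch of this step, assuming a standard-library HTML pass and a regex tokenizer that counts each word and each punctuation mark as one token. The tool's actual extractor and tokenizer are not documented here, so `extract_text`, `approx_tokens`, and the list of skipped tags are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping tags that rarely hold main content."""

    SKIP = {"script", "style", "nav", "header", "footer"}  # assumed heuristic

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def approx_tokens(text: str) -> list[str]:
    # Word-based approximation: words and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)
```

Counts from a tokenizer like this will differ somewhat from a real subword tokenizer, so treat them as estimates.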

Step 2: Split at 3 sizes

The content is recursively split into chunks at three standard sizes used by real RAG systems:

Chunk Size  | Use Case
256 tokens  | Fine-grained retrieval: precise answers to specific questions
512 tokens  | Balanced: most common in production RAG systems
1024 tokens | Coarse retrieval: broad context for complex questions

Splitting is recursive: first by headings (H1/H2/H3), then by paragraphs, then by sentences if paragraphs exceed the chunk size.
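The recursion described above can be sketched as follows. This assumes the page has already been divided into heading-level sections (one string per section) and that paragraphs are separated by blank lines; `count_tokens`, `split_section`, and `split_document` are hypothetical names, and the greedy merging of small pieces is an assumption rather than the tool's documented behavior.

```python
import re


def count_tokens(text: str) -> int:
    # Word-based approximation of a token count.
    return len(re.findall(r"\w+|[^\w\s]", text))


def _pack(pieces: list[str], max_tokens: int) -> list[str]:
    """Greedily merge adjacent pieces while the result fits in max_tokens."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if count_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # a single oversized sentence stays oversized
    if current:
        chunks.append(current)
    return chunks


def split_section(text: str, max_tokens: int) -> list[str]:
    if count_tokens(text) <= max_tokens:
        return [text]
    pieces = []
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if count_tokens(para) <= max_tokens:
            pieces.append(para)
        else:
            # Paragraph too large: fall back to sentence boundaries.
            pieces.extend(re.split(r"(?<=[.!?])\s+", para))
    return _pack(pieces, max_tokens)


def split_document(sections: list[str], max_tokens: int) -> list[str]:
    # Sections come from heading boundaries and are never merged together.
    chunks = []
    for section in sections:
        chunks.extend(split_section(section, max_tokens))
    return chunks
```

Calling `split_document(sections, size)` once for each `size` in `(256, 512, 1024)` would produce the three chunk sets compared by the tool.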

Step 3: Score chunk health

Each chunk set starts at a score of 100 and loses points for every quality issue found:

Penalty                 | Points       | Meaning
Mid-sentence split      | −5 per chunk | A sentence was cut in half between chunks, losing meaning.
Structural split        | −3 per chunk | A list, table, or code block was split across chunks.
Tiny chunk (<50 tokens) | −2 per chunk | Chunk too small to carry meaningful information.
Oversized chunk         | −3 per chunk | Chunk exceeds the target size significantly.
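As a worked example of the penalty table, the sketch below scores one chunk set. Several details are assumptions because the text does not specify them: "significantly" oversized is taken to mean more than 1.25 times the target size, the score is floored at 0, mid-sentence splits are detected by checking the final character, and the structural-split flag is assumed to be set by the splitter. The names `Chunk` and `chunk_health` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    token_count: int
    splits_structure: bool = False  # assumed to be flagged by the splitter


def ends_mid_sentence(text: str) -> bool:
    # Heuristic (assumption): a chunk that does not end with sentence
    # punctuation was probably cut in the middle of a sentence.
    return not text.rstrip().endswith((".", "!", "?"))


def chunk_health(chunks: list[Chunk], target_size: int,
                 oversize_factor: float = 1.25) -> int:
    score = 100
    for chunk in chunks:
        if ends_mid_sentence(chunk.text):
            score -= 5  # mid-sentence split
        if chunk.splits_structure:
            score -= 3  # list, table, or code block split across chunks
        if chunk.token_count < 50:
            score -= 2  # tiny chunk
        if chunk.token_count > target_size * oversize_factor:
            score -= 3  # oversized chunk (threshold is an assumption)
    return max(score, 0)  # floor at 0 is an assumption
```

For instance, with a target of 512 tokens, a set containing one clean 120-token chunk, one 40-token chunk cut mid-sentence (−5 and −2), and one 700-token chunk that splits a table (−3 and −3) would score 87 under these assumptions.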

How to improve chunk health

API endpoint

GET /api/chunking?url=https://example.com/page
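A hedged sketch of calling the endpoint from Python with the standard library. The host is not given in this document, so `base_url` is a placeholder you must supply, and `build_chunking_url` and `fetch_chunking` are hypothetical helper names. The response is parsed as JSON without assuming any particular fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_chunking_url(base_url: str, page_url: str) -> str:
    # Percent-encode the page URL so its own ":" and "/" characters
    # are not confused with the endpoint's query string.
    query = urlencode({"url": page_url})
    return f"{base_url.rstrip('/')}/api/chunking?{query}"


def fetch_chunking(base_url: str, page_url: str, timeout: float = 30.0):
    # Performs the GET request and returns the decoded JSON body.
    with urlopen(build_chunking_url(base_url, page_url), timeout=timeout) as resp:
        return json.load(resp)
```

Encoding the `url` parameter matters when the target page has its own query string; otherwise its `&` and `=` characters would be read as extra parameters of the API call.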

JSON output