RAG systems power AI search products like Perplexity, ChatGPT web search, and Google AI Overviews. They work by splitting web pages into chunks, embedding them as vectors, and retrieving the most relevant chunks to answer questions. This tool shows you exactly how your page gets chunked — and whether those chunks make sense.
The tool fetches your page's HTML, extracts the main content text, and tokenizes it with a word-based tokenizer that approximates GPT-style token counts.
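A word-based approximation like this one can be sketched in a few lines. The 0.75 words-per-token ratio below is a common rule of thumb, not the tool's exact tokenizer:

```python
import re

def approx_token_count(text: str) -> int:
    """Approximate a GPT-style token count from a word split.

    Assumption: one GPT token is roughly 0.75 words, a widely used
    heuristic. The tool's actual tokenizer may differ slightly.
    """
    words = re.findall(r"\S+", text)
    return round(len(words) / 0.75)

print(approx_token_count("RAG systems split pages into chunks."))  # 6 words -> 8
```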
The content is recursively split into chunks at three standard sizes used by real RAG systems:
| Chunk Size | Use Case |
|---|---|
| 256 tokens | Fine-grained retrieval — precise answers to specific questions |
| 512 tokens | Balanced — most common in production RAG systems |
| 1024 tokens | Coarse retrieval — broad context for complex questions |
Splitting is recursive: first by headings (H1/H2/H3), then by paragraphs, then by sentences if paragraphs exceed the chunk size.
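The heading → paragraph → sentence cascade can be sketched as follows. This is a minimal illustration, not the tool's implementation: token counts are approximated by word counts, and Markdown `#` headings stand in for H1/H2/H3 boundaries.

```python
import re

def recursive_split(text: str, max_tokens: int) -> list[str]:
    """Recursively split text: headings, then paragraphs, then sentences."""
    def tokens(s: str) -> int:
        return len(s.split())  # crude stand-in for a real tokenizer

    def split_by(chunks: list[str], pattern: str) -> list[str]:
        out = []
        for chunk in chunks:
            if tokens(chunk) <= max_tokens:
                out.append(chunk)  # already small enough; stop recursing
            else:
                out.extend(p.strip() for p in re.split(pattern, chunk) if p.strip())
        return out

    pieces = [text]
    # 1. heading starts, 2. blank-line paragraph breaks, 3. sentence ends
    for pattern in (r"(?m)^(?=#{1,3} )", r"\n\s*\n", r"(?<=[.!?])\s+"):
        pieces = split_by(pieces, pattern)

    # Greedily re-merge adjacent pieces back up toward the target size
    merged: list[str] = []
    for p in pieces:
        if merged and tokens(merged[-1]) + tokens(p) <= max_tokens:
            merged[-1] += "\n\n" + p
        else:
            merged.append(p)
    return merged
```

The greedy re-merge at the end keeps chunks close to the target size instead of leaving a trail of single-sentence fragments.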
Each chunk set is scored starting from 100, with penalties for quality issues:
| Penalty | Points | Meaning |
|---|---|---|
| Mid-sentence split | −5 per chunk | A sentence was cut in half between chunks, losing meaning. |
| Structural split | −3 per chunk | A list, table, or code block was split across chunks. |
| Tiny chunk (<50 tokens) | −2 per chunk | Chunk too small to carry meaningful information. |
| Oversized chunk | −3 per chunk | Chunk exceeds the target size significantly. |
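Scoring from the penalty table reduces to a single pass over the chunk set. The boolean flag names below are illustrative assumptions; the real tool sets equivalent markers during splitting:

```python
def health_score(chunks: list[dict]) -> int:
    """Score a chunk set starting from 100, per the penalty table.

    Assumed (hypothetical) per-chunk fields: mid_sentence_split,
    structural_split, oversized, and token_count.
    """
    score = 100
    for c in chunks:
        if c.get("mid_sentence_split"):
            score -= 5  # sentence cut in half between chunks
        if c.get("structural_split"):
            score -= 3  # list/table/code block split across chunks
        if c.get("token_count", 0) < 50:
            score -= 2  # tiny chunk
        if c.get("oversized"):
            score -= 3  # significantly exceeds the target size
    return max(score, 0)  # floor at 0, since the score is defined as 0-100
```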
```
GET /api/chunking?url=https://example.com/page
```
- `chunks_256`, `chunks_512`, `chunks_1024` — arrays of chunk objects with `text`, `token_count`, and `boundary_type`
- `health_score` — 0–100 overall chunk health
- `total_tokens` — total token count of the extracted content
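A response shaped like those fields can be consumed as plain JSON. The values below are made up for illustration; fetch real data from the endpoint above:

```python
import json

# Hypothetical response body matching the documented fields.
raw = json.dumps({
    "chunks_512": [
        {"text": "Intro paragraph...", "token_count": 120, "boundary_type": "heading"},
        {"text": "Details section...", "token_count": 480, "boundary_type": "paragraph"},
    ],
    "health_score": 95,
    "total_tokens": 600,
})

data = json.loads(raw)
# Find the smallest chunk, a likely "tiny chunk" penalty candidate.
smallest = min(data["chunks_512"], key=lambda c: c["token_count"])
print(data["health_score"], smallest["boundary_type"])  # 95 heading
```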