Thin content pages don't have enough substance to be useful to searchers or AI systems. They're at risk of being filtered out of training data and skipped in search results. The JSON field is thin_content_pages.
Triggered when either condition is met:
- Word count is below the min_words threshold (default: 120 words)
- Unique ratio is below the min_unique_ratio_percent threshold (default: 18%)

Both thresholds are configurable in seodiff.yaml under thin_content_checks.

| Condition | Severity |
|---|---|
| Word count < 50 | Critical — extremely thin, likely broken |
| Words 50–119 or unique ratio < 18% | Medium — thin but potentially intentional |
This issue is suppressed when fewer than 5 pages are sampled, to avoid false positives from small crawls.
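
The trigger, severity, and suppression rules above can be sketched as follows. This is a minimal illustration and not the tool's actual implementation: the function names, the regex tokenizer, and the definition of unique ratio (distinct lowercase words divided by total words) are assumptions made for the example. Only the numeric defaults come from the text.

```python
import re
from typing import Optional

# Defaults taken from the text; names mirror the config keys.
MIN_WORDS = 120
MIN_UNIQUE_RATIO_PERCENT = 18.0
CRITICAL_WORDS = 50      # "Word count < 50" row; assumed to be a fixed cut-off
MIN_SAMPLED_PAGES = 5    # below this sample size the issue is suppressed


def unique_ratio_percent(words: list[str]) -> float:
    """Assumed definition: distinct lowercase words / total words, in percent."""
    if not words:
        return 0.0
    return 100.0 * len({w.lower() for w in words}) / len(words)


def classify_page(
    text: str,
    min_words: int = MIN_WORDS,
    min_unique_ratio_percent: float = MIN_UNIQUE_RATIO_PERCENT,
) -> Optional[str]:
    """Return 'critical', 'medium', or None for one page's visible text."""
    words = re.findall(r"\w+", text)  # naive tokenizer (assumption)
    if len(words) < CRITICAL_WORDS:
        return "critical"
    too_short = len(words) < min_words
    too_repetitive = unique_ratio_percent(words) < min_unique_ratio_percent
    if too_short or too_repetitive:
        return "medium"
    return None


def thin_content_pages(pages: dict[str, str]) -> dict[str, str]:
    """Map URL -> severity for thin pages; empty when the sample is too small."""
    if len(pages) < MIN_SAMPLED_PAGES:
        return {}  # suppressed to avoid false positives from small crawls
    flagged = {}
    for url, text in pages.items():
        severity = classify_page(text)
        if severity is not None:
            flagged[url] = severity
    return flagged
```

Note that a page with plenty of words can still be flagged as medium if its vocabulary is highly repetitive, because either condition alone is enough to trigger the issue.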
Severity weight: 7 (medium) or 20 (critical). Deductions: −20 on Indexability, dampened as a heuristic issue. Thin content is a strong signal for search engine quality filters.
If the flagged pages are intentionally short, lower min_words in your config or exclude them from the scan.