Issue: Thin Content (Medium)

Pages with too few words or too little unique text to provide value.

What this means

Thin content pages don't have enough substance to be useful to searchers or AI systems. They're at risk of being filtered out of training data and skipped in search results. The JSON field is thin_content_pages.

Detection condition

Triggered when either condition is met:

  1. The page's word count is below 120.
  2. The page's unique-text ratio is below 18%.

Both thresholds are configurable in seodiff.yaml under thin_content_checks.
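
The check can be pictured with a minimal Python sketch. It is an illustration, not the tool's actual implementation. The tokenizer and the definition of the unique ratio (distinct words divided by total words) are assumptions, as are the function names; only the 120-word and 18% thresholds come from this page.

```python
import re

# Defaults mirror the thresholds quoted in this document.
MIN_WORDS = 120
MIN_UNIQUE_RATIO = 0.18


def text_stats(text: str) -> tuple[int, float]:
    """Return (word_count, unique_ratio) for a page's visible text.

    Assumption: unique_ratio = distinct words / total words.
    The real tool may define it differently.
    """
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    if not words:
        return 0, 0.0
    return len(words), len(set(words)) / len(words)


def is_thin(text: str,
            min_words: int = MIN_WORDS,
            min_unique_ratio: float = MIN_UNIQUE_RATIO) -> bool:
    """Flag the page when either threshold is missed."""
    word_count, unique_ratio = text_stats(text)
    return word_count < min_words or unique_ratio < min_unique_ratio
```

Under these assumptions, a long page that repeats the same few words is flagged even though it clears the word-count threshold, because its unique ratio is low.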

Severity escalation

Condition | Severity
Word count < 50 | Critical: extremely thin, likely broken
Words 50–119 or unique ratio < 18% | Medium: thin but potentially intentional

This issue is suppressed when fewer than 5 pages are sampled, to avoid false positives from small crawls.
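
The escalation and suppression rules can be sketched as follows. This is a hedged illustration: the function names and the input shape (URL mapped to word count and unique ratio) are hypothetical, and only the numeric cutoffs (50 words, 120 words, 18%, 5 sampled pages) are taken from this page.

```python
from typing import Optional

# Below this many sampled pages, the issue is not reported at all.
MIN_SAMPLED_PAGES = 5


def thin_content_severity(word_count: int, unique_ratio: float) -> Optional[str]:
    """Map a page's stats to a severity, or None when the page is not thin."""
    if word_count < 50:
        return "critical"
    if word_count < 120 or unique_ratio < 0.18:
        return "medium"
    return None


def report_thin_pages(pages: dict[str, tuple[int, float]]) -> dict[str, str]:
    """pages maps URL -> (word_count, unique_ratio); returns URL -> severity."""
    if len(pages) < MIN_SAMPLED_PAGES:
        # Small crawl: suppress the issue to avoid false positives.
        return {}
    flagged = {}
    for url, (word_count, unique_ratio) in pages.items():
        severity = thin_content_severity(word_count, unique_ratio)
        if severity is not None:
            flagged[url] = severity
    return flagged
```

Note that the critical check runs first, so a page under 50 words is reported as critical regardless of its unique ratio.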

Impact on scores

Severity weight: 7 (medium) or 20 (critical). Deductions: −20 on Indexability, dampened because this is a heuristic issue. Thin content is a strong signal for search engine quality filters.

Common causes

  1. Category or listing pages with no introductory text or descriptions.
  2. Programmatic SEO (pSEO) pages generated for entries with insufficient data.
  3. Content split across multiple short pages instead of one comprehensive page.
  4. Intentionally brief pages, such as contact or redirect pages.

How to fix

  1. Add substantive, unique content to pages below the 120-word threshold.
  2. For category/listing pages: add introductory text, descriptions, or FAQs.
  3. For programmatic SEO (pSEO) pages: set a minimum data threshold, and don't generate pages for entries with insufficient data.
  4. Consider consolidating multiple thin pages into a single comprehensive page.
  5. If pages are intentionally brief (e.g., contact, redirect), adjust min_words in your config or exclude them from the scan.
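
For the pSEO step, a minimum data threshold might look like the sketch below. Everything here is hypothetical: the entry shape, the "description" field, and the function names are invented for illustration, and the 120-word default simply reuses the threshold quoted above. Adapt it to whatever your page generator actually consumes.

```python
def has_enough_data(entry: dict,
                    required_fields: list[str],
                    min_description_words: int = 120) -> bool:
    """Decide whether an entry carries enough data to deserve its own page."""
    # Every required field must be present and non-empty.
    # Falsy values (None, "", 0) are treated as missing.
    for field in required_fields:
        if not str(entry.get(field) or "").strip():
            return False
    # The hypothetical "description" field must clear the word threshold.
    description = str(entry.get("description") or "")
    return len(description.split()) >= min_description_words


def pages_to_generate(entries: list[dict],
                      required_fields: list[str],
                      min_description_words: int = 120) -> list[dict]:
    """Keep only the entries that pass the minimum data threshold."""
    return [
        entry for entry in entries
        if has_enough_data(entry, required_fields, min_description_words)
    ]
```

Filtering before generation keeps thin pages from being created at all, which is usually simpler than detecting and removing them after a crawl.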