A page has high text similarity with another page on your site, and the unique information it provides (Information Gain Score) is too low to justify its existence as a separate page. The JSON field is similarity_risk.
Triggered when both conditions are met:
SimilarityMaxPercent > 0: the page was compared and found similar to another page.
SimilarityKind == "exact" OR InfoGainScore < 45
The similarity threshold defaults to 92% (configurable). Near-duplicate similarity is directional for programmatic pages; SEODiff only counts it as an issue when InfoGainScore(page) < 45. Exact duplicates always count regardless of InfoGain.
Suppressed when fewer than 10 pages are sampled.
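The trigger and suppression rules above can be sketched as a single predicate. This is a minimal illustration, not SEODiff's implementation: the PageSimilarity class, the has_similarity_risk function, the sampled_pages parameter, and the "near" kind value used in the usage lines are hypothetical names introduced here. Only the three field names and the values 45 and 10 come from the text. The sketch also assumes the 92% similarity threshold has already been applied upstream, so that SimilarityMaxPercent is nonzero only for pages that matched.

```python
from dataclasses import dataclass

INFO_GAIN_FLOOR = 45     # from the text: InfoGainScore < 45
MIN_SAMPLED_PAGES = 10   # from the text: suppressed below 10 sampled pages


@dataclass
class PageSimilarity:
    """Hypothetical container mirroring the fields named in the text."""
    similarity_max_percent: float  # SimilarityMaxPercent
    similarity_kind: str           # SimilarityKind; only "exact" is documented
    info_gain_score: float         # InfoGainScore


def has_similarity_risk(page: PageSimilarity, sampled_pages: int) -> bool:
    """Return True when the page would be flagged under the stated rules."""
    # Suppression: too few sampled pages to judge similarity.
    if sampled_pages < MIN_SAMPLED_PAGES:
        return False
    # Condition 1: the page was compared and found similar to another page.
    if page.similarity_max_percent <= 0:
        return False
    # Condition 2: exact duplicates always count; near-duplicates count
    # only when the page adds too little unique information.
    return page.similarity_kind == "exact" or page.info_gain_score < INFO_GAIN_FLOOR


# Usage: a near-duplicate with low information gain is flagged,
# while the same page with a higher score is not.
flagged = has_similarity_risk(PageSimilarity(95.0, "near", 30), sampled_pages=50)
spared = has_similarity_risk(PageSimilarity(95.0, "near", 60), sampled_pages=50)
```

Checking suppression first keeps small crawls from producing flags at all, which matches the order in which the rules are stated.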
Severity weight: 7. Deductions: −18 on Indexability (dampened as a heuristic), −14 on Content. Pages flagged for similarity risk are candidates for consolidation or differentiation.