Issue: Duplicate Clusters Medium

Groups of 2 or more pages with near-identical content within the scanned sample.

What this means

SEODiff groups pages with content similarity above the configured threshold (default 92%) into clusters. Each cluster represents a set of pages that are near-duplicates of each other. The JSON field is duplicate_clusters.

Detection condition

Triggered when DuplicateClusterSize ≥ 2 and a cluster ID is present. The maximum number of similarity comparisons is capped at 1,500 pairs by default. Suppressed when fewer than 10 pages are sampled.

Impact on scores

Severity weight: 7. Deductions: −12 on Content, dampened as heuristic. Large clusters amplify the impact because they affect multiple pages.

How to fix

Review each cluster in your SEODiff report to understand what makes the pages similar.
Differentiate content by adding unique data, sections, or context per page.
Consolidate truly duplicate pages behind a single canonical URL.
For pSEO: review your template to ensure sufficient unique data injection per page.

Issue: Duplicate Clusters Medium

What this means

Detection condition

Impact on scores

How to fix

Related