haveibeenpwned.com 64 C
🛡️ SEO 19 🤖 GEO 71 ⚡ Perf 80 🏗️ Arch 100

haveibeenpwned.com — Global SEODiff Score 64/100


The AI-Readiness profile for haveibeenpwned.com is strong: an ACRI of 76/100 places it ahead of 84% of domains in the index. Compared to other developer sites (avg score: 57), haveibeenpwned.com performs above the benchmark, suggesting strong competitive positioning in AI search. The low ghost ratio (5%) confirms that what crawlers see matches what users see — a hallmark of strong SSR implementation. A 5.5× bloat ratio is typical for sites in this tech tier — not wasteful, but streamlining could further boost extractability. The complete absence of JSON-LD schema is a missed opportunity: even basic Organization markup would improve how AI crawlers understand this domain. All major AI bot user-agents (GPTBot, ClaudeBot, CCBot, Google-Extended) are permitted by robots.txt, ensuring broad AI crawler access.

64
C — Global SEODiff Score
Comprehensive search visibility assessment
Strong foundations, but Traditional SEO (19) is your bottleneck.
🎯 Top Fix: Add Organization + WebSite JSON-LD → +5–8 pts
🔬 Automated SEODiff Assessment · Snapshot: Feb 27, 2026
📈 ACRI Trend: 23 snapshots (Feb 22 to Feb 27)
🔔 Recent AI Indexing Activity
No recent changes detected by adaptive crawler.
Does your site score higher than haveibeenpwned.com?
Run the same 40-signal audit on your own domain — free, instant results.
Scan Your Site Free →
🧮 Score Transparency — How is this calculated?
🛡️ Traditional SEO (25% weight): 19 × 0.25 ≈ 4.8
🤖 AI Readiness / GEO (40% weight): 71 × 0.40 = 28.4
⚡ Performance (20% weight): 80 × 0.20 = 16.0
🏗️ Architecture & Trust (15% weight): 100 × 0.15 = 15.0
Weighted sum = 4.8 + 28.4 + 16.0 + 15.0 = 64.2
Global SEODiff Score = 64 (C)
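The weighted sum above can be reproduced in a few lines. This is a minimal sketch that uses only the pillar scores and weights listed in the panel; the dictionary keys and function name are illustrative, and the letter-grade thresholds are not stated on the page, so they are left out.

```python
# Pillar scores and weights copied from the Score Transparency panel.
# The key names are illustrative labels, not SEODiff identifiers.
PILLARS = {
    "traditional_seo": (19, 0.25),
    "ai_readiness_geo": (71, 0.40),
    "performance": (80, 0.20),
    "architecture_trust": (100, 0.15),
}


def global_score(pillars):
    """Return the weighted sum of (score, weight) pairs."""
    return sum(score * weight for score, weight in pillars.values())


if __name__ == "__main__":
    raw = global_score(PILLARS)
    print(f"weighted sum = {raw:.2f}, rounded = {round(raw)}")
```

Note that the unrounded Traditional SEO term is 4.75, so the exact sum is slightly lower than the sum of the rounded terms; both round to 64.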
📊 ACRI Sub-Scores (AI Readiness Detail)
Bot Access: 100 (avg 92)
Rendering: 99 (avg 93)
Structure: 68 (avg 36)
Schema: 0 (avg 9)
Tech Stack: 75 (avg 63)
🔀
Visibility Delta: Google vs AI
Google (Tranco): Top 0.7%, Rank #7199
AI (ACRI): Top 16%, Score 76/100
Gap: +16 pts

haveibeenpwned.com punches above its weight in AI — AI visibility exceeds Google ranking. This is a competitive moat worth protecting. ACRI measures technical crawler readiness. Read the methodology →

Why haveibeenpwned.com ranks here

Tech stack: Cloudflare Pages
Industry: developer
Rendering: SSR
Schema coverage: 0 blocks
Token bloat: 5.5×

Fastest improvements

  • Add basic Organization and WebSite JSON-LD to fix “0 schema blocks” (see Schema Coverage).
  • Reduce token bloat (navigation/footer/code) so agents reach your main content faster (see Token Bloat).
  • Create an llms.txt file so AI crawlers can discover your content structure without heavy crawling.
  • Run a full entropy audit to find which DOM regions waste the most tokens.
🧪

JavaScript Rendering Check

We check what AI crawlers miss when they skip JavaScript execution.

🛡️

Traditional SEO

19/100 · 25% of Global Score · 🟡 Medium Confidence

📝 Title Tag

80 chars
Too long

Optimal range: 30–60 characters for SERP display.

📋 Meta Description

99 chars
Too short

Optimal range: 120–160 characters for snippet control.
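The two length checks above boil down to a range comparison. The sketch below assumes the title tag and meta description match the og:title and og:description values listed under Social / OpenGraph (their lengths are 80 and 99 characters, the same as the reported figures); the function name and range constants are illustrative.

```python
# Ranges quoted in the report: 30-60 chars for titles, 120-160 for descriptions.
TITLE_RANGE = (30, 60)
META_DESCRIPTION_RANGE = (120, 160)


def classify_length(text, bounds):
    """Return 'too short', 'too long', or 'ok' for the given (low, high) range."""
    low, high = bounds
    length = len(text)
    if length < low:
        return "too short"
    if length > high:
        return "too long"
    return "ok"


# Assumption: these OpenGraph strings are also used for <title> and meta description.
title = ("Have I Been Pwned: Check if your email address "
         "has been exposed in a data breach")
description = ("Have I Been Pwned allows you to check whether your email "
               "address has been exposed in a data breach.")

if __name__ == "__main__":
    print(len(title), classify_length(title, TITLE_RANGE))
    print(len(description), classify_length(description, META_DESCRIPTION_RANGE))
```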

🔤 Heading Hierarchy

  • ✗ Exactly 1 <h1> tag — found 0
  • ✓ Has <h2> headings — found 1
  • ✗ <h2> not before <h1>
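The three heading checks can be approximated with Python's standard HTML parser. This is a sketch of one plausible interpretation of the checks, not SEODiff's implementation; the class and function names are hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Record heading tag names in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.headings.append(tag)


def heading_checks(html):
    """Return pass/fail flags for the three heading-hierarchy checks."""
    collector = HeadingCollector()
    collector.feed(html)
    tags = collector.headings
    has_h2 = "h2" in tags
    # The first <h2> counts as "before <h1>" when no <h1> precedes it.
    h2_before_h1 = has_h2 and "h1" not in tags[: tags.index("h2")]
    return {
        "exactly_one_h1": tags.count("h1") == 1,
        "has_h2": has_h2,
        "h2_not_before_h1": not h2_before_h1,
    }
```

A page with no `<h1>` and a single `<h2>`, as reported here, fails the first and third checks and passes the second.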

🔍 Indexability

  • ✗ Canonical tag present
  • ✓ No noindex directive
  • ✓ Meta viewport set
  • ✓ HTML lang attribute → en
  • ✗ Hreflang tags
  • ✓ Googlebot allowed by robots.txt

🌐 Social / OpenGraph

  • ✓ og:title — Have I Been Pwned: Check if your email address has been exposed in a data breach
  • ✓ og:description — Have I Been Pwned allows you to check whether your email address has been exposed in a data breach.
  • ✓ og:image — preview
  • ✓ twitter:card — summary_large_image
📐 How the SEO Pillar score is calculated

SEO Pillar = Title (20 pts) + Meta Desc (20 pts) + Heading Hierarchy (20 pts) + Indexability (20 pts) + Social/OG (20 pts)

Each sub-score is derived from the checks above. Canonical tag, lang attribute, og:image, and a single H1 are the highest-impact items.

🤖

AI Readiness / GEO

71/100 · 40% of Global Score · 🟢 High Confidence

This pillar aggregates citation share, hallucination risk, bot access, schema health, and content extractability. The individual diagnostic sections below contribute to this score.

🔗

Citation Alternatives

Research
💡
Insight: In the developer sector, hikkoshizamurai.jp (ACRI: 88) currently has stronger AI extractability. AI models tend to prefer sources with higher semantic structure and schema coverage. Domains with ACRI < 40 see 3.5× more hallucinations. Read the research →
Your ACRI Score (haveibeenpwned.com): 52
Industry Peer ACRI (hikkoshizamurai.jp): 88
AI models prioritize pages with strong semantic structure and schema coverage. hikkoshizamurai.jp has 5 schema blocks and uses a Custom / Proprietary tech stack. Improve your score by implementing the remediation patches below.
🚨

Hallucination Risk

Research

Is AI lying about your brand? This panel measures how likely LLMs are to hallucinate facts when extracting information from your page.


🤖 Bot Access Matrix

GPTBot (OpenAI): Allowed
ClaudeBot (Anthropic): Allowed
CCBot (Common Crawl): Allowed
Google-Extended: Allowed
Googlebot: Allowed
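A matrix like the one above can be derived from a robots.txt file with Python's standard `urllib.robotparser`. The sketch below works on robots.txt text passed in as a string; the function name is illustrative, and the sample rules in the usage note are hypothetical rather than the site's actual file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the Bot Access Matrix.
BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Googlebot"]


def bot_access_matrix(robots_txt, url="/"):
    """Map each bot name to True (allowed) or False (blocked) for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in BOTS}


if __name__ == "__main__":
    # Hypothetical permissive robots.txt, for illustration only.
    sample = "User-agent: *\nAllow: /\n"
    for bot, allowed in bot_access_matrix(sample).items():
        print(f"{bot}: {'Allowed' if allowed else 'Blocked'}")
```

Passing a file that contains `User-agent: GPTBot` followed by `Disallow: /` would flip only the GPTBot entry to blocked.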

👻 Rendering (Ghost Ratio)

Ghost Ratio: 5% (scale: 0% = safe, 100% = risk)
Status: Server-Side Rendered (Safe)
Rendering Type: SSR

📊 Structure & Information Density

Structure Grade: 68/100 (Good)
Structured Elements: 35 (31 lists, 1 row, 3 headers)
Total Words: 256
Raw Density: 13.7%

🏷️ Schema Health

Organization Schema: ❌ Missing
Product / Service Schema: ⚠️ Not Found
Total Schema Blocks: 0 (no JSON-LD detected)

Schema Coverage Map

0/7 schema types detected
❌ Organization
❌ Product/Service
❌ Breadcrumb
❌ FAQ
❌ Article
❌ WebSite
💡 Organization schema missing. AI models cannot identify your brand entity. Without it, your brand is less likely to appear in Knowledge Panels or be associated with your content.
💡 Product / Service schema missing. AI models don't know this is a SaaS product. Add Product or SoftwareApplication schema so AI understands what you offer and can surface pricing/features.
💡 BreadcrumbList schema missing. AI cannot understand your site hierarchy or how pages relate to each other.
💡 FAQ schema missing. Adding FAQPage schema lets AI models directly extract Q&A pairs for Featured Snippets and chatbot answers.
💡 WebSite schema missing. Add WebSite markup so crawlers can identify the site name and canonical URL. Google retired the Sitelinks Search Box in November 2024, so SearchAction no longer produces that feature.
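To see which schema types a page already declares, the JSON-LD blocks can be pulled out of the HTML and their `@type` values collected. This is a minimal sketch using only the standard library; the class and function names are hypothetical, and it does not handle nested `@graph` structures.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect parsed JSON-LD payloads from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed blocks are skipped, not counted.


def schema_types(html):
    """Return the set of top-level @type values found in JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = set()
    for block in collector.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                value = item["@type"]
                types.update(value if isinstance(value, list) else [value])
    return types
```

An empty result corresponds to the "0 schema blocks" finding reported for this page.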

📐 AI Efficiency Metrics

Extractability: 63/100 (AI models can partially extract answers from this page)
Crawl Cost: Low (10/100), efficient for AI crawlers to process
Blocklist Risk: None (0 of 5 AI crawlers blocked)

Token Bloat

Useful Content: 18% (5.1 KB)
Bloat: 82% (22.7 KB)
Token Bloat Ratio: 5.5× (Normal)
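The page does not state how the bloat ratio is computed. One plausible reading, assumed here, is raw HTML size divided by visible text size, using the 27.7 KB and 5.1 KB figures from the Tech Stack panel. Under that assumption the numbers land close to the reported values:

```python
# Sizes in KB as reported on this page.
RAW_HTML_KB = 27.7
VISIBLE_TEXT_KB = 5.1
BLOAT_KB = 22.7


def useful_share(useful_kb, bloat_kb):
    """Fraction of the payload that is useful content."""
    return useful_kb / (useful_kb + bloat_kb)


def bloat_ratio(raw_kb, visible_kb):
    """Assumed definition: raw payload size per unit of visible text."""
    return raw_kb / visible_kb


if __name__ == "__main__":
    print(f"useful share: {useful_share(VISIBLE_TEXT_KB, BLOAT_KB):.0%}")
    print(f"bloat ratio:  {bloat_ratio(RAW_HTML_KB, VISIBLE_TEXT_KB):.2f}x")
```

The useful share rounds to the reported 18%. The ratio comes out slightly under the reported 5.5×, which suggests the tool rounds differently or measures tokens rather than bytes.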

Multimodal Readiness

Visual Context: 100% Optimized for Vision
Image Alt Coverage: 3 / 3 images have alt text

TDM Rights

TDM-Reservation Header: Not set
X-Robots-Tag noai directive: Not set
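Both signals are plain HTTP response headers, so they can be checked from a header dictionary. The sketch below is an illustration with a hypothetical function name; it assumes the `noai` directive appears as a comma-separated token in `X-Robots-Tag`.

```python
def tdm_signals(headers):
    """Report the TDM-Reservation value and whether X-Robots-Tag contains noai."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    robots_tag = normalized.get("x-robots-tag", "")
    directives = {part.strip().lower() for part in robots_tag.split(",")}
    return {
        "tdm_reservation": normalized.get("tdm-reservation"),  # None when not set
        "noai": "noai" in directives,
    }


if __name__ == "__main__":
    # The only response header value quoted in this report is Cache-Control.
    print(tdm_signals({"Cache-Control": "public,max-age=3600"}))
```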

🔥 Structural Entropy Check

Entropy Score: 0 (Poor)
Token Bloat: High
Noise Ratio: 81.7% · SNR: 0.22 · Signal: 1297 / Noise: 5800 tokens
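The noise ratio and SNR follow directly from the two token counts, assuming the usual definitions (noise over total tokens, and signal over noise). The function names are illustrative.

```python
# Token counts reported by the Structural Entropy Check.
SIGNAL_TOKENS = 1297
NOISE_TOKENS = 5800


def noise_ratio(signal, noise):
    """Share of all tokens that are noise."""
    return noise / (signal + noise)


def signal_to_noise(signal, noise):
    """Signal tokens per noise token."""
    return signal / noise


if __name__ == "__main__":
    print(f"noise ratio: {noise_ratio(SIGNAL_TOKENS, NOISE_TOKENS):.1%}")
    print(f"SNR: {signal_to_noise(SIGNAL_TOKENS, NOISE_TOKENS):.2f}")
```

Both results match the reported 81.7% and 0.22.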

🔬 AI-Crawler Simulation

See your website the way AI crawlers do. CSS stripped, structure labeled, content chunked.

🤖

AI Answer Preview

NEW

See how AI models summarize your site. Left: your actual content. Right: what the LLM extracts and says about you.

🧠

The LLM Interpretation

AI-VERIFIED

A local LLM (mlx-community/gemma-3-4b-it-qat-4bit) analyzed the extracted content of haveibeenpwned.com and produced this structured business intelligence. Fields marked SEMANTIC VOID indicate information the AI could not find — a critical gap in your site’s machine-readability.

Core Offering
Have I Been Pwned is a free resource that helps users assess if their online accounts have been compromised in
Target Audience
Individuals concerned about online security, data breaches, and compromised accounts; cybersecurity professionals; general public.
Pricing Model
Free to use; potential for future premium features (not currently implemented).
🔗 Integration Partners
LinkedIn, Mastodon, Bluesky
🏆 Competitive Moat
Comprehensive database of breached credentials and post-breach analysis, combined with a simple and accessible user interface.
📊 Content Depth
6/10
🔄 Programmatic SEO Signals
FAQ page provides detailed information about breaches and compromised accounts.
⚡ Key Pain Points
• Lack of structured FAQ schema
• Limited SEO optimization for specific breach details
Model: mlx-community/gemma-3-4b-it-qat-4bit · Analyzed: 2026-02-28 · Data extracted from the site’s main content via strict JSON prompting.

🔧 Tech Stack

AI-Readiness Score: 75/100
Server: cloudflare
CDN: cloudflare
HTTP Status: 200
Load Time: 521 ms
Raw HTML Size: 27.7 KB
Visible Text Size: 5.1 KB

Performance & Speed

80/100 · 20% of Global Score · 🟢 High Confidence

⏱️ Time to First Byte

521 ms
Acceptable — room for improvement

Google considers <200 ms "good". AI crawlers may have even shorter timeouts.

📦 Page Weight

DOM nodes: 261
HTML payload: 28 KB
Lean page: fast for bots and users

🗄️ Cache & CDN

  • ✓ Cache-Control header → public,max-age=3600
  • ✓ CDN cache status → HIT
  • ✓ CDN detected → cloudflare

🔬 Tracker Tax

Tracker scripts: 0
Third-party domains: 0
Token overhead: 0.0%
Minimal tracker load: clean signal for bots
📐 How the Performance Pillar score is calculated

Perf Pillar = TTFB (35 pts) + Page Weight (25 pts) + Cache/CDN (20 pts) + Tracker Tax (20 pts)

TTFB <200 ms = full marks. DOM >3000 or payload >300 KB incurs heavy penalties. Tracker scripts beyond 5 reduce score.
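The thresholds quoted above can be expressed as simple flags. The point deductions themselves are not published on this page, so this sketch only reports which thresholds a page crosses; the function and key names are hypothetical.

```python
def performance_flags(ttfb_ms, dom_nodes, payload_kb, tracker_scripts):
    """Flag which of the documented Performance thresholds a page crosses."""
    return {
        "ttfb_full_marks": ttfb_ms < 200,        # TTFB under 200 ms earns full marks
        "heavy_dom": dom_nodes > 3000,           # DOM above 3000 nodes is penalized
        "heavy_payload": payload_kb > 300,       # payload above 300 KB is penalized
        "tracker_penalty": tracker_scripts > 5,  # more than 5 trackers reduces the score
    }


if __name__ == "__main__":
    # Values reported for haveibeenpwned.com in this audit.
    print(performance_flags(ttfb_ms=521, dom_nodes=261, payload_kb=28, tracker_scripts=0))
```

For this page only the TTFB misses its threshold, which is consistent with the "room for improvement" note.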

🏗️

Architecture & Trust

100/100 · 15% of Global Score · 🟢 High Confidence

🗺️ Sitemap & Robots

  • ✓ Sitemap declared in robots.txt → https://haveibeenpwned.com/sitemap.xml
  • ✓ Googlebot allowed
  • ✓ GPTBot allowed
  • ✓ ClaudeBot allowed

🔗 Linking

Internal links: 38
External links: 7
Good internal linking helps crawlers discover content

🔒 Security & Trust

  • ✓ HSTS header (Strict-Transport-Security)
  • ✓ Content-Security-Policy header
  • ✓ HTTP status 200 OK (got 200)

♿ Accessibility Signals

  • ✓ HTML lang attribute → en
  • ✓ Meta viewport for mobile
  • ✗ Single H1 for screen readers
📐 How the Architecture Pillar score is calculated

Arch Pillar = Sitemap & Robots (30 pts) + Linking (25 pts) + Security (25 pts) + Accessibility (20 pts)

Having a valid sitemap, allowing AI bots, HSTS, and a good internal link count are the highest-impact items.

🏅 AI-Verified Trust Badge

Your site scores 52/100. Reach 80+ to unlock the green "AI-Verified" badge. Fix the issues below to improve your score.

AI-Verified badge for haveibeenpwned.com
Pending Audit — score below 80 threshold
<a href="https://seodiff.io/radar/domains/haveibeenpwned.com" rel="noopener"><img src="https://seodiff.io/api/v1/badge?domain=haveibeenpwned.com" alt="AI-Verified by SEODiff" width="280" height="52"></a>

💡 Paste in your site footer, GitHub README, or email signature. Badge updates automatically as your score changes.

Deep Crawl Analysis · 969 pages · Deep-10

Homepage ACRI: 52 (single-page score)
Δ delta: -5 (consistent readability)
Site-Wide ACRI: 48 (avg across 969 pages, range 26–82)
Topical Cohesion: 4% (Topical Drift; TF-IDF cosine similarity)
Total Words: 252826
Avg Bloat: 28.4×
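Topical cohesion is described as TF-IDF cosine similarity. The sketch below shows one common way to compute an average pairwise similarity across pages using only the standard library; the tokenization, the smoothed IDF formula, and the function names are assumptions, since the report does not document them.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def topical_cohesion(documents):
    """Mean pairwise cosine similarity; 0.0 when fewer than two documents."""
    vectors = tfidf_vectors(documents)
    scores = [cosine(a, b) for a, b in combinations(vectors, 2)]
    return sum(scores) / len(scores) if scores else 0.0
```

Pages that share no vocabulary score 0, and identical pages score 1, so a 4% cohesion figure would indicate very little shared weighted vocabulary under this definition.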
Page URL | Page Title | Type | ACRI | Token Bloat | Words | Status
https://haveibeenpwned.com/API/V3 | Have I Been Pwned: API Documentation | pricing | 82 | 3.5× | 6067 | 💰 Pricing
https://haveibeenpwned.com/TermsOfUse | Have I Been Pwned: Terms of Use | pricing | 82 | 3.1× | 4902 | 💰 Pricing
https://haveibeenpwned.com/Dpa | Have I Been Pwned: Data Processing Addendum | pricing | 82 | 3.4× | 2594 | 💰 Pricing
https://haveibeenpwned.com/Privacy | Have I Been Pwned: Privacy Policy | pricing | 82 | 3.8× | 2722 | 💰 Pricing
https://haveibeenpwned.com/FAQs | Have I Been Pwned: Frequently Asked Questions | pricing | 77 | 6.3× | 2692 | 💰 Pricing
https://haveibeenpwned.com/About | Have I Been Pwned: Who, What & Why | pricing | 67 | 10.5× | 591 | 💰 Pricing
https://haveibeenpwned.com/about | Have I Been Pwned: Who, What & Why | pricing | 67 | 10.5× | 591 | 💰 Pricing
https://haveibeenpwned.com/Breach/Zoosk | Have I Been Pwned: Zoosk (2011) Data Breach | pricing | 67 | 14.4× | 540 | 💰 Pricing
https://haveibeenpwned.com/Breach/JustDate | Have I Been Pwned: Justdate.com Data Breach | pricing | 67 | 14.8× | 514 | 💰 Pricing
https://haveibeenpwned.com/Breach/Emotet | Have I Been Pwned: Emotet Data Breach | pricing | 57 | 17.5× | 404 | 💰 Pricing
https://haveibeenpwned.com/Breach/Eroticy | Have I Been Pwned: Eroticy Data Breach | pricing | 57 | 16.1× | 490 | 💰 Pricing
https://haveibeenpwned.com/Breach/FLVS | Have I Been Pwned: Florida Virtual School Data Breach | pricing | 57 | 16.7× | 443 | 💰 Pricing
https://haveibeenpwned.com/Breach/FreedomHostingII | Have I Been Pwned: Freedom Hosting II Data Breach | pricing | 57 | 16.4× | 438 | 💰 Pricing
https://haveibeenpwned.com/Breach/Fridae | Have I Been Pwned: Fridae Data Breach | pricing | 57 | 17.9× | 396 | 💰 Pricing
https://haveibeenpwned.com/Breach/Gab | Have I Been Pwned: Gab Data Breach | pricing | 57 | 17.3× | 418 | 💰 Pricing
https://haveibeenpwned.com/Breach/GenesisMarket | Have I Been Pwned: Genesis Market Data Breach | pricing | 57 | 17.2× | 450 | 💰 Pricing
https://haveibeenpwned.com/Breach/GetRevengeOnYourEx | Have I Been Pwned: Get Revenge On Your Ex Data Breach | pricing | 57 | 17.6× | 415 | 💰 Pricing
https://haveibeenpwned.com/Breach/HookersNL | Have I Been Pwned: Hookers.nl Data Breach | pricing | 57 | 18.1× | 390 | 💰 Pricing
https://haveibeenpwned.com/Breach/Qakbot | Have I Been Pwned: Qakbot Data Breach | pricing | 57 | 17.6× | 398 | 💰 Pricing
https://haveibeenpwned.com/Breach/RetinaX | Have I Been Pwned: Retina-X Data Breach | pricing | 57 | 17.2× | 410 | 💰 Pricing
Showing 20 of 100 pages.
📂
Health by Sub-Directory
Average ACRI and top issues aggregated by URL path prefix
Path | Pages | Avg ACRI | Ghost % | Bloat | Top Issue
/Breach/ | 489 | 50 | 0% | 26.8× | High JS Bloat
/API/ | 1 | 82 | 0% | 3.5× | Healthy
/Dpa/ | 1 | 82 | 0% | 3.4× | Healthy
/about/ | 1 | 67 | 0% | 10.5× | High JS Bloat
/Passwords/ | 1 | 57 | 0% | 16.2× | High JS Bloat
/PwnedWebsites/ | 1 | 56 | 0% | 41.1× | High JS Bloat
/Partners/ | 1 | 54 | 0% | 15.4× | High JS Bloat
/Privacy/ | 1 | 82 | 0% | 3.8× | Healthy
/TermsOfUse/ | 1 | 82 | 0% | 3.0× | Healthy
/FAQs/ | 1 | 77 | 0% | 6.3× | High JS Bloat
/About/ | 1 | 67 | 0% | 10.5× | High JS Bloat
/Subscription/ | 1 | 54 | 0% | 18.8× | High JS Bloat
🔗
Outbound External Citations
0 unique external domains cited across 969 pages
x.com ×969
github.com ×969
linkedin.com ×969
haveibeenpwned.uservoice.com ×969
merch.haveibeenpwned.com ×969
bsky.app ×969
infosec.exchange ×969
facebook.com ×969

Scores update automatically each month. Create a free account for on-demand re-crawls (3/month free).

🔌 API Access

Pull this data programmatically. All sub-page metrics are available via our public API.

curl https://seodiff.io/api/v1/deep10/domain/haveibeenpwned.com
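The same request can be made from Python with the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that it can be called without extra headers (the page mentions an API key but does not say how it is passed). The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint prefix taken from the curl example above.
API_BASE = "https://seodiff.io/api/v1/deep10/domain/"


def deep10_url(domain):
    """Build the Deep-10 endpoint URL for a domain."""
    return API_BASE + quote(domain, safe="")


def fetch_deep10(domain, timeout=10):
    """Fetch and decode the response, assuming the body is JSON."""
    with urlopen(deep10_url(domain), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(deep10_url("haveibeenpwned.com"))
```

Calling `fetch_deep10("haveibeenpwned.com")` performs the network request; the shape of the returned data is not documented here, so inspect it before relying on specific keys.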

Get your free API key — 100 requests/month included.

🔗 Similar developer Sites

Domains with a similar tech stack, industry, and AI readiness profile to haveibeenpwned.com. Compare side-by-side.

Domain | ACRI | AI Score | Tech Stack | Token Bloat | Schema
haveibeenpwned.com (this site) | 52 | 76 | Cloudflare Pages | 5.5× | 0
homenhoney.com | 77 | 15 | Cloudflare Pages | 5.0× | 4
stores.orvis.com | 77 | 81 | Joomla | 2.5× | 0
1-butsudan.jp | 77 | 87 | WordPress | 4.1× | 1
mitre.com | 77 | 89 | Shopify | 5.5× | 2
showmeyourmumu.com | 77 | 89 | Shopify | 5.4× | 2

📊 Semantic Share of Voice

How often would an AI cite haveibeenpwned.com when users ask about topics in this domain's niche? We run entity queries through our 188k-page search index and measure citation probability.


🎭

Bait & Switch Delta

B 15 PAGES

Compares your homepage rendering quality with inner pages. A high drift score means AI crawlers see a polished homepage but degraded inner content — the "bait & switch" that erodes trust.

Homepage ACRI: 67
Inner Avg ACRI: 45
ACRI Delta: +22
Homepage Ghost: 0%
Inner Avg Ghost: 17%
Drift Score: 20
Worst Inner Pages (ACRI | Ghost | Type | URL)
39 | 20% | pricing | https://haveibeenpwned.com/Breach/ApexSMS
39 | 20% | pricing | https://haveibeenpwned.com/Breach/APOIAse
54 | 20% | pricing | https://haveibeenpwned.com/Partners
🛡️

E-E-A-T Trust Signals

D 25/100

Trust indicators extracted from surface pages. These signals help AI systems verify your site's Experience, Expertise, Authoritativeness, and Trustworthiness.

Physical Address
Phone Number
Email Contact
About Page
Contact Page
Privacy Policy
Terms of Service
Named Leadership
🔗

Citation Profile

31 DOMAINS

Outbound citation patterns across surface-crawled pages. Sites that cite diverse, authoritative sources signal higher E-E-A-T to AI systems.

Total Links: 157
Unique Domains: 31
Avg/Page: 10.5
Diversity: 20%
haveibeenpwned.uservoice.com, linkedin.com, github.com, x.com, merch.haveibeenpwned.com, facebook.com, bsky.app, infosec.exchange, 1password.com, troyhunt.com
🏘️ Outbound Neighborhood Trust · Avg Trust: 33.3

AI trust scores for the domains haveibeenpwned.com links to. Citing high-trust sources lifts your own credibility signal.

🩹

Remediation Patches

COPY-PASTE

Auto-generated code fixes tailored to haveibeenpwned.com. Review each patch, fill in the placeholder values, and paste it into your codebase to improve AI visibility. The patches are designed to increase extraction accuracy.

Add Organization JSON-LD
High Impact ⏱ 5 min
AI models cannot identify your brand entity without Organization schema. This is the #1 fix for AI visibility.
html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Have I Been Pwned",
  "url": "https://haveibeenpwned.com",
  "logo": "https://haveibeenpwned.com/apple-touch-icon.png",
  "sameAs": []
}
</script>
Add WebSite + SearchAction JSON-LD
High Impact ⏱ 5 min
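Describes the site to crawlers as a WebSite entity with a search endpoint. Google retired the Sitelinks Search Box in November 2024, so this markup no longer produces that feature, and the auto-generated "target" URL should be checked against the site's real search route before deploying.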
html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Have I Been Pwned",
  "url": "https://haveibeenpwned.com",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://haveibeenpwned.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>
Add FAQ Schema
Medium Impact ⏱ 10 min
FAQ schema lets AI models directly extract Q&A pairs. This is the easiest way to get featured in AI responses.
html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Have I Been Pwned?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Add your answer here: describe what Have I Been Pwned does in 1-2 sentences."
      }
    },
    {
      "@type": "Question",
      "name": "How does Have I Been Pwned work?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Explain the key features and how users interact with Have I Been Pwned."
      }
    }
  ]
}
</script>
📈

Projected Impact

ROI EST.

If you apply the patches above, here's the estimated improvement for haveibeenpwned.com:

Current Score: 76
Projected Score: 92
Improvement: +16 pts
  • Add Organization schema: +6 pts
  • Add WebSite schema: +4 pts
  • Reduce token bloat: +3 pts
  • Add FAQ schema: +3 pts

*Estimates based on SEODiff's scoring model. Actual results depend on implementation quality.

📋 Data Export

Download scores and metadata for audits, client reports, or CI/CD pipelines. All data is generated automatically and updated with each crawl. Exports contain computed metrics only (no copyrighted content).

Is this your company?

Monitor your AI visibility score weekly and get alerted when changes happen.

Start Free →

🧭 Self-Diffing (Private Layer)

For owned domains, combine this world snapshot with private drift + regression history.
Template Drift: Track in My Site
Drift → Traffic Impact: In development, coming soon
Regression Incidents: Track in My Site
Internal Linking: Deep Audit graph
Semantic Structure: GEO view in Deep Audit
Content Quality: Thin/duplicate tracking

🕒 History

Score over time: Available in My Site history
Drift events: Template timeline + incidents
Drift → Revenue Attribution: Coming soon
Schema/rendering/extractability changes: Tracked per scan in project history