The Science of ACRI

How we prove that technical structure predicts AI retrieval success — backed by Shadow RAG calibration on real-world data.

1,166,517 Domains Analyzed · 37.4 Average ACRI · 4 Signal Pillars · ρ > 0 Correlation Proven

What is Shadow RAG?

A Shadow RAG is a controlled, private replica of the retrieval pipelines used by AI systems like ChatGPT, Perplexity, and SearchGPT. We build our own retrieval system — index pages, run queries, measure what gets found — to empirically validate that ACRI predicts real-world AI visibility.

The Experiment

  1. Sample 1,000+ domains across all ACRI tiers (stratified by score)
  2. Embed each page's Golden Semantic String using sentence-transformers
  3. Index into a FAISS vector store (same tech used by AI companies)
  4. Query with 1,000+ synthetic queries across 5 types
  5. Measure Recall@K, MRR, and per-domain Retrieval Score
  6. Correlate ACRI ↔ Retrieval Score with Spearman ρ

The ACRI Formula

ACRI = E^0.35 · S^0.25 · C^0.20 · R^0.20
E — Extractability (35%): Can AI extract clean content? (Token bloat, JS risk, bot access)
S — Semantic Structure (25%): Does content map into embeddings? (Headings, schema, link graph)
C — Content Integrity (20%): Unique, non-thin, information-rich content? (Duplicate rate, depth)
R — Retrieval Robustness (20%): LLM-friendly chunks? (Cluster density, hub structure)

The weighted geometric mean ensures that a single weak pillar drags the entire score down — because AI systems need all signals working together.
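The weighted geometric mean is a one-liner to compute. A minimal sketch; the pillar values below are invented to show the effect of a single weak pillar:

```python
def acri(e, s, c, r):
    """ACRI as the weighted geometric mean of the four pillar scores (0-100).
    The weights sum to 1, so a domain with every pillar at x scores exactly x."""
    return (e ** 0.35) * (s ** 0.25) * (c ** 0.20) * (r ** 0.20)

balanced = acri(80, 80, 80, 80)  # uniform pillars pass straight through: 80.0
weak_e = acri(10, 80, 80, 80)    # one weak pillar drags the whole score down
```

For comparison, an arithmetic weighted mean of the second domain would still be 0.35·10 + 0.65·80 = 55.5; the geometric mean pulls it below 40, matching the intuition that AI systems need all signals working together.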

Key Findings

Higher ACRI = Higher Retrieval

Domains with ACRI > 90 are retrieved 3–4× more accurately than domains with ACRI < 30 in our Shadow RAG experiments. The Spearman rank correlation is positive and statistically significant, with 95% bootstrap confidence intervals that exclude zero.
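A check of this kind can be sketched with a percentile bootstrap over the rank correlation. This is an illustrative reimplementation, not the study's code: the Spearman helper ignores rank ties for brevity, and the ACRI/retrieval data below are synthetic.

```python
import numpy as np

def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks (ties ignored for brevity)."""
    rank = lambda a: np.argsort(np.argsort(a)).astype(float)
    return float(np.corrcoef(rank(x), rank(y))[0, 1])

def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for Spearman rho."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = [spearman(x[idx], y[idx])
             for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

# Synthetic data: retrieval score loosely tracks ACRI plus noise.
rng = np.random.default_rng(42)
acri_scores = rng.uniform(0, 100, 200)
retrieval = acri_scores + rng.normal(0, 15, 200)
lo, hi = bootstrap_ci(acri_scores, retrieval)
# A lower bound above zero means the positive correlation survives the 95% CI.
```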

Token Bloat is the #1 Killer

Our ablation study shows that reducing token bloat (HTML noise relative to useful text) has the largest single impact on retrieval probability. Clean HTML = better embeddings.

Schema Boosts Semantic Matching

Structured data (JSON-LD) helps embedding models place your content in the right semantic neighborhood, improving retrieval for entity and comparison queries.

Evaluation Metrics

Recall@K = (queries where correct page is in top K) / (total queries)
MRR = (1/N) · Σ (1 / rank_i)

We evaluate at K = 1, 5, and 10. Statistical significance is assessed via Spearman ρ with 1,000-iteration bootstrap confidence intervals, plus partial correlation controlling for domain authority (Tranco rank).
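The two metrics above map to a few lines of code. A minimal sketch: `ranks` holds the 1-based rank of the correct page for each query, with `None` when it was not retrieved at all (the example values are invented).

```python
def recall_at_k(ranks, k):
    """Fraction of queries whose correct page appears in the top K results."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank; a query whose page is never retrieved contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)

ranks = [1, 3, None, 2, 1]      # rank of the correct page per query
r_at_1 = recall_at_k(ranks, 1)  # 2/5 = 0.4
r_at_5 = recall_at_k(ranks, 5)  # 4/5 = 0.8
score = mrr(ranks)              # (1 + 1/3 + 0 + 1/2 + 1) / 5 ≈ 0.567
```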

Reproducibility

The entire pipeline is open and reproducible: full code, data logs, and Jupyter notebooks are available in the repository.

Read the Full Whitepaper · View ACRI Leaderboard