Medium — Crawl Access
Blocked by robots.txt
Your robots.txt file is preventing Google (and other crawlers) from accessing pages you want indexed. This is one of the simplest GSC errors to fix — but one of the most costly when misconfigured, because blocked pages can never rank.
Common robots.txt Mistakes
- Blocking entire directories. `Disallow: /blog/` when you only meant to block one page.
- Wildcard overkill. `Disallow: /*?` blocks all URLs with query parameters, including legitimate paginated content.
- Staging rules in production. `Disallow: /` left over from a dev/staging environment. This blocks the entire site.
- Blocking CSS/JS resources. Google needs access to CSS and JS files to render pages. Blocking `/wp-content/` or `/static/` breaks rendering.
- User-agent mismatches. Rules under `User-agent: Googlebot` and `User-agent: *` behave differently: a crawler obeys only the most specific group that matches it, and typos in user-agent names fail silently.
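The first mistake is easy to demonstrate with Python's standard-library `urllib.robotparser` (a minimal sketch; the site and URLs are hypothetical). Note that `robotparser` implements classic prefix matching, so it cannot validate Google's `*`/`$` wildcard extensions like `Disallow: /*?`:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt with an over-broad directory rule.
rules = """\
User-agent: *
Disallow: /blog/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# The rule blocks every URL under /blog/, not just one page.
print(rp.can_fetch("*", "https://example.com/blog/draft-post"))    # False
print(rp.can_fetch("*", "https://example.com/blog/popular-post"))  # False
print(rp.can_fetch("*", "https://example.com/about"))              # True
```

If the intent was to block a single draft, the rule should name that path (`Disallow: /blog/draft-post`) rather than the whole directory.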
How to Fix It
- Audit your robots.txt. Use the robots.txt report in GSC (which replaced the standalone robots.txt Tester) to verify which URLs are blocked.
- Use SEODiff's Crawl Access Checker to test access for Googlebot, GPTBot, ClaudeBot, and other crawlers simultaneously.
- Be specific. Instead of blocking entire directories, use precise path patterns.
- Allow CSS/JS/images. Never block `/static/`, `/assets/`, or `/wp-content/themes/`.
- Test before deploying. Always test robots.txt changes against your URL list before pushing to production.
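The "test before deploying" step can be automated. A sketch of a pre-deploy check using `urllib.robotparser` (the `MUST_ALLOW` list, `blocked_urls` helper, and candidate file are all hypothetical names for illustration):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical list of URLs that must stay crawlable after the change.
MUST_ALLOW = [
    "https://example.com/blog/popular-post",
    "https://example.com/static/app.js",
]

def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the candidate robots.txt would block for `agent`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [u for u in urls if not rp.can_fetch(agent, u)]

candidate = """\
User-agent: *
Disallow: /blog/drafts/
Disallow: /static/
"""

for url in blocked_urls(candidate, MUST_ALLOW):
    print("BLOCKED:", url)  # fail the deploy if anything prints
```

Here the check would flag `/static/app.js`, catching the CSS/JS mistake before it reaches production. Wiring this into CI so a non-empty result fails the build is one straightforward design choice.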
robots.txt vs meta robots vs noindex
These are different mechanisms with different effects:
- robots.txt Disallow — Prevents crawling. Google can't see the page at all. If external links point to it, Google may still index the URL (with no snippet).
- meta robots noindex — Allows crawling but prevents indexing. Google sees the page but won't show it in results.
- X-Robots-Tag: noindex — Same as meta robots, but set via HTTP header. Works for non-HTML files (PDFs, images).
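For the header-based variant, the server attaches the directive to the HTTP response itself. For example, a response serving a PDF you want crawled but not indexed might look like this (illustrative fragment; the exact headers depend on your server):

```
HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex
```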
Important: If you block a page in robots.txt AND have a noindex meta tag, Google will never see the noindex tag (because it can't crawl the page). The noindex is effectively invisible.