SEODiff checks access and behavior for five bots that together represent the major AI training and retrieval pipelines:
| Bot | Operator | Purpose | Respects robots.txt |
|---|---|---|---|
| GPTBot | OpenAI | Training data for OpenAI models | Yes |
| ClaudeBot | Anthropic | Training data for Claude models | Yes |
| CCBot | Common Crawl | Open web corpus used by many AI labs | Yes |
| Google-Extended | Google | Gemini training (separate from search) | Yes |
| Googlebot | Google | Search indexing + AI Overviews | Yes |
`Disallow: /` rules for AI bot user-agents.
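As a rough illustration of what a `Disallow: /` check for these user-agents might look like, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not SEODiff's actual implementation. The names `AI_BOTS`, `blocked_bots`, and `SAMPLE` are hypothetical, and the sample robots.txt is invented for the example.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens for the five bots listed in the table (assumed spelling).
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "Googlebot"]


def blocked_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the bots that the given robots.txt text forbids from fetching path."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# Hypothetical robots.txt that blocks two AI crawlers and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(SAMPLE))  # → ['GPTBot', 'CCBot']
```

Note that `urllib.robotparser` applies the first matching rule within a group rather than the longest-match precedence some crawlers use, so results for files that mix `Allow` and `Disallow` rules may differ from how a given bot actually behaves.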