Top AI crawlers, last 30 days
Hits = absolute count. Share = % of total crawler traffic (the 6.7% slice above), not % of all site traffic.
| Bot | Operator | Hits | Share of crawlers | WoW | robots.txt token | Note |
|---|---|---|---|---|---|---|
| GPTBot | OpenAI | 2,847 | 46.7% | +18% | GPTBot | Training + ChatGPT browsing. Aggressive in May 2026. |
| ClaudeBot | Anthropic | 1,612 | 26.5% | +22% | ClaudeBot | Training crawler. Polite, respects crawl-delay. |
| PerplexityBot | Perplexity | 802 | 13.2% | +30% | PerplexityBot | Fastest-growing AI crawler in the current snapshot. Answer-grounding focus. |
| OAI-SearchBot | OpenAI | 412 | 6.8% | +12% | OAI-SearchBot | ChatGPT Search indexing. Separate token from GPTBot. |
| CCBot | Common Crawl | 148 | 2.4% | -5% | CCBot | Feeds many downstream LLMs. Monthly bulk crawl. |
| Claude-Web | Anthropic | 92 | 1.5% | +8% | Claude-Web | Claude's user-initiated web fetch. Different from ClaudeBot. |
| Bytespider | ByteDance | 71 | 1.2% | -20% | Bytespider | Powers Doubao + TikTok search. Heavy traffic on some sites. |
| Applebot-Extended | Apple | 48 | 0.8% | +15% | Applebot-Extended | Apple Intelligence training opt-out signal. |
| FacebookBot | Meta | 32 | 0.5% | +3% | FacebookBot | Meta AI. Lower volume than GPT-class. |
| cohere-ai | Cohere | 19 | 0.3% | ±0% | cohere-ai | Cohere's training fetch. Niche but present. |
| DuckAssistBot | DuckDuckGo | 11 | 0.2% | +25% | DuckAssistBot | DuckDuckGo AI Assist. Just appeared in our logs this snapshot. |
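The Share column can be reproduced from the Hits column. Here is a minimal sketch in Python, assuming the denominator is the sum of the eleven bots listed (that assumption matches the published percentages); the function and variable names are illustrative only:

```python
# Hit counts copied from the table above (last 30 days).
hits = {
    "GPTBot": 2847,
    "ClaudeBot": 1612,
    "PerplexityBot": 802,
    "OAI-SearchBot": 412,
    "CCBot": 148,
    "Claude-Web": 92,
    "Bytespider": 71,
    "Applebot-Extended": 48,
    "FacebookBot": 32,
    "cohere-ai": 19,
    "DuckAssistBot": 11,
}


def crawler_share(hit_counts):
    """Return each bot's share of total AI-crawler hits, as a percentage."""
    total = sum(hit_counts.values())
    return {bot: round(100 * n / total, 1) for bot, n in hit_counts.items()}


shares = crawler_share(hits)
total_hits = sum(hits.values())

# Combined share of the OpenAI and Anthropic tokens in the table.
openai_anthropic = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "Claude-Web"]
combined = round(100 * sum(hits[b] for b in openai_anthropic) / total_hits, 1)

print(total_hits)        # 6094
print(shares["GPTBot"])  # 46.7
print(combined)          # 81.4
```

Summing the four rounded shares from the table gives 81.5%; computing from raw hits gives 81.4%. The difference is rounding.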
What's interesting in this snapshot
OpenAI and Anthropic dominate. GPTBot, ClaudeBot, OAI-SearchBot, and Claude-Web together account for over 80% of AI crawler hits we see. PerplexityBot is the fastest-growing, up 30% week-over-week.
Bytespider down. ByteDance's crawler was aggressive through 2024-2025; in the current snapshot it has cooled off (-20% WoW), consistent with our reading that its robots.txt compliance is improving.
DuckAssistBot just appeared. This is the first snapshot in which it shows up in our logs. Volume is small but up 25% WoW.
Common Crawl roughly flat. CCBot is down 5% WoW, but its volume is tied to Common Crawl's monthly bulk-crawl cadence: it usually arrives as a single multi-day spike rather than steady traffic, so a small week-over-week move is not much of a signal.
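WoW in this section reads as week-over-week percentage change. Here is a minimal sketch of that arithmetic, assuming it compares hits in the most recent 7-day window with the 7 days before it (the exact windows are not stated here). The sample counts are made up for illustration and are not taken from our logs:

```python
from typing import Optional


def wow_change(current_week: int, previous_week: int) -> Optional[float]:
    """Percentage change from previous_week to current_week.

    Returns None when the previous week had no hits, because a change
    from zero is undefined (for example, a bot that has only just appeared).
    """
    if previous_week == 0:
        return None
    return round(100 * (current_week - previous_week) / previous_week, 1)


# Hypothetical weekly hit counts.
print(wow_change(130, 100))  # 30.0
print(wow_change(80, 100))   # -20.0
print(wow_change(5, 0))      # None
```

At low volumes a percentage like this swings easily: going from 4 to 5 hits is already +25%.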
Method
Hits are classified via traffic_class_breakdown into eight Cloudflare-compatible buckets, including ai_crawler (this index), ai_user_action (Claude/ChatGPT fetching for a user, not shown here), and verified_search_bot (Google, Bing, also not shown).
User-agent matching uses our open pattern list plus forward-confirmed reverse DNS (FCrDNS) verification for operators that publish verifiable IP ranges (Anthropic, OpenAI). UA spoofing is mitigated by this reverse-DNS plus forward-DNS round trip: bots claiming to be GPTBot but routing from cloud IPs without proper verification land in unverified_bot, not here.
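The reverse-plus-forward DNS round trip can be sketched as follows. This is an illustration under stated assumptions, not our production classifier: the token "ExampleBot", the suffix ".crawl.example.com", and the function names are all hypothetical, and real operators document their own verification data. The resolvers are passed in as parameters so the logic can be exercised offline.

```python
import socket
from typing import Dict, List, Optional, Tuple

# HYPOTHETICAL mapping of UA token -> hostname suffixes accepted for that bot.
# Placeholder values only; each operator documents its own verification data.
VERIFIED_SUFFIXES: Dict[str, Tuple[str, ...]] = {
    "ExampleBot": (".crawl.example.com",),
}


def reverse_lookup(ip: str) -> str:
    """Reverse DNS (PTR): IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS (IPv4 only): hostname -> list of IP addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def fcrdns_ok(ip, suffixes, reverse=reverse_lookup, forward=forward_lookup) -> bool:
    """True if ip -> hostname -> ip round-trips and the hostname suffix is accepted."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(tuple(suffixes)):
            return False
        return ip in forward(host)
    except OSError:  # socket.herror and socket.gaierror both subclass OSError
        return False


def classify(user_agent, ip, reverse=reverse_lookup, forward=forward_lookup) -> Optional[str]:
    """Return 'ai_crawler' for a verified match, 'unverified_bot' for a spoofed one."""
    ua = user_agent.lower()
    for token, suffixes in VERIFIED_SUFFIXES.items():
        if token.lower() in ua:
            verified = fcrdns_ok(ip, suffixes, reverse, forward)
            return "ai_crawler" if verified else "unverified_bot"
    return None  # no pattern matched; the other buckets are out of scope here


# Offline demo with fake resolvers (documentation-only IP ranges).
def fake_reverse(ip):
    ptr = {"192.0.2.10": "bot-1.crawl.example.com",
           "198.51.100.7": "vm7.cloud.example.net"}
    if ip not in ptr:
        raise OSError("no PTR record")
    return ptr[ip]


def fake_forward(hostname):
    return {"bot-1.crawl.example.com": ["192.0.2.10"]}.get(hostname, [])


genuine = classify("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10",
                   fake_reverse, fake_forward)
spoofed = classify("ExampleBot/1.0", "198.51.100.7", fake_reverse, fake_forward)
print(genuine)  # ai_crawler
print(spoofed)  # unverified_bot
```

The forward lookup matters because whoever controls an IP block can set its PTR record to any hostname; only the owner of the hostname's zone can make the forward record point back at that IP.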
What to do with this data
If you're a site operator:
- Decide whether to Allow: or Disallow: each bot in your robots.txt. For an SEO/discoverability play, allow GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot. Those are the answer-grounding tokens. For an opt-out-of-training play, disallow them.
- Notice if a crawler is over-aggressive (excessive hits, slow page-load impact). Use Cloudflare's bot management or your hosting platform's rate-limiting.
- Cross-reference with our llms.txt explainer. Robots allow plus an llms.txt plus standard SEO is the complete AI-search stack.
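The allow/disallow decision in the first bullet can be written down and checked before you deploy it. Here is a minimal sketch using Python's standard-library urllib.robotparser; the helper build_robots_txt and the example.com URL are illustrative assumptions, and the four tokens are the ones named above:

```python
from urllib.robotparser import RobotFileParser

ANSWER_GROUNDING_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot"]


def build_robots_txt(tokens, allow):
    """Emit one robots.txt group per token, allowing or disallowing the whole site."""
    rule = "Allow: /" if allow else "Disallow: /"
    groups = [f"User-agent: {token}\n{rule}" for token in tokens]
    return "\n\n".join(groups) + "\n"


def can_fetch(robots_txt, token, url):
    """Parse robots.txt text and ask whether `token` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


# Discoverability play: allow all four tokens.
discoverable = build_robots_txt(ANSWER_GROUNDING_TOKENS, allow=True)
# Opt-out play: disallow the same four tokens.
opted_out = build_robots_txt(ANSWER_GROUNDING_TOKENS, allow=False)

print(opted_out)
print(can_fetch(discoverable, "GPTBot", "https://example.com/pricing"))  # True
print(can_fetch(opted_out, "GPTBot", "https://example.com/pricing"))     # False
```

robots.txt is advisory: it only affects crawlers that choose to honor it. Bots not named in the file remain allowed by default.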
Track your own crawler share
Sign up free at mcp-analytics.com, paste the tracking snippet on your site, and ask Claude:
"How much of mysite.com's traffic is ai_crawler?"
"Show me top user agents in the ai_crawler class last 30 days."
"WoW change in GPTBot hits."
Free tier: 100,000 hits/month, unlimited sites, no card. Your contributions feed the aggregate index once we reach launch threshold. No identifiable per-site data is published.