Why Claude Can't Access Your Website: How to Check ClaudeBot, Claude-User, and Crawler Access
Learn why Claude may not be able to access your website, how ClaudeBot and Claude-User differ, and what to check in robots.txt, CDN, WAF, redirects, and crawler logs.
TL;DR
- If Claude cannot access your website, the problem may be crawler access, not page quality.
- You need to distinguish ClaudeBot from Claude-User.
- The most common blockers are robots.txt, CDN/WAF rules, bot challenges, redirects, login walls, and pages that rely too heavily on client-side rendering.
- The useful workflow is: identify the crawler -> check the exact URL -> inspect the response -> fix access blockers -> monitor whether Claude-related crawlers revisit.
Contents
- What "Claude can't access my website" can mean
- ClaudeBot vs Claude-User
- Step 1: check whether the page is publicly fetchable
- Step 2: review robots.txt rules
- Step 3: check CDN, WAF, and bot protection
- Step 4: inspect redirects and status codes
- Step 5: monitor Claude crawler visits over time
- Claude access checklist
What "Claude can't access my website" can mean
When someone says Claude cannot access a website, they may mean one of several things:
- Claude does not mention the website in an answer.
- Claude cannot read a URL pasted into a chat.
- Claude says it cannot access the page.
- Claude gives an outdated answer about the company.
- Claude summarizes a third-party page instead of the source page.
- Claude-related crawlers never show up in logs.
- Claude-related crawlers request the page but receive a bad response.
Those are different problems.
Some are answer-quality issues. Some are retrieval issues. Some are crawler access issues. Some are site infrastructure issues.
The first step is to separate the question:
Can Claude know about this site?
from:
Can Claude-related crawlers actually fetch the page?
For CrawlConsole, the second question is the operational one.
ClaudeBot vs Claude-User
Do not treat every Claude-related request as the same signal.
Two important crawler contexts are:
- ClaudeBot
- Claude-User
The names matter because they can imply different situations.
ClaudeBot is usually described as Anthropic's crawler, which fetches public web content at scale rather than in response to a single user request.
Claude-User is more closely tied to user-triggered retrieval behavior, where someone may ask Claude to access or reason over a specific URL.
That difference changes how you interpret the traffic.
For example:
- If ClaudeBot requests your public blog post, that may be a broader discovery signal.
- If Claude-User requests a specific page after someone shares a URL, that may be closer to user-driven retrieval.
- If neither appears, Claude may be learning about you through other sources, or not discovering the page at all.
- If either crawler receives a 403, 404, redirect loop, or bot challenge, the page may be technically visible to humans but functionally unavailable to Claude.
This is why crawler identity is the first layer of diagnosis.
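Separating the two crawlers can be sketched as a simple match on the User-Agent header. This is a minimal sketch, not an official detection method. It assumes the names ClaudeBot and Claude-User appear as substrings of the header, and the sample strings are made up for illustration. User-Agent headers can be spoofed, so a match is a hint rather than proof of origin.

```python
from typing import Optional

# Assumption: each crawler name appears as a substring of its User-Agent.
# Check the operator's current documentation before relying on these tokens.
CLAUDE_TOKENS = ("ClaudeBot", "Claude-User")


def claude_crawler(user_agent: str) -> Optional[str]:
    """Return the Claude-related crawler name found in a User-Agent, if any."""
    lowered = user_agent.lower()
    for token in CLAUDE_TOKENS:
        if token.lower() in lowered:
            return token
    return None


# Hypothetical User-Agent strings, for illustration only.
print(claude_crawler("ExampleAgent/1.0 (compatible; ClaudeBot/1.0)"))
print(claude_crawler("ExampleAgent/1.0 (compatible; Claude-User/1.0)"))
print(claude_crawler("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0"))
```

Keeping the two names as separate labels, instead of one "Claude" bucket, lets the later steps report broader crawling and user-triggered retrieval separately.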
Step 1: check whether the page is publicly fetchable
Start with the exact URL you care about.
Check whether the page:
- returns 200
- loads without login
- is not blocked by a consent wall
- does not require a human-only browser challenge
- has meaningful server-rendered content
- is not hidden behind a hash route or app shell
- has a canonical tag pointing to itself or the intended canonical URL
- can be reached from internal links
This matters because Claude-related systems may not see your page the way you see it in a browser.
If your browser loads the page after JavaScript, cookies, location redirects, or security checks, that does not prove a crawler received the same content.
The practical question is:
What did the crawler receive on the first request?
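One way to approximate that first response is to fetch the raw HTML without running JavaScript and measure how much readable text it carries. The sketch below uses only the Python standard library. The helper names, the 200-character threshold, and the User-Agent value you pass in are assumptions for illustration. A request from your own machine does not reproduce IP-based rules, so treat the result as a rough check and not as what a real crawler saw.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text outside script, style, noscript, and template tags."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a non-JavaScript client could read from the HTML."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_like_app_shell(html, min_chars=200):
    """Heuristic: True when the raw HTML carries very little readable text."""
    return len(visible_text(html)) < min_chars


def fetch_html(url, user_agent, timeout=10.0):
    """Fetch a URL with a chosen User-Agent and return (status, html).

    urlopen follows redirects, so the status is that of the final response.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses still carry a body worth inspecting.
        return error.code, error.read().decode("utf-8", errors="replace")
```

A typical use is to call fetch_html on the exact URL you care about, then pass the returned HTML to looks_like_app_shell. If the status is 200 but the page looks like an app shell, the content probably depends on client-side rendering.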
Step 2: review robots.txt rules
Next, review robots.txt.
Look for rules that affect:
- ClaudeBot
- Claude-User
- broad User-agent: * rules
- blocked paths such as /blog/, /docs/, /pricing/, or /product/
- accidental disallows from old staging or migration rules
Example problem:
User-agent: *
Disallow: /blog/
That might look harmless if the rule was meant for an old section, but it can block the exact content you want AI systems to discover.
Another common problem is blocking crawlers by broad category without deciding what each crawler should be allowed to access.
The better approach is selective:
- allow public pages that support AI visibility
- block duplicate, private, or expensive paths
- avoid using robots.txt as a security mechanism
- verify actual crawler responses after changes
For a broader crawler lookup workflow, use Web Crawlers to identify crawler names before writing rules.
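The rules above can be tested offline before you deploy them. Here is a minimal sketch using Python's standard urllib.robotparser module. The function name allowed_agents and the example.com URLs are assumptions for illustration. Python's parser may interpret edge cases differently from a given crawler, so the result is a sanity check and not a guarantee of crawler behavior.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt, url, agents=("ClaudeBot", "Claude-User")):
    """Report whether each user agent may fetch the URL under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# The example problem from above: a broad rule that also covers the blog.
BROAD_RULE = """\
User-agent: *
Disallow: /blog/
"""

# A selective variant: ClaudeBot gets its own group that allows the blog.
SELECTIVE_RULE = """\
User-agent: *
Disallow: /blog/

User-agent: ClaudeBot
Allow: /blog/
"""

print(allowed_agents(BROAD_RULE, "https://example.com/blog/post"))
# {'ClaudeBot': False, 'Claude-User': False}
print(allowed_agents(SELECTIVE_RULE, "https://example.com/blog/post"))
# {'ClaudeBot': True, 'Claude-User': False}
```

The second result shows why each crawler needs its own decision: a group written for ClaudeBot does not apply to Claude-User, which still falls back to the broad rule.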
Step 3: check CDN, WAF, and bot protection
robots.txt is only one layer.
A crawler can be allowed by robots.txt and still fail because of:
- Cloudflare or CDN bot rules
- WAF challenges
- rate limits
- geo rules
- user-agent blocks
- IP reputation rules
- JavaScript challenges
- security plugins
- origin server blocks
This is one of the most common reasons a site works for humans but fails for AI crawlers.
The page loads in your browser.
The crawler sees:
- 403 Forbidden
- challenge page
- blocked request
- redirect to a generic page
- empty app shell
- timeout
For Claude access, this matters because a blocked crawler cannot use the content you want it to see.
If you want a page to be eligible for AI discovery, check the actual response Claude-related crawlers receive.
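The outcomes listed above can be turned into a rough classifier for responses you have captured from logs or test requests. This is a heuristic sketch. The labels, the marker phrases, and the 512-character threshold are assumptions, and the cf-mitigated header check only applies to sites behind Cloudflare. Timeouts never produce a response, so they have to be caught around the fetch itself.

```python
# Assumption: phrases that often appear on challenge pages. Tune for your stack.
CHALLENGE_MARKERS = ("just a moment", "verify you are human", "captcha")


def classify_response(status, headers, body, min_chars=512):
    """Heuristically label what a crawler received."""
    normalized = {key.lower(): value.lower() for key, value in headers.items()}
    lowered = body.lower()

    # Challenges are checked first because they often arrive as 403 or 200.
    if normalized.get("cf-mitigated") == "challenge":
        return "challenge"
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return "challenge"
    if status in (401, 403):
        return "blocked"
    if status == 429:
        return "rate_limited"
    if 300 <= status < 400:
        return "redirect"
    if status == 200 and len(body.strip()) < min_chars:
        return "empty"
    if status == 200:
        return "ok"
    return "error"


print(classify_response(403, {"cf-mitigated": "challenge"}, ""))
print(classify_response(403, {}, "Forbidden"))
print(classify_response(200, {}, "<html><body><div id='root'></div></body></html>"))
```

A phrase match can mislabel a normal page that happens to mention a captcha, so review anything labeled "challenge" by hand before changing security rules.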
Step 4: inspect redirects and status codes
Status codes tell you whether the visit was useful.
For each Claude-related request, check:
- exact URL requested
- final URL after redirects
- status code
- response size
- whether the content was the intended page
- whether the crawler hit a mobile, geo, or language redirect
- whether query parameters changed the page
Useful examples:
- ClaudeBot requests /blog/ai-crawler-analytics and receives 200.
- Claude-User requests /docs and receives 200.
- ClaudeBot requests /pricing and receives 403.
- Claude-User requests a product page and gets redirected to the homepage.
The first two are useful access signals.
The last two are problems to investigate.
This is also where comparison across crawlers helps. If GPTBot and OAI-SearchBot can access the same page but Claude-related crawlers cannot, the issue may be specific to a user-agent, security rule, or crawler policy.
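If your server writes access logs in the common "combined" format, the comparison across crawlers can be sketched with a short parser. This is a sketch under assumptions: the regular expression, the crawler list, and the sample log lines (documentation IP addresses, invented timestamps and User-Agent strings) are illustrative, and your log format may differ.

```python
import re

# Assumption: "combined" log format, ending in "referer" "user-agent".
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
CRAWLERS = ("ClaudeBot", "Claude-User", "GPTBot", "OAI-SearchBot")


def crawler_hits(lines):
    """Return (crawler, path, status) for each log line from a known crawler."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue
        agent = match["agent"].lower()
        crawler = next((name for name in CRAWLERS if name.lower() in agent), None)
        if crawler:
            hits.append((crawler, match["path"], int(match["status"])))
    return hits


def status_by_crawler(hits, path):
    """Map each crawler to the last status it received for one path."""
    return {crawler: status for crawler, hit_path, status in hits if hit_path == path}


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /pricing HTTP/1.1" 403 512 "-" "ExampleAgent/1.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.9 - - [01/Jan/2025:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 20480 "-" "ExampleAgent/1.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2025:10:02:00 +0000] "GET /pricing HTTP/1.1" 200 20480 "-" "Mozilla/5.0 Firefox/120.0"',
]

print(status_by_crawler(crawler_hits(SAMPLE_LOG), "/pricing"))
# {'ClaudeBot': 403, 'GPTBot': 200}
```

A result like this one, where one crawler gets 403 and another gets 200 for the same path, points toward a user-agent or security rule and away from a problem with the page itself.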
Step 5: monitor Claude crawler visits over time
Do not check once and stop.
Monitor Claude-related crawler behavior after:
- publishing a new page
- updating a blog post
- adding internal links
- changing robots.txt
- changing WAF or CDN rules
- publishing docs
- launching a product page
- adding WebMCP or agent-readable context
The key questions are:
- did ClaudeBot or Claude-User visit?
- which page did it request?
- what response did it receive?
- did it revisit after the update?
- did it reach related pages through internal links?
- did other AI crawlers find the same page?
This is where crawler access becomes part of Agent Experience.
If Claude can technically access a page, but it never discovers the related docs, product page, or WebMCP context, you may still have a discovery problem.
Use crawler data to decide what to link, fix, allow, block, or monitor next.
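The revisit question can be answered from the same visit data once each visit has a timestamp. The sketch below assumes visits are already parsed into (timestamp, crawler, path) tuples. The function name first_revisits and the sample data are hypothetical.

```python
from datetime import datetime


def first_revisits(visits, path, updated_at):
    """Map each crawler to its first request for `path` after `updated_at`."""
    revisits = {}
    for when, crawler, visited_path in sorted(visits):
        if visited_path == path and when > updated_at and crawler not in revisits:
            revisits[crawler] = when
    return revisits


# Hypothetical visit data, for illustration only.
VISITS = [
    (datetime(2025, 1, 1, 9, 0), "ClaudeBot", "/docs"),
    (datetime(2025, 1, 3, 14, 30), "ClaudeBot", "/docs"),
    (datetime(2025, 1, 4, 8, 15), "Claude-User", "/docs"),
    (datetime(2025, 1, 5, 11, 0), "ClaudeBot", "/docs"),
    (datetime(2025, 1, 5, 12, 0), "ClaudeBot", "/pricing"),
]
UPDATED_AT = datetime(2025, 1, 2, 0, 0)

for crawler, when in first_revisits(VISITS, "/docs", UPDATED_AT).items():
    print(crawler, "first revisited /docs at", when.isoformat())
```

A crawler that is missing from the result has not requested the page since the update, which is the signal to check internal links, robots.txt, or security rules again.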
Claude access checklist
Use this checklist when diagnosing whether Claude can access your website.
Crawler identity
- Did ClaudeBot visit?
- Did Claude-User visit?
- Are you grouping Claude-related crawlers separately from other AI crawlers?
- Are you comparing Claude behavior against GPTBot, OAI-SearchBot, and other crawlers?
Page access
- Does the target page return 200?
- Is it public without login?
- Does it avoid human-only challenges?
- Is it internally linked?
- Is it in the sitemap if it should be discoverable?
- Does it have meaningful content without relying entirely on client-side rendering?
Robots and security
- Does robots.txt allow the intended path?
- Are Claude-related user agents accidentally blocked?
- Are CDN/WAF rules challenging or blocking the crawler?
- Are rate limits or bot protection rules blocking legitimate crawler requests?
- Does the crawler receive the intended page content?
Full-picture monitoring
- Can you map Claude-related crawlers to exact URLs?
- Can you see status codes and redirects?
- Can you identify when access changed?
- Can you monitor revisits after updates?
- Can you connect crawler behavior to internal links, docs, product pages, and WebMCP?
The bottom line: Claude access is not a yes/no question. You need to know which Claude-related crawler visited, which page it requested, what response it received, and whether it came back after the page changed.
