Robots.txt Checker

Validate your robots.txt syntax and test whether a given URL path is allowed or blocked for Googlebot, Bingbot, GPTBot, or any custom user-agent.

About robots.txt

The robots.txt file tells search engine crawlers which pages or sections of your site they may or may not crawl. It controls crawling, not indexing. It lives at the root of your domain: https://example.com/robots.txt.

  • User-agent: * applies to all crawlers. Use a specific agent name to target one crawler.
  • Disallow: /path blocks every URL whose path begins with /path. Allow: /path explicitly permits it.
  • More specific rules (longer paths) take precedence over shorter ones.
  • robots.txt is advisory: well-behaved crawlers follow it, but malicious bots may ignore it.
  • Always add a Sitemap: directive pointing to your XML sitemap.
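
The precedence rule above can be sketched in a few lines of Python. This is a minimal illustration, not the checker's actual implementation: the function name is_allowed and the (directive, prefix) rule format are assumptions made for the example, wildcards (* and $) are not handled, and ties between equally long rules are resolved in favor of Allow, which is how Google documents its behavior.

```python
def is_allowed(rules, path):
    """Decide whether `path` may be crawled under one user-agent group.

    `rules` is a list of (directive, prefix) pairs, e.g. ("disallow", "/private/").
    Hypothetical format for this sketch; wildcards are not supported.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty rule value matches nothing
        is_allow = directive.lower() == "allow"
        # The longest matching prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("disallow", "/private/"), ("allow", "/private/public/")]
print(is_allowed(rules, "/private/public/page.html"))  # True
print(is_allowed(rules, "/private/secret.html"))       # False
print(is_allowed(rules, "/blog/"))                     # True
```

Note that not every parser resolves conflicts this way. Some older implementations apply the first matching rule in file order, so it is safest to write rules that give the same answer under both interpretations.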

FAQ

Does Disallow: / block my entire site?

Yes. Under User-agent: *, Disallow: / tells every compliant crawler to skip all pages on your site. This is common on staging environments but should not be deployed on a production site that you want crawled.

Can I allow Googlebot but block other bots?

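Yes. Use a specific User-agent: Googlebot block with Allow: /, and a wildcard User-agent: * block with Disallow: / for all others. A crawler follows only the most specific group that matches its name, so Googlebot uses its own block and ignores the wildcard one.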
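
One way to check such a setup locally is Python's standard-library urllib.robotparser. The sketch below is only an illustration: the robots.txt content and the example.com URL are made up for the example, and this parser's matching logic is simpler than Google's, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: allow Googlebot everywhere, block everyone else.
robots_txt = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse in-memory text, no network access

url = "https://example.com/products/"  # placeholder URL
print(parser.can_fetch("Googlebot", url))  # True
print(parser.can_fetch("Bingbot", url))    # False
print(parser.can_fetch("GPTBot", url))     # False
```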

Does robots.txt stop pages from appearing in Google?

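No. Blocking crawling prevents Google from reading the content, but the URL can still appear in results if other sites link to it. To prevent indexing, use a noindex meta tag instead, and leave the page crawlable: if robots.txt blocks the page, Google never fetches it and cannot see the tag.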
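
To confirm that a page actually carries a noindex directive, you can scan its HTML for a robots meta tag. The following is a rough sketch using Python's built-in html.parser; the class and function names are invented for this example, and it only inspects meta tags, not the X-Robots-Tag HTTP header, which can also carry noindex.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots (or googlebot) meta tag requests noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            tokens = [t.strip().lower() for t in attr.get("content", "").split(",")]
            # "none" is shorthand for "noindex, nofollow".
            if "noindex" in tokens or "none" in tokens:
                self.noindex = True


def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))  # True
```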

What is GPTBot?

GPTBot is OpenAI's web crawler, used to train AI models. You can block it with User-agent: GPTBot followed by Disallow: /.
