
Robots.txt Generator

Generate a robots.txt file to control how search engines and AI crawlers access your website. Presets for WordPress, Shopify, Laravel and more. 100% free.

CMS Presets · Block AI Bots · Allow Rules · Crawl-delay · Multiple Sitemaps

Website URL

Your homepage URL — used to auto-fill the Sitemap field

Quick Presets

User-agent: * — Rules for All Crawlers

Allow Paths

Use to create exceptions inside a blocked folder, e.g. /wp-admin/admin-ajax.php. Leave empty to use the default Allow: /.

Folders to Disallow


Files to Disallow

Crawl-delay (optional)

seconds between requests

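⚠️ Google ignores Crawl-delay. Googlebot sets its own crawl rate, and the crawl rate limiter in Google Search Console was retired in January 2024. To slow Googlebot down temporarily, have your server return 500, 503, or 429 responses.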

Block Specific Bots (optional — each gets its own User-agent block)

Click Block AI Bots above to auto-fill with all major AI crawlers, or add individual bot names.

Sitemap URLs (optional)


What Is a robots.txt File?

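A robots.txt file is a plain text file placed in the root directory of your website (e.g. https://yoursite.com/robots.txt) that tells search engine crawlers and other bots which pages or sections they may and may not crawl. It is the first file most bots check before they begin crawling. Compliance is voluntary: well-behaved crawlers follow the rules, but robots.txt is not an access control mechanism and does not stop a bot that chooses to ignore it.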

A minimal, correct robots.txt looks like this:

User-agent: *
Allow: /
Disallow: /wp-admin/

Sitemap: https://yoursite.com/sitemap.xml
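To make the structure of that file concrete, here is a minimal Python sketch of how a generator might assemble it from a few inputs. The function name build_robots_txt and its parameters are hypothetical, chosen for illustration; this is not the code behind the tool on this page.

```python
def build_robots_txt(disallow, allow=(), sitemaps=(), user_agent="*"):
    """Assemble robots.txt text for one user-agent group plus sitemap lines."""
    lines = [f"User-agent: {user_agent}"]
    # With no explicit exceptions, fall back to a blanket "Allow: /".
    for path in (allow or ["/"]):
        lines.append(f"Allow: {path}")
    for path in disallow:
        lines.append(f"Disallow: {path}")
    if sitemaps:
        lines.append("")  # blank line separates the group from the sitemap lines
        lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines) + "\n"


text = build_robots_txt(
    disallow=["/wp-admin/"],
    sitemaps=["https://yoursite.com/sitemap.xml"],
)
print(text)
```

Called with these arguments, the sketch reproduces the minimal file shown above.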

Key robots.txt Directives Explained

User-agent: Specifies which bot the following rules apply to. Use * for all bots, or a specific name like Googlebot or GPTBot.

Disallow: Tells the bot not to crawl the specified path. Disallow: /private/ blocks the entire private folder.

Allow: Overrides a Disallow rule for a more specific path. Useful to allow one file inside a blocked folder.

Crawl-delay: Requests the bot to wait N seconds between requests. Google ignores this directive and sets Googlebot's crawl rate automatically; the crawl rate limiter in Google Search Console was retired in January 2024.

Sitemap: Points crawlers directly to your XML sitemap. Can appear multiple times for multiple sitemaps.
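How an Allow rule overrides a Disallow rule is easiest to see in code. The sketch below implements the precedence documented by Google and RFC 9309: the longest matching path wins, and Allow wins a tie. It assumes plain prefix matching only (no * or $ wildcards), and the function name is_allowed is hypothetical.

```python
def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs, e.g. ("Disallow", "/private/").
    The longest matching prefix wins; on a tie, Allow wins.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty value matches nothing
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]
print(is_allowed("/wp-admin/admin-ajax.php", rules))  # True
print(is_allowed("/wp-admin/options.php", rules))     # False
print(is_allowed("/blog/", rules))                    # True
```

Not every parser follows this rule. Some apply the first matching line instead, so it is safest to write rules that give the same result either way.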

Should You Block AI Bots?

Since 2023, a wave of AI crawlers has been scraping the web to collect training data for large language models. Major AI companies such as OpenAI, Anthropic, and ByteDance now operate their own crawlers, and most of them state that they respect the robots.txt standard.

If you don't want your content used to train AI models, you can add dedicated User-agent blocks to your robots.txt:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

Use the Block AI Bots preset above to generate rules for all major AI crawlers in one click: GPTBot (OpenAI), ClaudeBot (Anthropic), CCBot (Common Crawl), Bytespider (ByteDance, the parent company of TikTok), Amazonbot, PerplexityBot, and more.
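If you maintain the bot list yourself, the per-bot blocks are simple to generate and check. The following is a rough sketch using Python's standard-library urllib.robotparser; the helper name block_bots is hypothetical, and the list contains only the bot names mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the list above; extend as needed.
AI_BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Bytespider", "Amazonbot", "PerplexityBot"]


def block_bots(bots):
    """Return one 'User-agent / Disallow: /' group per bot name."""
    return "\n".join(f"User-agent: {bot}\nDisallow: /\n" for bot in bots)


robots_txt = block_bots(AI_BOTS)

# Sanity-check the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://yoursite.com/blog/"))     # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/blog/"))  # True
```

The check confirms only that the rules parse as intended. Whether a given crawler honors them is up to its operator.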

Frequently Asked Questions

Will robots.txt prevent my pages from appearing in Google?

Not necessarily. Blocking a page in robots.txt prevents Google from crawling it, but if another site links to it, Google may still index the URL and show it in results without a description. To keep a page out of search results entirely, add a <meta name="robots" content="noindex"> tag to the page itself and make sure the page is not blocked in robots.txt, because Google has to crawl the page to see the tag.

Where do I place my robots.txt file?

Upload robots.txt to the root of your domain so that it is accessible at https://yoursite.com/robots.txt. It cannot be placed in a subdirectory, and each subdomain needs its own file. WordPress serves a virtual robots.txt by default when no physical file exists, and most SEO plugins let you edit it from the dashboard.
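Because the location is fixed, the robots.txt URL that applies to any page can be derived from the page's own URL. Here is a minimal sketch with Python's urllib.parse; the function name robots_txt_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/blog/post?id=7"))
# https://yoursite.com/robots.txt
print(robots_txt_url("https://shop.yoursite.com/cart"))
# https://shop.yoursite.com/robots.txt
```

The second call shows why a subdomain needs its own file: crawlers look for robots.txt on the exact host they are about to crawl.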

What's the difference between robots.txt and noindex?

robots.txt controls whether a bot can crawl (visit) a page. noindex controls whether a page appears in search results. If you block a URL in robots.txt, Google can't read its noindex tag. Use noindex for pages you want removed from results, and robots.txt for pages you don't want crawlers to spend crawl budget on.

Can I have multiple User-agent sections?

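Yes. You can have separate rule sets for each bot. A block starts with one or more User-agent: lines and continues until the next User-agent: line. A crawler follows only the block that matches its name most specifically and falls back to the * block when none does, so a bot with its own block ignores the rules under User-agent: *. This lets you allow all crawlers but block specific ones, or apply different Crawl-delay values to different bots.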
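The effect of separate blocks can be checked with Python's standard-library urllib.robotparser. This is a sketch for illustration only: the bot names and delay values are assumptions, and real crawlers may differ in which directives they honor.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5

User-agent: Bingbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.crawl_delay("Bingbot"))       # 10
print(parser.crawl_delay("SomeOtherBot"))  # 5
# Bingbot has its own block, so the Disallow under * does not apply to it.
print(parser.can_fetch("Bingbot", "https://yoursite.com/private/"))       # True
print(parser.can_fetch("SomeOtherBot", "https://yoursite.com/private/"))  # False
```

If a named bot should also stay out of /private/, repeat the Disallow line inside that bot's block.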