Robots.txt Generator

Generate a perfect robots.txt file for your website. Control which pages search engines can crawl, set crawl delays, add your sitemap location, and manage individual search engine bots. Free and easy to use.


Control What Search Engines Crawl on Your Site

Generate perfect robots.txt files with real-time validation and zero syntax errors

Instant Templates

Start with pre-built templates for WordPress, e-commerce, blogs, or custom configurations

Live Validation

Real-time error detection prevents accidentally blocking your entire site from Google

Zero Syntax Errors

Generated files are perfectly formatted and ready to upload — no typos, no mistakes


Every website needs a robots.txt file — it's your control panel for search engine crawlers. Without it, Google might index admin pages, duplicate content, thank-you pages, or development areas. With a properly configured robots.txt, you control exactly what appears in search results.

But one typo can be disastrous. Write "Disallow: /" by mistake and your entire site disappears from Google. Block CSS files and your mobile SEO tanks. Our generator eliminates these risks with live validation, template presets, and instant error detection. See exactly what you're creating before it goes live.

How to Generate Your Robots.txt File

01

Choose What to Block

Select which pages or folders search engines should NOT crawl. Common choices include admin areas (/wp-admin/), private folders, duplicate content pages, and thank-you pages. You can also choose to allow all pages by default and only block specific sections.
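For example, blocking just those sections while leaving the rest of the site crawlable looks like this (the paths are placeholders; substitute your own folders):

# Block selected sections, allow everything else
User-agent: *
Disallow: /wp-admin/
Disallow: /private/
Disallow: /thank-you/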

02

Add Your Sitemap Location

Enter your XML sitemap URL (usually yourwebsite.com/sitemap.xml). This tells search engines where to find your complete list of pages to crawl. Including your sitemap in robots.txt is a best practice that helps search engines discover your content faster.
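The sitemap reference is a single line that can sit anywhere in the file (yourwebsite.com is a placeholder domain):

# Tell crawlers where the full page list lives
Sitemap: https://yourwebsite.com/sitemap.xml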

03

Set Crawl Delay (Optional)

If your server has limited resources or you want to control bot traffic, set a crawl delay (5-10 seconds typical). Most sites don't need this. Leave blank for unlimited crawling, which is best for most websites wanting maximum visibility.
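If you do need one, the directive attaches to a user-agent group like this (10 seconds is only an example value, and Google ignores Crawl-delay entirely):

# Ask bots to wait 10 seconds between requests
User-agent: *
Crawl-delay: 10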

04

Generate and Upload

Click "Generate Robots.txt" to create your file. Copy the generated code, save it as "robots.txt" (exact name, no capitals), and upload to your website root directory. Test at yourwebsite.com/robots.txt to verify it's accessible.

Powerful Robots.txt Generation Features

Block Pages & Folders

Easily block entire folders or specific pages from search engines. Protect admin areas, private content, and duplicate pages with simple selections.

Allow Specific Pages

Override broader blocks to allow specific pages. Block a folder but allow one public page inside it with explicit allow rules.
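A minimal illustration, using a hypothetical folder and page name:

# Block the folder, but keep one page inside it crawlable
User-agent: *
Disallow: /folder/
Allow: /folder/public-page.html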

Add Sitemap Location

Include your XML sitemap URL in robots.txt. Helps search engines find and crawl your sitemap faster for better indexing.

Set Crawl Delays

Control how fast search engine bots crawl your site. Useful for managing server load or bandwidth on shared hosting.

Multiple User-Agents

Create different rules for different search engines. Block one bot while allowing another, or set different crawl speeds per bot.

Copy-Paste Ready

Generated file is properly formatted and ready to upload. No syntax errors, no typos, no formatting issues—just copy and use.

Understanding Robots.txt Directives

| Directive | Purpose | Example | When to Use |
| --- | --- | --- | --- |
| User-agent | Specifies which bot the rules apply to | User-agent: * | Every robots.txt needs this (use * for all bots) |
| Disallow | Blocks bots from crawling specified paths | Disallow: /admin/ | Block admin areas, private folders, duplicate content |
| Allow | Explicitly permits crawling (overrides Disallow) | Allow: /folder/public.html | Allow specific pages within blocked folders |
| Sitemap | Points to your XML sitemap location | Sitemap: https://site.com/sitemap.xml | Every site should include its sitemap location |
| Crawl-delay | Seconds between bot requests | Crawl-delay: 10 | Only if server load is an issue (most sites skip it) |

Important: Directive names like "User-agent" are not case-sensitive, but the paths you block are: "Disallow: /Admin/" and "Disallow: /admin/" match different URLs. Our generator formats everything consistently and correctly.


When You Need a Robots.txt File

Protect Private Areas

  • Block admin and login pages from search results
  • Prevent indexing of user account areas
  • Hide staging or development sections
  • Protect private file directories
  • Block internal search result pages

Prevent Duplicate Content

  • Block duplicate pages with URL parameters
  • Prevent indexing of print versions of pages
  • Block filtered or sorted product pages
  • Hide pagination duplicates
  • Prevent tag and category page duplication

Optimize Crawl Budget

  • Block low-value pages to focus bot attention
  • Prevent bots from crawling infinite calendars
  • Block resource-heavy pages that strain server
  • Stop bots from crawling search result pages
  • Prevent crawling of thank-you pages

E-commerce Optimization

  • Block shopping cart and checkout pages
  • Prevent indexing of "add to cart" URLs
  • Hide comparison pages with URL parameters
  • Block customer account and order pages
  • Prevent crawling of wishlist URLs

Common Search Engine User-Agents

Different search engines use different "user-agents" to identify themselves. You can create specific rules for each bot or use "User-agent: *" to apply rules to all bots.

Major Search Engine Bots

| Search Engine | User-Agent | What It Crawls |
| --- | --- | --- |
| Google | Googlebot | All content for Google Search |
| Google Images | Googlebot-Image | Images for Google Image Search |
| Google Mobile | Googlebot-Mobile | Mobile versions of pages |
| Bing | Bingbot | All content for Bing Search |
| Yahoo | Slurp | Content for Yahoo Search (results powered by Bing) |
| DuckDuckGo | DuckDuckBot | Content for DuckDuckGo Search |
| Baidu (China) | Baiduspider | Content for Baidu Search |
| Yandex (Russia) | Yandex | Content for Yandex Search |

Other Common Bots

  • Applebot: Crawls for Siri and Spotlight search
  • Facebot: Facebook's crawler for link previews
  • Twitterbot: Twitter's crawler for card previews
  • LinkedInBot: LinkedIn's crawler for shared links
  • PingdomBot: Website monitoring service
  • AhrefsBot: SEO tool crawler (often heavy)
  • SemrushBot: SEO tool crawler

Using Wildcards

User-agent: * means "all bots." This is the most common approach—set rules once and they apply to every search engine. Only use specific user-agents if you need different rules for different bots (rare for most websites).

Robots.txt Best Practices and Common Mistakes

Always Allow Your Homepage

Never block your homepage (/) in robots.txt. This prevents your entire site from appearing in search results. If you see "Disallow: /" in your robots.txt, your site is blocking all search engines from crawling anything.

Don't Use Robots.txt to Remove Indexed Pages

Blocking pages that are already indexed keeps them in search results but prevents updates. Use meta noindex tags or Google Search Console removal tool instead. Robots.txt prevents crawling, not indexing.

Test Before Going Live

Use Google Search Console robots.txt Tester to verify your file works correctly. Test specific URLs to ensure they're blocked or allowed as intended. One typo can block your entire site accidentally.

Don't Block CSS and JavaScript

Never block /css/ or /js/ folders in robots.txt. Google needs these files to properly render pages for mobile-first indexing. Blocking them hurts your mobile search rankings significantly.
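If a broader rule has already caught these assets, an explicit allow can carve them back out. A sketch, assuming a WordPress-style layout where themes live under /wp-content/themes/ (Google and Bing support wildcards in Allow rules):

User-agent: *
Allow: /wp-content/themes/*.css
Allow: /wp-content/themes/*.js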

Use Trailing Slashes Correctly

"Disallow: /folder" blocks /folder and anything starting with /folder. "Disallow: /folder/" blocks only the folder directory. Know the difference to avoid accidentally blocking more than intended.

Include Your Sitemap

Always add your sitemap location to robots.txt. This helps search engines discover your sitemap faster. You can list multiple sitemaps if needed (sitemap index, image sitemap, news sitemap).

Robots.txt Examples for Common Scenarios

Example 1: Basic WordPress Site

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-login.php
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/

Sitemap: https://yourwebsite.com/sitemap.xml

What this does: Blocks admin area but allows admin-ajax.php (needed for some features), blocks login page, hides plugin and theme folders, and includes sitemap location.

Example 2: E-commerce Site

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /*?add-to-cart=
Disallow: /*?orderby=
Disallow: /*?filter=

Sitemap: https://yourwebsite.com/sitemap.xml

What this does: Blocks cart, checkout, and account pages. Prevents indexing of product pages with URL parameters (add-to-cart, sorting, filtering) that create duplicate content.

Example 3: Blog Site

User-agent: *
Disallow: /admin/
Disallow: /search?
Disallow: /*?s=
Disallow: /author/
Disallow: /tag/

Sitemap: https://yourwebsite.com/sitemap.xml
Sitemap: https://yourwebsite.com/news-sitemap.xml

What this does: Blocks admin, search result pages, author archives (if not valuable), and tag pages (can cause thin content issues). Includes both regular and news sitemaps.

Example 4: Site Under Development

User-agent: *
Disallow: /

What this does: Blocks entire site from all search engines. Use this ONLY during development before launch. Remember to update this when you go live or your site will never appear in search results!

Example 5: Allow Everything

User-agent: *
Disallow:

Sitemap: https://yourwebsite.com/sitemap.xml

What this does: Allows all bots to crawl entire site. Empty "Disallow:" means no restrictions. Good for small sites with no private areas or duplicate content concerns.

How to Upload Robots.txt to Your Website

After generating your robots.txt file, you need to upload it to your website's root directory. The file MUST be accessible at yourwebsite.com/robots.txt (not in a subfolder).

Upload via FTP/SFTP

  1. Copy the generated robots.txt code
  2. Create a text file named "robots.txt" (exactly this, lowercase, no .txt.txt)
  3. Paste the code into this file and save
  4. Connect to your website via FTP using FileZilla or similar client
  5. Navigate to the root directory (usually /public_html/ or /www/)
  6. Upload robots.txt to this directory (not in subfolders)
  7. Visit yourwebsite.com/robots.txt to verify it's accessible

Upload via cPanel File Manager

  1. Log into your cPanel hosting account
  2. Click "File Manager"
  3. Navigate to public_html folder (root directory)
  4. Click "New File" or "Upload"
  5. Name it "robots.txt" exactly
  6. Right-click the file and choose "Edit"
  7. Paste your generated robots.txt code
  8. Save and verify at yourwebsite.com/robots.txt

WordPress Specific Instructions

WordPress users have additional options:

  • Manual Upload: Use FTP or cPanel as described above
  • Yoast SEO Plugin: Go to SEO → Tools → File Editor to edit robots.txt directly
  • Rank Math Plugin: Go to Rank Math → General Settings → Edit robots.txt
  • All in One SEO: Go to All in One SEO → Tools → Robots.txt Editor

Shopify

Shopify generates robots.txt automatically. To customize:

  1. Go to Online Store → Themes → Actions → Edit Code
  2. Add a new template: Templates → Add a new template
  3. Select "robots" from template type
  4. Paste your robots.txt code
  5. Save and verify

Wix

Wix also auto-generates robots.txt. To customize:

  1. Go to Settings → SEO Tools
  2. Click "Advanced SEO" at top
  3. Scroll to "Custom robots.txt"
  4. Toggle on and paste your code
  5. Save changes

Testing Your Robots.txt File

After uploading robots.txt, always test to ensure it works correctly and doesn't accidentally block important pages.

Basic Accessibility Test

Visit yourwebsite.com/robots.txt in a web browser. You should see your robots.txt file contents displayed as plain text. If you get a 404 error, the file isn't in the right location or isn't named correctly.

Google Search Console Robots.txt Tester

The most reliable testing tool:

  1. Log into Google Search Console
  2. Select your property (website)
  3. Go to Legacy Tools → robots.txt Tester
  4. See your current robots.txt file displayed
  5. Enter specific URLs in the test box at bottom
  6. Click "Test" to see if URL is blocked or allowed
  7. Green "Allowed" means page can be crawled
  8. Red "Blocked" means page is blocked by robots.txt

What to Test

Test these important URLs to verify configuration:

  • Homepage: Should always be allowed
  • Important product/service pages: Should be allowed
  • Blog posts: Should be allowed
  • Admin pages: Should be blocked
  • Cart/checkout pages: Should be blocked (e-commerce)
  • Search result pages: Should be blocked if configured

Common Testing Mistakes

  • Not waiting for changes: Search engines cache robots.txt (Google for up to 24 hours), so updates may not be recognized immediately
  • Testing wrong URLs: Make sure to test full URLs including https:// and correct domain
  • Assuming blocking = removal: Blocking pages doesn't remove them from search if already indexed
  • Not testing mobile: Some rules may affect mobile crawlers differently

Common Robots.txt Mistakes to Avoid

Mistake 1: Blocking Entire Site by Accident

Wrong:

User-agent: *
Disallow: /

This blocks everything from all search engines. Your site will disappear from search results. Only use this during development, never on live sites.

Mistake 2: Using Robots.txt to Hide Sensitive Information

Robots.txt is publicly accessible—anyone can read it at yourwebsite.com/robots.txt. Don't list secret folders or files thinking you're hiding them. Use proper security (passwords, server configuration) to protect sensitive content.

Mistake 3: Blocking CSS and JavaScript

Wrong: Disallow: /css/ or Disallow: /*.js$

Google needs CSS and JavaScript to render pages properly. Blocking these files hurts mobile SEO significantly. Never block stylesheet or script folders.

Mistake 4: Wrong File Name or Location

File must be named exactly "robots.txt" (all lowercase) and placed in the root directory. Common errors:

  • Robots.txt (capital R) - won't work
  • robot.txt (missing s) - won't work
  • robots.txt.txt (double extension) - won't work
  • Placed in subfolder like /admin/robots.txt - won't work

Mistake 5: Trying to Remove Indexed Pages

If pages are already in Google search results, adding them to robots.txt does NOT remove them. Blocking crawling prevents updates but keeps pages indexed. To remove pages from search, use:

  • Meta robots noindex tag on the page
  • Google Search Console URL removal tool
  • X-Robots-Tag HTTP header

Mistake 6: Forgetting to Update for Live Launch

Many sites block everything during development with "Disallow: /" then forget to update robots.txt when launching. Result: site never appears in search results. Always review robots.txt before launching.

Advanced Robots.txt Tips

Using Wildcards

Most search engines support wildcards for pattern matching:

  • * (asterisk): Matches any sequence of characters. Example: Disallow: /*?s= blocks all URLs containing ?s=
  • $ (dollar sign): Matches end of URL. Example: Disallow: /*.pdf$ blocks all PDF files

Example: Disallow: /*?add-to-cart= blocks all URLs with "add-to-cart" parameter (common in WooCommerce).
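The same patterns collected into one sketch (the parameters shown are common examples, not requirements):

User-agent: *
# Any URL containing ?s= (internal search results)
Disallow: /*?s=
# Any URL ending in .pdf
Disallow: /*.pdf$
# WooCommerce-style add-to-cart links
Disallow: /*?add-to-cart=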

Multiple Sitemaps

You can list multiple sitemap locations in robots.txt:

Sitemap: https://yourwebsite.com/sitemap.xml
Sitemap: https://yourwebsite.com/sitemap-images.xml
Sitemap: https://yourwebsite.com/sitemap-news.xml

Bot-Specific Rules

Create different rules for different bots:

User-agent: Googlebot
Disallow: /private/

User-agent: Bingbot
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow: /

This lets Googlebot and Bingbot crawl everything except /private/, asks Bing to wait 10 seconds between requests, and blocks all other bots entirely.

Dealing with Aggressive Bots

Some SEO tool crawlers can overload servers. Block them individually:

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: *
Disallow:

This blocks specific aggressive bots while allowing legitimate search engines.


Frequently Asked Questions

What is a robots.txt file?

A robots.txt file is a simple text file placed in your website's root directory that tells search engines like Google which pages they can or cannot crawl. It acts as instructions for search engine bots, helping you control which parts of your site appear in search results. For example, you might block admin pages, thank-you pages, or duplicate content from being indexed while allowing product pages and blog posts.

How do I install a robots.txt file on my website?

Create the robots.txt file using our generator, then upload it to the root directory of your website. The file must be accessible at yourwebsite.com/robots.txt (not in a subfolder). For WordPress, use an FTP client or file manager to upload it to the main website folder. For other platforms like Shopify or Wix, check their specific instructions for adding robots.txt files.

What should I block in robots.txt?

Common pages to block include: admin and login pages (/admin/, /login/), thank you and confirmation pages, search result pages (/?s=), duplicate content pages, staging or development areas, private folders with sensitive files, and pages with URL parameters that create duplicate content. Never block your entire site unless intentionally preventing all search engines from indexing it.

Does blocking pages in robots.txt remove them from Google?

No! Robots.txt blocks crawling but does not remove already-indexed pages. If a page is already in Google search results, blocking it in robots.txt keeps it there but prevents Google from updating its information. To remove pages from search, use meta robots noindex tag or Google Search Console URL removal tool, not robots.txt.

Should I add my sitemap to robots.txt?

Yes! Adding your XML sitemap location to robots.txt helps search engines find and crawl your sitemap faster. Add the line "Sitemap: https://yourwebsite.com/sitemap.xml" to your robots.txt file. You can list multiple sitemaps if you have them. This is a recommended best practice for all websites.

What is crawl delay and should I use it?

Crawl delay tells search engine bots to wait a specified number of seconds between page requests. Set crawl delay only if your server struggles with bot traffic or you have bandwidth concerns. Most sites do not need crawl delay. Google ignores crawl delay directives (use Google Search Console instead), but Bing and other search engines respect it. Common values are 5-10 seconds.

Can I block specific search engines like Google or Bing?

Yes! Use specific user-agent directives to target individual search engines. Rules in a group that starts with "User-agent: Googlebot" apply only to Google, and rules under "User-agent: Bingbot" apply only to Bing. However, blocking major search engines means your site won't appear in their search results, significantly reducing your traffic. Only do this if you have a specific reason to exclude certain search engines.
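For example, to exclude one bot while leaving everyone else unrestricted (DuckDuckBot is used here purely for illustration):

User-agent: DuckDuckBot
Disallow: /

User-agent: *
Disallow: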

What is the difference between Allow and Disallow?

Disallow blocks search engines from crawling specified pages or folders. For example, "Disallow: /admin/" blocks everything in the admin folder. Allow explicitly permits crawling within a broader blocked area. For example, if you block "/folder/" but want "/folder/public-page/" crawled, add "Allow: /folder/public-page/". When rules conflict, crawlers follow the most specific (longest) matching rule, and Allow wins when the matches are equally specific.
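As a sketch of how a conflict resolves (hypothetical paths):

User-agent: *
Disallow: /folder/
Allow: /folder/public-page/
# /folder/private.html is blocked; /folder/public-page/ is crawlable,
# because the Allow rule is the longer, more specific match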

How do I test if my robots.txt file is working?

After uploading your robots.txt file, visit yourwebsite.com/robots.txt in a browser to verify it displays correctly. Then use Google Search Console robots.txt Tester tool to check if Google can read it and test specific URLs against your rules. The tester shows if URLs are blocked or allowed, helping you verify your configuration works as intended.

Is this robots.txt generator free?

Yes! Our robots.txt generator is completely free with no registration or limits. Generate perfect robots.txt files for unlimited websites. Create rules for any search engine, set crawl delays, add sitemaps, and download ready-to-use files. All features included at no cost.