# robots.txt for https://www.example.com/
# Purpose: Block bad bots, allow Google, and rate-limit basic crawlers

# =========================
# Allow Google Crawlers
# =========================
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow:

User-agent: Googlebot-News
Disallow:

User-agent: Googlebot-Video
Disallow:

User-agent: AdsBot-Google
Disallow:

# =========================
# Block Common Bad Crawlers
# =========================
User-agent: AhrefsBot
Disallow: /

User-agent: SEMrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: MauiBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Sogou
Disallow: /

User-agent: spbot
Disallow: /

User-agent: Nutch
Disallow: /

User-agent: Scrapy
Disallow: /

# =========================
# Slow Down Basic Crawlers
# =========================
# Note: Crawl-delay is not part of the robots.txt standard (RFC 9309);
# Googlebot ignores it, though some other crawlers honor it.
User-agent: *
Crawl-delay: 10
Disallow: /tmp/
Disallow: /private/
Disallow: /admin/

# =========================
# Sitemap (important for SEO)
# =========================
Sitemap: https://www.example.com/sitemap.xml