Can I add Oh Dear to my robots.txt?

Yes. Our crawler follows robots.txt, so you can tell it which parts of your site to stay out of, same as you would for Googlebot or any other well-behaved bot.

A typical robots.txt

Most robots.txt files look something like this:

User-agent: *
Disallow:

An empty Disallow line means nothing is off limits: every crawler that respects robots.txt may visit every page.
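If you'd like to sanity-check a robots.txt file before deploying it, Python's standard-library urllib.robotparser can evaluate the rules locally. A minimal sketch, using a placeholder example.com URL:

```python
from urllib.robotparser import RobotFileParser

# The permissive default shown above: empty Disallow allows everything.
rules = """\
User-agent: *
Disallow:
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Any crawler, any path -> allowed
print(parser.can_fetch("OhDear", "https://example.com/any/page"))  # True
```

The parser applies the same matching logic a well-behaved crawler does, so it's a quick way to confirm a rule set behaves as intended.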

Adding rules for Oh Dear

To limit what we crawl, add a block for our user agent:

User-agent: OhDear
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /admin/

This tells our crawler to skip those paths while leaving other bots untouched. You can use any disallow rules you like.
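To verify that an Oh Dear-specific block behaves the way you expect, you can test it locally with Python's urllib.robotparser again (the paths below match the example above; the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: OhDear
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /admin/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Oh Dear is kept out of the disallowed paths...
print(parser.can_fetch("OhDear", "https://example.com/admin/users"))  # False
# ...but may still crawl the rest of the site,
print(parser.can_fetch("OhDear", "https://example.com/blog/post"))  # True
# and other bots are unaffected.
print(parser.can_fetch("Googlebot", "https://example.com/admin/users"))  # True
```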

To block us completely:

User-agent: OhDear
Disallow: /
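The same local check confirms a full block (again with a placeholder example.com URL):

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: OhDear", "Disallow: /"])

# Oh Dear may no longer fetch anything on the site.
print(parser.can_fetch("OhDear", "https://example.com/"))  # False
```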

Ignoring robots.txt (if you need to)

Sometimes you want us to crawl pages that robots.txt blocks for everyone else, for example a staging site you still want monitored. In that case, open your monitor's broken links or mixed content settings and toggle Respect robots.txt off. We'll then crawl the full site regardless of your robots.txt rules.

Want to see our user agent in full? Here's exactly what we send.

