Sunday, March 09, 2008

More Pesky SEO Tools To Block

Seems there is something in Germany called SEO.AG that has been pestering my site for quite some time.

The IP and User Agent it uses are: "SEO[.AG] - Search Engine Optimizer Bot []"
However, they also run a web proxy on so you have to block the IP to stop all the nonsense.

I'm not sure which is worse, the scrapers, proxies, aggregators, or the SEOs and their tools.


Feydakin said...

ok, I blame you for making me actually think about this stuff Bill..

I'm looking at a client's stats and see bots like Yahoo! Slurp China, Voila, and several that are flagged "Unknown robot" by awstats.. I'd like to ask one question in 73 parts :)

1. What is the most effective way of blocking these guys out?? I can see blocking the Slurp China in the robots.txt file with a

User-agent: Yahoo! Slurp China
Disallow: /

But what about finding out who the bad bots are that aren't identifying themselves properly?? Or for that matter, blocking them by IP address / range??

Hmm, that was less than 73 parts..

IncrediBILL said...

I'll give you a 4 part answer.

1. the legit bots will usually go away nicely with a robots.txt block.

2. the buggy legit bots will go away 100% with a backup block in .htaccess

3. the bots that identify themselves as something other than a browser but don't give a rat's ass can only be stopped with blocks in .htaccess or a script. I whitelist, because some use randomized gibberish names and there's no way to stop those other than a whitelist.

4. The bots that claim to be a browser are the worst and require hard core stuff like blocking entire data hosting centers, proxy lists, etc. and then installing scripts to monitor speed and other non-human and flawed behavior.

No truly good solution is available off-the-shelf at the moment to stop #4.
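
For what it's worth, here is a rough sketch of what the script side of #3 and #4 could look like: a user agent whitelist, a block list of IP ranges, and a crude speed check. This is only an illustration, not the script I actually run. The names (check_request, ALLOWED_UA_PREFIXES, BLOCKED_NETWORKS), the example address range, and the thresholds are all made up for the example, and a real setup would need far more than this.

```python
import ipaddress
import time
from collections import defaultdict, deque

# Hypothetical whitelist: only user agents starting with one of these get in.
ALLOWED_UA_PREFIXES = ("Mozilla/", "Opera/")

# Hypothetical blocked ranges (a documentation-only range from RFC 5737).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# Assumed thresholds: more than MAX_HITS requests inside WINDOW_SECONDS
# from a single IP is treated as non-human speed.
MAX_HITS = 10
WINDOW_SECONDS = 10

# Timestamps of recent requests, keyed by IP address.
_recent_hits = defaultdict(deque)


def check_request(ip, user_agent, now=None):
    """Return "allow", or a short "deny: ..." reason string."""
    now = time.time() if now is None else now

    # Block whole ranges (hosting centers, known proxies).
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return "deny: blocked range"

    # Whitelist: anything not on the list is refused, which also
    # catches bots that use randomized gibberish names.
    if not user_agent.startswith(ALLOWED_UA_PREFIXES):
        return "deny: user agent not whitelisted"

    # Speed check: record this hit, drop hits that fell out of the
    # window, and refuse if too many remain.
    hits = _recent_hits[ip]
    hits.append(now)
    while hits and now - hits[0] > WINDOW_SECONDS:
        hits.popleft()
    if len(hits) > MAX_HITS:
        return "deny: too fast"

    return "allow"
```

A request handler would call check_request(ip, user_agent) once per hit and serve an error page for anything that doesn't come back "allow". Note that the speed check only catches the impatient bots; a bot that claims to be a browser and crawls slowly from an unlisted IP walks right through, which is why #4 is still the hard one.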