Saturday, February 18, 2006

Cornucopia of Random User Agent Strings

When my bot buster first started operating, I noticed a few gibberish user agent strings now and then. I'm sure the theory behind this is that if a website is blocking known user agents, you can skirt past that technique with a string of gibberish.

The problem is that they've noticed nothing is getting through, and random user agent string usage against my site is escalating to the point that it's hysterical to watch them thrashing.

A small sampling of the thousands from the other day:

66.148.68.37 2uigq2oecesvv2nwso rwiakBsBue Bobgw2nuB
202.125.44.200 efeSthqvkr11ticgo1iovjjrdwakbbd
66.148.68.34 emwx4cxnd pedafhfpac
66.148.68.34 ymdexin7xpebtulwnxew
202.125.44.199 pepgfu wjdjqrxckulhwiflmrdsmkc mjvldn
84.180.94.183 mairwthe Ifirpl8tiwotwyi lsu
84.180.94.183 r9Hreiynmkxmpjh ioHmmknpdmid
66.148.68.34 ewoqaohlcegoD emkdywx
66.148.68.34 obtDrqhxogxsewDfcDktb
209.190.21.100 bedmdFjkFhc4a noFjajakffieapvngdtpwxk
209.190.21.100 gdouk6Ss6nnykg66hvojc6txjsecuu
209.190.21.100 aphErvbtijj vulgctlslo
209.190.21.100 jgbhwntsdlprxcwogijI8orrw b8
209.190.21.101 DrbspcgyubxrpeikfiihxD mh
209.190.21.101 jvAhnviAjwwud8gymvewtcqhehgbAcytyqdxq
209.190.21.101 cvwkvl6kfujhqlujqblFl dffrepmrxdspmdFjq
209.190.21.101 obmJJkjtslbqreh6pwx6epruhptrpJbk

This is why I keep preaching that blocking by user agent only works for the legit crawlers that are willing to let you block them. The scrapers aren't playing by any of the old rules, and I'm shocked these idiots didn't just use a browser string, which at least has a better chance of getting a handful of pages from my web site until they get too greedy.

Sorry, webmasters, but the rules of this game have changed, and you really need to block all non-browser agents and allow legit crawlers like Google, Yahoo and MSN by IP only. Any other method is just wasting your time, as random user agents cannot be stopped by your old traditional techniques.

The blacklist is out, the whitelist is in.
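
For anyone wondering what that whitelist logic might look like, here is a rough sketch in Python. It is not the code my bot buster runs, just an illustration under some loud assumptions: the crawler names and IP ranges are made-up placeholders (reserved documentation networks, not real search engine ranges), and the browser pattern and crawler tokens are simplistic guesses. The function names `is_whitelisted_crawler` and `allow_request` are hypothetical too.

```python
import ipaddress
import re

# ASSUMPTION: placeholder crawler networks using reserved documentation
# ranges. Real crawler ranges must come from your own verified list.
CRAWLER_NETWORKS = {
    "examplebot": [ipaddress.ip_network("192.0.2.0/24")],
    "otherbot": [ipaddress.ip_network("198.51.100.0/24")],
}

# ASSUMPTION: tokens a user agent uses when it claims to be a crawler.
CRAWLER_TOKENS = re.compile(r"googlebot|slurp|msnbot", re.IGNORECASE)

# ASSUMPTION: a deliberately crude test for "looks like a browser".
BROWSER_PATTERN = re.compile(r"^(Mozilla|Opera)/\d+\.\d+ \(")


def is_whitelisted_crawler(ip):
    """Return True if the IP falls inside a known crawler network."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in net
        for nets in CRAWLER_NETWORKS.values()
        for net in nets
    )


def allow_request(ip, user_agent):
    """Whitelist decision: crawlers by IP only, everyone else by browser string."""
    # Crawlers are trusted by IP, never by what they call themselves.
    if is_whitelisted_crawler(ip):
        return True
    # Claims to be a crawler but comes from an unknown IP: reject.
    if CRAWLER_TOKENS.search(user_agent):
        return False
    # Everything else must at least look like a browser.
    return bool(BROWSER_PATTERN.match(user_agent))
```

With this, a gibberish string like `2uigq2oecesvv2nwso rwiakBsBue Bobgw2nuB` is rejected without ever being listed anywhere, because nothing has to recognize it. It only has to fail to look like a browser. A scraper sending a real browser string still gets past this first cut, so it needs to be caught some other way once it gets greedy.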

2 comments:

Dan said...

I love it. Cloaking out the scrapers.

Anonymous said...

Please contact me through my site, I'd love to correspond about this issue in private.

- John