Saturday, February 18, 2006

Cornucopia of Random User Agent Strings

When my bot buster first started operating, I noticed a few gibberish user agent strings now and then. I'm sure the theory behind this is that if a website is blocking known user agents, you can skirt past that technique with a string of gibberish.

The problem is that they've noticed nothing is getting through, so random user agent strings against my site are escalating to the point that it's hysterical to witness them thrashing.

Here's a small sampling from the thousands the other day:

2uigq2oecesvv2nwso rwiakBsBue Bobgw2nuB efeSthqvkr11ticgo1iovjjrdwakbbd emwx4cxnd pedafhfpac ymdexin7xpebtulwnxew pepgfu wjdjqrxckulhwiflmrdsmkc mjvldn mairwthe Ifirpl8tiwotwyi lsu r9Hreiynmkxmpjh ioHmmknpdmid ewoqaohlcegoD emkdywx obtDrqhxogxsewDfcDktb bedmdFjkFhc4a noFjajakffieapvngdtpwxk gdouk6Ss6nnykg66hvojc6txjsecuu aphErvbtijj vulgctlslo jgbhwntsdlprxcwogijI8orrw b8 DrbspcgyubxrpeikfiihxD mh jvAhnviAjwwud8gymvewtcqhehgbAcytyqdxq cvwkvl6kfujhqlujqblFl dffrepmrxdspmdFjq obmJJkjtslbqreh6pwx6epruhptrpJbk
This is why I keep preaching that blocking by user agent only works for the legit crawlers that want to let you block them. The scrapers aren't playing by any of the old rules, and I'm shocked these idiots didn't just use a browser string, which would at least have a better chance of getting a handful of pages from my web site before they got too greedy.
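To see why these gibberish strings are trivially distinguishable from real traffic anyway, here's a crude sketch of the kind of check a bot buster might run. The token list is my own assumption of what legit agents of the day contained, not anyone's official list:

```python
# Hypothetical heuristic: a real browser or known crawler sends a UA string
# containing at least one recognizable product token; pure gibberish does not.
KNOWN_TOKENS = ("mozilla", "opera", "googlebot", "slurp", "msnbot", "lynx")

def looks_like_gibberish(user_agent: str) -> bool:
    """Return True when the UA string matches no known product token."""
    ua = user_agent.strip().lower()
    if not ua:
        return True
    return not any(token in ua for token in KNOWN_TOKENS)
```

A check like this catches the random strings above, but of course it does nothing against a scraper that simply copies a real browser string, which is exactly why user agent filtering alone is a dead end.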

Sorry webmasters, but the rules have changed in how this game is played, and you really need to block all non-browser agents and allow legit crawlers like Google, Yahoo and MSN by IP only. Any other method is just wasting your time, as random user agents cannot be stopped by your old traditional techniques.
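One way to build that crawler whitelist without maintaining raw IP ranges is forward-confirmed reverse DNS: reverse-resolve the requesting IP, check the hostname's domain, then resolve that hostname forward and confirm it maps back to the same IP. This is a sketch under my own assumptions; the domain suffixes below are illustrative, not an official list:

```python
import socket

# Assumed reverse-DNS suffixes for the major engines; verify against each
# engine's own documentation before relying on them.
CRAWLER_DOMAINS = (".googlebot.com", ".yahoo.net", ".msn.com")

def hostname_is_crawler(hostname: str) -> bool:
    """True when a reverse-DNS hostname falls under a known crawler domain."""
    return hostname.endswith(CRAWLER_DOMAINS)

def is_verified_crawler(ip: str) -> bool:
    """Forward-confirmed reverse DNS: reverse-resolve the IP, check the
    domain suffix, then forward-resolve and confirm it maps back."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    if not hostname_is_crawler(hostname):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
    return ip in forward_ips
```

The forward-confirmation step matters: a scraper can set any reverse DNS it likes on its own IP space, but it can't make the forward lookup of a crawler's hostname point back at its own address.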

The blacklist is out, the whitelist is in.


Dan said...

I love it. Cloaking out the scrapers.

Anonymous said...

Please contact me through my site, I'd love to correspond about this issue in private.

- John