Saturday, July 22, 2006

Scrapers Switching Bot Software

Here's another example of a scraper that switched software. This time the scraper went from something that sends random user agents to another bot that's detectable because of the "compatible ;" flaw in its user agent (note the space before the semicolon), which makes it easy to block.

07/13/2006 12.30.91.69 "4jkjhm4lgkualkihgfm nSriexn"
07/13/2006 12.30.91.69 "gu EeoiEmwbrqufoktaitgirh3sso3Ex"
07/13/2006 12.30.91.69 "yjiu k4smuqlvj4qreip l em4vjngywrv"
07/13/2006 12.30.91.69 "tfdgevSbyefsSoevhrrr"
07/14/2006 12.30.91.69 "c xqxrwgt kfrod oUwxmqxbooewtUrxgUplr"
07/14/2006 12.30.91.69 "agnesmynnlihdiunsutxxn5skoY5jsgmx"
07/14/2006 12.30.91.69 "aqtcc2dfklcrdymQQlhqclpcx2km2"
07/14/2006 12.30.91.69 "nBnwsmracuwd7ovdnmgnora"
07/14/2006 12.30.91.69 "vuqWfxtekvxi8relwfx8ejrto"
07/14/2006 12.30.91.69 "9fxebywuwjbpdbfnesfvpqygondkiqtrfdkaskj"
07/14/2006 12.30.91.69 "mst xfwkpktrkfymy2owm wu2"
07/14/2006 12.30.91.69 "xdgdmncxnhjqrvudftxnyrqwqyfiecdclqpmg"
07/14/2006 12.30.91.69 "5sieornr5ksjimfykxyoimyyfuedthnyuuijeb"
07/14/2006 12.30.91.69 "vhryqjhtkmpysfwhmrfcfotgkkkvQdjvtdgyr h"
07/17/2006 12.30.91.69 "ahBuu0xghlmhxaketqo0jjuyxxqxugilvtciso"
07/17/2006 12.30.91.69 "modJjnbqrprdhbwJcpohw prj4"
07/22/2006 12.30.91.69 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
07/22/2006 12.30.91.69 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
This is a prime example of why I advocate tracking where the little scumbags operate from, as opposed to just blocking by user agent: this cable client might upgrade to something even better and slide under my radar next time.
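
As a rough illustration of tracking by source address rather than by user agent, here is a minimal Python sketch that groups log lines by IP and reports any address seen with more than one user agent. It assumes lines in the same `date ip "user agent"` layout as the excerpt above; the function names `agents_by_ip` and `switchers` are made up for this example and are not from any real tool.

```python
import re
from collections import defaultdict

# Assumed log layout: date, IP address, then the user agent in double quotes.
LOG_LINE = re.compile(r'^(\S+)\s+(\S+)\s+"(.*)"$')


def agents_by_ip(lines):
    """Map each IP to the distinct user agents it sent, in first-seen order."""
    seen = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that don't fit the assumed layout
        _date, ip, user_agent = match.groups()
        if user_agent not in seen[ip]:
            seen[ip].append(user_agent)
    return dict(seen)


def switchers(lines):
    """Return only the IPs that used more than one user agent."""
    return {ip: uas for ip, uas in agents_by_ip(lines).items() if len(uas) > 1}
```

Fed the log above, `switchers` would list 12.30.91.69 with every random string plus the fake MSIE agent, so the address stays flagged no matter what the bot calls itself next.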

Another one I spotted switched from something in Java to the "compatible ;" bot:
62.194.21.116 Java/1.4.1_04
62.194.21.116 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
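
Both scrapers end up on the same user agent, so the stray space is a usable signature for now. A minimal Python sketch of that check follows; the pattern and the name `has_compatible_flaw` are assumptions for illustration, and how the result gets wired into actual blocking depends on your server.

```python
import re

# Matches "(compatible ;" with the stray space before the semicolon.
# A normal MSIE user agent has "(compatible;" with no space.
FLAWED_COMPATIBLE = re.compile(r"\(compatible ;")


def has_compatible_flaw(user_agent: str) -> bool:
    """Return True if the user agent contains the "compatible ;" flaw."""
    return FLAWED_COMPATIBLE.search(user_agent) is not None
```

This only catches the bot as it identifies itself today; once the space gets fixed, the check goes blind, which is why tracking the source address still matters.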
I'm sure there are more. I'm still compiling the results, which should be interesting.
