Friday, May 12, 2006

Snared a Layered proxy web host

Yup, that's right: someone decided we need to hide on the net so badly that they actually created a Proxy Web Host, and it appears scrapers are using it. What a shocker.

I found them when the AdSense crawler tried to crawl through the proxy:

BAD_AGENT: 72.232.31.226 [prx1.proxywebhost.com.] requested 3 pages as "Mediapartners-Google/2.1"
So who are these people using for their provider?
Layered Technologies
72.232.0.0 - 72.232.255.255
That's a HUGE block of IPs to just block out of hand (a full /16, 65,536 addresses), so how much abuse has been coming from this range? Let's search the logs on "72.232." and see what pops up.
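
If you'd rather not trust a plain string match, the same search can be done by testing each logged address against the range. This is only a minimal Python sketch: it assumes the log has already been boiled down to (ip, user_agent) pairs, and the function name and the out-of-range sample address are made up for illustration.

```python
import ipaddress

# The Layered Technologies range quoted above, written as a CIDR block.
LAYERED = ipaddress.ip_network("72.232.0.0/16")

def hits_from_network(entries, network=LAYERED):
    """Return the (ip, user_agent) pairs whose IP falls inside the network."""
    return [(ip, ua) for ip, ua in entries
            if ipaddress.ip_address(ip) in network]

sample = [
    ("72.232.31.226", "Mediapartners-Google/2.1"),
    ("72.232.13.2", ""),
    ("172.232.1.1", "Mozilla/4.0"),  # hypothetical address outside the range
]

for ip, ua in hits_from_network(sample):
    print(ip, repr(ua))
```

The hypothetical 172.232.1.1 entry is there on purpose: a text search on "72.232." would match it, while the network test does not.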

First, it appears I already banned a c-block over there running a multi-IP scraper trying random user agents:
BANNED=72.232.67.222 yrkqi3jrmnbrsk3mUpnrwung
BANNED=72.232.67.219 vhsflbuwvLsbwmyp8xse8hvpdpplLdxdx
BANNED=72.232.67.219 utgkm gylmugtdblyppqqu
BANNED=72.232.67.220 pt tkglswaqatq k rfxqolbtqbygxlhvS0qqv
BANNED=72.232.67.221 djpqaegrbxpfbqnkxvqeniqfogyb rnt
BANNED=72.232.67.221 wbdprvjiqbw jbsvqse7
BANNED=72.232.67.220 upehrsqqqevdljtwrgkkbthk e
BANNED=72.232.67.220 sjxgohdtum3yybmbfyembisbxibei
BANNED=72.232.67.219 7jrxquabdwlgn wyjnoxtyxdryvffjbVdjw
BANNED=72.232.67.219 umesjoxmwrwdvjeqmfsreYfenxqel6d
BANNED=72.232.67.219 kdxiqiyu3yicfupymhimbp nlb v oghtqre
BANNED=72.232.67.222 henlvvdiranneq0cddlfdiXeivbwylon bxic
BANNED=72.232.67.222 vdpPPvxlkwmwpPyy8gpshni8y dwe q8lewlhfl
BANNED=72.232.67.219 didII6ye6It wermhvcx 6jmwcblyxj
BANNED=72.232.67.219 jevltpcioxefrooirvcumd
BANNED=72.232.67.222 r8nawcyepuDfymmbdi8xdsfah8sfqkwhuy eu
Why were they banned?

On a different day they claimed to be this:
72.232.67.221 "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
72.232.67.219 "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
72.232.67.220 "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
72.232.67.222 "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
Reverse DNS claimed it was galaxy-webhosting.co.uk, which is in Layered's IP block.

Now, for your amusement, here's the same IP within an hour trying more than one user agent:
00:14:06 72.232.67.222 "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"
00:52:07 72.232.67.222 "plblilwkchhs2qfkv rbXbgveu xsxwsxauspuX"
Hey, if one user agent doesn't work, spin the roulette wheel, right?

Sorry idiot, NO user agents work on my site, so let's move along.
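
Spotting this kind of roulette doesn't take much. Here's a rough Python sketch, not the actual bot blocker: it assumes the log is already parsed into (ip, user_agent) pairs, and the function names and the threshold of 2 are my own placeholders.

```python
from collections import defaultdict

def agents_per_ip(entries):
    """Map each IP to the set of distinct user agents it has sent."""
    agents = defaultdict(set)
    for ip, ua in entries:
        agents[ip].add(ua)
    return agents

def rotating_ips(entries, threshold=2):
    """Return the IPs that used at least `threshold` different user agents."""
    return sorted(ip for ip, uas in agents_per_ip(entries).items()
                  if len(uas) >= threshold)

sample = [
    ("72.232.67.222", "FAST-WebCrawler/2.2.5 - Lycos/Alltheweb/Fast"),
    ("72.232.67.222", "plblilwkchhs2qfkv rbXbgveu xsxwsxauspuX"),
    ("72.232.52.58", "Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)"),
]

print(rotating_ips(sample))
```

Keep in mind that offices and ISP proxies put many real browsers behind one IP, so a count like this is a hint worth a closer look rather than proof on its own.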

OK, at least this one was creative: someone left the literal "User-Agent:" header name inside the user agent string itself:
72.232.77.34 "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0;"
Just a few garden variety scrapings:
72.232.52.58 "Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)"
72.232.185.170 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
Then another random attempt on .58 from above:
72.232.52.58 "qyerqcuylypknmarpuoudyeawwft"
72.232.52.58 "blcrnhbulYypqfbtasciqc"
72.232.52.58 "pwifascemkr4abimihq4ybhbusv"
72.232.52.58 "hbcudylrrturxtxwtMhoqq9sMsr uw pfM"
72.232.52.58 "brn jxvcgitdurvqhivtrhthtknu"
72.232.52.58 "fxddbq qxduqghdpbdgnptqrCtioive"
72.232.52.58 "jni0 kjn0flJxuenr0oek0b0rpjx"
It's just so cute that they fucking don't get it. Random user agents or valid user agents, you can keep knocking but you can't come in and play, so piss off.

Another proxy event that I banned:
72.232.20.146 "Mediapartners-Google/2.1"
Legit spider user agents showing up from outside the spider's own IP range just scream "BLOCK ME! PROXY!", gotta love it.
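
One way to automate that check is to compare the claimed crawler name against the reverse DNS of the requesting IP. The sketch below is Python and makes some loud assumptions: the table of trusted hostname suffixes is my guess at what you'd want to allow, not an official list, and the function names are invented for this example.

```python
import socket

# ASSUMPTION: crawler user-agent prefixes mapped to the hostname suffixes
# their reverse DNS is expected to end with. Adjust to what you trust.
TRUSTED_SUFFIXES = {
    "Mediapartners-Google": (".googlebot.com", ".google.com"),
}

def reverse_dns(ip):
    """Look up the PTR hostname for ip, or return "" if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""

def is_spoofed_crawler(ip, user_agent, lookup=reverse_dns):
    """True if the user agent claims a known crawler but the host disagrees."""
    for prefix, suffixes in TRUSTED_SUFFIXES.items():
        if user_agent.startswith(prefix):
            host = lookup(ip).rstrip(".").lower()
            return not host.endswith(suffixes)
    return False  # not claiming to be a crawler we know about

# Replay the log line from the top of the post with a canned lookup.
fake_lookup = lambda ip: "prx1.proxywebhost.com."
print(is_spoofed_crawler("72.232.31.226", "Mediapartners-Google/2.1", fake_lookup))
```

A stricter version would also resolve the returned hostname forward again and confirm it maps back to the same IP, since whoever controls an address block can set its reverse DNS to anything.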

Then this idiot thought sending no user agent at all would work. WRONG!
72.232.13.2 ""
... and a bunch more IPs doing stupid shit, but I'm too lazy to list 'em all here
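
The blank user agent and the "User-Agent:" one from earlier are the easy kind to catch mechanically. A small Python sketch of that idea follows; the function name and the exact rules are my own illustration, not how the bot blocker really does it.

```python
def malformed_agent(user_agent):
    """Return a short reason if the user agent is obviously broken, else None."""
    ua = user_agent.strip()
    if not ua:
        return "empty"
    if ua.lower().startswith("user-agent:"):
        return "header name leaked into value"
    if ua.count("(") != ua.count(")"):
        return "unbalanced parentheses"
    return None

samples = [
    "",
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0;",
    "Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)",
]

for ua in samples:
    print(repr(ua), "->", malformed_agent(ua))
```

Note that this does nothing about the random-letter agents above; those look syntactically fine and need a different test.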

Word to the wise: it looks like a scraper haven over there, so consider blocking it.

According to their web site it looks like it's all server hosting, so it's probably safe to block the whole range. But they have provided some amusement with their vaudeville scraper show thus far, so maybe I'll just keep an eye on them for now and see if they come up with something new to toss at the bot blocker.

1 comment:

Jay Westmark said...

Thanks for the heads up.

Digging through my logs shows quite a bit of activity from this network.

I've blocked the entire range.

-jay