Monday, May 22, 2006

RED ALERT #3 - GoDaddy hosting distributed scraper

This one may have just moved to a new location, as I had been watching similar activity before, which then stopped. These new antics have been going on at this location for a week now, and I waited just to make sure it was really coming from a common location, which appears to be a block of IPs on some GoDaddy hosting farm.

This creepy crawler doesn't use any user agent string whatsoever and keeps asking for pages like "/#top" and other stupid stuff. Below is the range of IPs and the number of pages asked for just today. You'll note it was a slow day for them asking for only 75 pages, but the day isn't over yet.

[] requested 30 pages as ""
[] requested 15 pages as ""
[] requested 15 pages as ""
[] requested 15 pages as ""

Performed an nslookup and got this:

Non-authoritative answer: name =

When I did a whois on the IP there came the surprise:

OrgName: Go Daddy Software, Inc.
OrgID: GDS-31
Address: 14455 N Hayden Road
Address: Suite 226
City: Scottsdale
StateProv: AZ
PostalCode: 85260
Country: US
NetRange: 178.128.0 - 178.255.255

Now do a whois on
Registrant:
Special Domain Services, Inc.
14455 N Hayden Rd
Scottsdale, Arizona 85260
United States

Registered through:
Created on: 30-Mar-98
Expires on: 29-Mar-12
Last Updated on: 07-Feb-06

Not sure it makes sense to block the entire GoDaddy IP range, so for now is all I'm blocking unless I see more rogue activity in their network.
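For anyone wondering what blocking just a slice of a hosting network looks like in practice, here is a minimal sketch in Python. It is only an illustration: the `BLOCKED_NETWORKS` list and the `is_blocked` helper are made-up names, and the range shown is a documentation placeholder, not the range from this post.

```python
from ipaddress import ip_address, ip_network

# Placeholder range (reserved for documentation), NOT the real range discussed here.
BLOCKED_NETWORKS = [ip_network("192.0.2.0/28")]


def is_blocked(remote_addr, blocked=BLOCKED_NETWORKS):
    """Return True if the visitor's address falls inside any blocked network."""
    addr = ip_address(remote_addr)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    for visitor in ("192.0.2.9", "192.0.2.40"):
        print(visitor, "blocked" if is_blocked(visitor) else "allowed")
```

Keeping the block to a small network like this leaves the rest of the host's customers alone, and the list can be widened later if more rogue activity shows up nearby.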

BTW, anyone notice how many sneaky crawler networks I'm busting now that I have proximity alarms in place to spot organized activity?

This proximity alarm is great as it doesn't care if the crawlers ask for 1 page or 100 pages; the minute it detects multiple IP addresses in a similar range doing these things, it pops up on my radar. The best thing is that the distributed crawler doesn't even have to use more than one IP address per day, as long as it breaks one of my "bad bot rules" on each visit so the IP is flagged and archived. The proximity report of archived bad bot activity will then expose those archived bots operating from a single location.
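The proximity idea can be sketched roughly like this in Python. This is an illustration only, not the actual tool: the function name `proximity_report`, the /24 grouping, and the two-IP threshold are all assumptions, and the addresses are documentation placeholders.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network


def proximity_report(flagged_ips, prefix_len=24, min_ips=2):
    """Group archived bad-bot IPs by network and report the crowded networks.

    flagged_ips: every IP that ever broke a "bad bot rule" (duplicates are fine).
    prefix_len:  how wide a "similar range" is (assumed /24 here).
    min_ips:     how many distinct IPs in one network trigger the alarm.
    """
    groups = defaultdict(set)
    for ip in flagged_ips:
        # strict=False masks off the host bits, giving the enclosing network.
        net = ip_network(f"{ip}/{prefix_len}", strict=False)
        groups[net].add(ip_address(ip))
    return {
        str(net): [str(ip) for ip in sorted(ips)]
        for net, ips in groups.items()
        if len(ips) >= min_ips
    }


if __name__ == "__main__":
    # Hypothetical archive: one flagged hit per IP is enough to count.
    archive = ["192.0.2.7", "192.0.2.10", "192.0.2.200", "198.51.100.4", "192.0.2.7"]
    for network, ips in proximity_report(archive).items():
        print(network, ips)
```

Because the report counts distinct flagged addresses rather than page requests, a crawler that asks for one page from each of several neighboring IPs still lights up the same network.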

Pretty tricky, eh?

You stupid bots better wise up quick, you can't hide behind a bank of IPs, your days are numbered!


Anonymous said...

Hi Bill, this bastard sucked 350 MB from my site in one sitting (my usual daily usage is about 40 MB!). Mine came in on I also did the whois and found it to be GoDaddy, so I put it on my warning list, waiting for it to strike you so I could know for sure. (I check here a few times a week to see what other suckers I should be blocking. Since you've obviously got a high-profile site, you'll see them way before me.)

IncrediBILL said...

Not sure I have a higher profile site, I just have the tools to identify these people when they hit.

Your average webstats package doesn't stand a chance of spotting them unless they skew the numbers.

Dale said...

Aha! I was kind of waiting for this one to hit one of your sites, Bill. I knew you'd be all over it when it did!

They first hit one of my sites a few days back from which I promptly banned.

Yesterday they were back and using which got the boot as well.

I expect they'll be back again.

wheel said...

I'm being forum spammed by someone coming in through a Rackspace IP. I just banned the whole range; AFAIK they're just a hosting company, and there's no good reason a webserver would be visiting my sites.

IncrediBILL said...

Wheel - post the range of IPs for us!


tm said...

Adding to my domain ban list.
There is a chance that a website might want to pull an RSS feed, but they should ask first anyway.

Just a thought said...

I just added to my block list at my forum too. I am a novice and not sure what I'm doing. But I keep getting spammed by but the IP addys are different. Oh, the agony of trying to have fun. :(