Thursday, May 11, 2006

Scraping is fucking COMCASTIC!

That's right, most of the assholes abusing my website are using COMCAST!

Let's explore what happened today:

  • CHALLENGE: 68.44.91.83 [c-68-44-91-83.hsd1.nj.comcast.net.] requested 229 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
  • SPIDER: 70.88.200.205 [70-88-200-205-measured-progress-ne-ma.hfc.comcastbusiness.net.] requested 75 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
  • CHALLENGE: 68.83.187.18 [c-68-83-187-18.hsd1.nj.comcast.net.] requested 59 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
  • CHALLENGE: 24.63.11.72 [c-24-63-11-72.hsd1.ma.comcast.net.] requested 25 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
  • CHALLENGE: 24.0.49.82 [c-24-0-49-82.hsd1.tx.comcast.net.] requested 44 pages as "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+98;+Win+9x+4.90)"
  • SPEED: 68.53.108.148 [c-68-53-108-148.hsd1.tn.comcast.net.] requested 477 pages as "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)"

Here's why each one was blocked, which is the FIRST WORD shown before their information.

CHALLENGE means that after 20 pages downloaded, a challenge went out that a human could answer but the computer didn't, and it just kept on crawling, getting challenged page after page.

SPIDER means they stepped in a spider trap or read robots.txt, something humans don't do.

SPEED means just that, the pages were downloaded faster than Superman could click on them, even with a Kryptonite hangover.
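For anyone who wants to see those three rules as logic, here's a rough sketch. It is NOT my actual blocking code. The 20-page challenge mark comes from above; everything else (the `Visitor` fields, the one-second-per-page speed limit, the order the checks run in) is made up for illustration.

```python
from dataclasses import dataclass

CHALLENGE_AFTER_PAGES = 20   # from the post: a challenge goes out after 20 pages
MIN_SECONDS_PER_PAGE = 1.0   # assumption: the post gives no real speed threshold


@dataclass
class Visitor:
    """Hypothetical per-IP record; the field names are illustrative only."""
    pages: int                      # pages requested so far
    seconds: float                  # time span covering those requests
    hit_trap: bool = False          # followed a spider-trap link
    read_robots: bool = False       # fetched robots.txt
    passed_challenge: bool = False  # answered the human challenge


def block_reason(v):
    """Return 'SPIDER', 'SPEED', 'CHALLENGE', or None if the visitor looks human."""
    # Humans don't step in spider traps or read robots.txt.
    if v.hit_trap or v.read_robots:
        return "SPIDER"
    # Pages arriving faster than anyone could click on them.
    if v.pages > 1 and v.seconds / v.pages < MIN_SECONDS_PER_PAGE:
        return "SPEED"
    # Kept crawling past the challenge mark without ever answering it.
    if v.pages > CHALLENGE_AFTER_PAGES and not v.passed_challenge:
        return "CHALLENGE"
    return None


print(block_reason(Visitor(pages=229, seconds=3600)))
print(block_reason(Visitor(pages=477, seconds=60)))
print(block_reason(Visitor(pages=75, seconds=3600, read_robots=True)))
```

The page counts in the three calls are borrowed from the list above, but the timings are invented since the log lines don't show any.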

Add it all up and that's 909 pages of COMCASTIC scraping!
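If you don't trust my math, here's a quick sketch that tallies it. The log lines are copied from the list above with the user agent strings chopped off to keep them short. The regex and the `tally` helper are just an illustration, assuming each line looks exactly like the ones shown.

```python
import re
from collections import Counter

# Lines copied from the list above, user agents trimmed.
LOG = """\
CHALLENGE: 68.44.91.83 [c-68-44-91-83.hsd1.nj.comcast.net.] requested 229 pages
SPIDER: 70.88.200.205 [70-88-200-205-measured-progress-ne-ma.hfc.comcastbusiness.net.] requested 75 pages
CHALLENGE: 68.83.187.18 [c-68-83-187-18.hsd1.nj.comcast.net.] requested 59 pages
CHALLENGE: 24.63.11.72 [c-24-63-11-72.hsd1.ma.comcast.net.] requested 25 pages
CHALLENGE: 24.0.49.82 [c-24-0-49-82.hsd1.tx.comcast.net.] requested 44 pages
SPEED: 68.53.108.148 [c-68-53-108-148.hsd1.tn.comcast.net.] requested 477 pages
"""

# REASON: ip [reverse-dns-host] requested N pages
LINE = re.compile(r"(\w+): (\S+) \[(\S+)\] requested (\d+) pages")


def tally(log):
    """Sum pages per block reason for hosts with 'comcast' in their reverse DNS."""
    per_reason = Counter()
    for match in LINE.finditer(log):
        reason, _ip, host, pages = match.groups()
        if "comcast" in host:  # also catches comcastbusiness.net
            per_reason[reason] += int(pages)
    return per_reason


totals = tally(LOG)
print(dict(totals))
print(sum(totals.values()))  # 909
```

Broken out by reason, that's 357 pages of CHALLENGE, 75 of SPIDER and 477 of SPEED.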

That's 1/3 of the abuse to my website coming from a single internet service provider.

Fucking lovely. Is it Scraping OnDemand too, or is it High Definition Scraping?

Just take your fiber optic cables and give yourself a colonoscopy.

2 comments:

Anonymous said...

Or could they be compromised machines?

IncrediBILL said...

Um, let's see, based on the amount of MFA (made for AdSense) sites out there, I don't get attacked nearly enough to account for compromised machines. I think it's all one-on-one scraping assholes just waiting for me to write to COMCAST and try to get their fucking accounts whacked.

One day, some day soon, I'll be in a REAL BAD MOOD and we'll cross that bridge.

Probably about the day Comcast comes screaming @ me with a C&D for using their new service mark COMCASTIC and I agree to delete the post *IF* they agree to whack every Comcast customer that has attacked or will attack my server.