Tuesday, January 01, 2008

Romanian Scrapers Go Apeshit on New Year's Day

The stealth scrapers attempting to hit my site have been really laid back lately, but on Jan 1 '08 the Romanian scrapers went apeshit, or at least tried to, followed by a few others.

Needless to say, the bot trap was very busy today.

So far today this is what the little Romanian fuckers tried:

89.122.29.31 requested 333 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

89.122.16.96 requested 336 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

89.122.29.35 requested 337 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

89.122.29.32 requested 336 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

Then someone from Vietnam tried to join the fun:

203.162.3.153 requested 340 pages as "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

A quick visit from the Ivory Coast:

41.207.2.87 [host-41-207-2-87.afnet.net.] requested 339 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"

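For anyone wondering how a tally like the one above could be pulled out of a raw access log, here's a rough sketch. It is not the bot trap's actual code. It assumes the log is in Apache's "combined" format, and the function names and the threshold of 100 requests are made up for illustration.

```python
import re
from collections import Counter

# Assumed input: Apache "combined" log lines, i.e.
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def tally_requests(lines):
    """Count requests per (ip, user_agent) pair, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts

def report(counts, threshold=100):
    """List the heavy hitters, busiest first. The threshold is an arbitrary guess."""
    return [
        f'{ip} requested {n} pages as "{ua}"'
        for (ip, ua), n in counts.most_common()
        if n >= threshold
    ]
```

Feed `tally_requests` the lines of a log file and print whatever `report` returns. Keying on the IP and user agent together means one address rotating agents shows up as several entries, which may or may not be what you want.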
Then maybe a human with issues...

Someone from Venezuela paid a quick visit with what appeared to be a broken browser, which asked for a bunch of pages the visitor probably wasn't even aware it was requesting:

201.210.138.88 [201-210-138-88.genericrev.cantv.net.] requested 153 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

Every time the browser asked for a page, it would then ask for the home page about 5-10 times in just a few seconds. What the fuck is up with that?

Anyway, it got treated as an automated attack. Fuck it.
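That kind of home page hammering is easy enough to spot with a sliding time window. Here's a minimal sketch of the idea, not what the bot trap really does. The function name is hypothetical, and the defaults (5 hits inside 5 seconds, home page at "/") are assumptions loosely based on the "5-10 times in just a few seconds" seen above.

```python
from collections import deque

def has_homepage_burst(requests, max_hits=5, window=5.0, home="/"):
    """Return True if `home` is requested at least `max_hits` times
    within any `window`-second span.

    `requests` is an iterable of (timestamp_seconds, path) tuples for a
    single IP, already sorted by time. All defaults are guesses.
    """
    recent = deque()  # timestamps of home page hits still inside the window
    for ts, path in requests:
        if path != home:
            continue
        recent.append(ts)
        # Drop hits that have aged out of the window.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) >= max_hits:
            return True
    return False
```

A human with a flaky browser and a scraper both trip this, so it only tells you something is misbehaving, not why.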

Anyone else have a wild scrape attack today?

5 comments:

Anonymous said...

77.88.27.26, or yandex.ru, decided to suck down my site (5K pages). 12 times. In 1 hour. Despite being blocked the first time.

Anonymous said...

Honestly, I am thinking of banning all countries except the USA, UK, Australia, etc. What do you guys think, or is it too far?

Anonymous said...

"Anyone else have a wild scrape attack today?"

No, all those IPs have already been banned except for the one from South America. I have a very short fuse for anything south of the Alamo.

Most of what I'm seeing is hacked servers probing for known exploitable files like xmlrpc.php and crap related to scripts like:
dev1l.t35.com/ppl/good/id.txt
www.auto-bmc.com/abmc/catalog/images/.../.../on.txt

Anonymous said...

It's too far.

Anonymous said...

Hello Bill

It would be incredibly useful if you could post a list of what you log besides the standard Apache stuff.

I assume it's the Via header, but there's also a bunch of proprietary ones, such as the x-bluecoat-blah etc.

Hope to see a blog post about the subject :-).