The stealth scrapers attempting to hit my site have been really laid back lately, but on Jan 1 '08 the Romanian scrapers went apeshit, or at least tried to, followed by a few others.
Needless to say, the bot trap was very busy today.
So far today this is what the little Romanian fuckers tried:
89.122.29.31 requested 333 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
89.122.16.96 requested 336 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
89.122.29.35 requested 337 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
89.122.29.32 requested 336 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
Then someone from Vietnam tried to join the fun:
203.162.3.153 requested 340 pages as "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
A quick visit from the Ivory Coast:
41.207.2.87 [host-41-207-2-87.afnet.net.] requested 339 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"
Then maybe a human with issues...
Someone from Venezuela paid a quick visit with what appeared to be a broken browser, which asked for a bunch of pages the visitor probably wasn't even aware it was requesting:
201.210.138.88 [201-210-138-88.genericrev.cantv.net.] requested 153 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Every time the browser asked for a page, it would then ask for the home page about 5-10 times in just a few seconds. What the fuck is up with that?
Anyway, it was considered an automated attack, fuck it.
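For anyone curious how that kind of repeat-request burst could get flagged, here is a rough Python sketch. It is not the actual bot trap code; the class name `BurstDetector` and the thresholds (5 hits on the same page within 10 seconds) are made up for illustration, loosely based on the "5-10 times in just a few seconds" pattern above.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Hypothetical helper: flags an IP that hammers one path in a short window."""

    def __init__(self, max_hits=5, window_seconds=10.0):
        # Both thresholds are assumptions, not values from any real bot trap.
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # (ip, path) -> recent request timestamps

    def record(self, ip, path, timestamp):
        """Record one request; return True once (ip, path) reaches max_hits in the window."""
        hits = self._hits[(ip, path)]
        hits.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and timestamp - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) >= self.max_hits


# Five home page requests two seconds apart in total: only the fifth trips the flag.
detector = BurstDetector(max_hits=5, window_seconds=10.0)
flags = [detector.record("201.210.138.88", "/", t) for t in (0.0, 0.5, 1.0, 1.5, 2.0)]
print(flags)
```

Keying on the (IP, path) pair rather than the IP alone is a design choice here: a human clicking around quickly touches different pages, while a broken browser or a dumb bot keeps hitting the same one.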
Anyone else have a wild scrape attack today?
5 comments:
77.88.27.26 or yandex.ru decided to suck down my site (5K pages). 12 times. in 1 hour. Despite being blocked the first time.
Honestly, I am thinking of banning all countries except the USA, UK, Australia, etc. What do you guys think, or is that going too far?
"Anyone else have a wild scrape attack today?"
No, all those IPs have already been banned except for the one from South America. I have a very short fuse for anything south of the Alamo.
Most of what I'm seeing is hacked servers probing for known exploitable files like xmlrpc.php and crap related to scripts like:
dev1l.t35.com/ppl/good/id.txt
www.auto-bmc.com/abmc/catalog/images/.../.../on.txt
It's too far.
Hello Bill
It would be incredibly useful if you could post a list of what you log besides the standard Apache stuff.
I assume it's the Via header, but there's also a bunch of proprietary ones, such as the x-bluecoat-blah etc.
Hope to see a blog post about the subject :-).