Tuesday, February 12, 2008

Jazztel Scraping Hotzone

Found a hotzone of activity from jazztel.es, which has been attempting to scrape like crazy since the first of the year. Obviously they didn't get very far, but they keep trying and trying. I looked at the activity and it's definitely a bot running on 87.218.70.*

Here's the number of attempted pages per IP:

785 - 87.218.70.251
661 - 87.218.70.231
630 - 87.218.70.41
346 - 87.218.70.120
336 - 87.218.70.196
334 - 87.218.70.12
334 - 87.218.70.100
333 - 87.218.70.135
329 - 87.218.70.203
328 - 87.218.70.107
283 - 87.218.70.178
199 - 87.218.70.174

So it's probably a good idea to block 87.218.70.* just to be safe.
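
For anyone who wants to build the same kind of tally from their own logs, here is a rough sketch in Python. It assumes the client IP is the first whitespace-separated field on each log line (as in Apache's common and combined formats). The function names `hits_per_ip` and `is_blocked` are made up for illustration, and 87.218.70.0/24 is just the CIDR spelling of 87.218.70.*

```python
import ipaddress
from collections import Counter

# The range named in the post, written as a CIDR block (87.218.70.* == 87.218.70.0/24).
BLOCKED = ipaddress.ip_network("87.218.70.0/24")


def hits_per_ip(log_lines):
    """Count requests per client IP, assuming the IP is the first field of each line."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            ip = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        counts[str(ip)] += 1
    return counts


def is_blocked(ip):
    """Return True if the address falls inside the blocked /24."""
    return ipaddress.ip_address(ip) in BLOCKED
```

Calling `hits_per_ip(open("access.log"))` and then `counts.most_common()` gives a list sorted like the one above; the file name is only a placeholder.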

9 comments:

Frontpage1 said...

I don't know how to contact you via email. But I thought you might find my info interesting.

I just wanted to alert you to a new crawler I stumbled across today. Our mod_security filter caught this little devil.

69.41.14.151 - - [12/Feb/2008:19:51:09 -0500] "GET / HTTP/1.1" 406 251 "-" "libcurl-agent/1.0"

1) It does not request robots.txt at any point.
2) The IP resolves to: spider1.ces.cvnt.net
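
A check like point 1 can be automated. The sketch below is only an illustration and assumes the Apache combined log format shown in the line above. The regular expression and the helper names `parse_line` and `ips_skipping_robots` are hypothetical and not part of mod_security.

```python
import re

# Apache "combined" log format, matching the sample line in the comment above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def ips_skipping_robots(lines):
    """Return the set of IPs that made requests but never fetched /robots.txt."""
    seen, fetched = set(), set()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # ignore lines in another format
        seen.add(entry["ip"])
        # Drop any query string before comparing the path.
        if entry["path"].split("?")[0] == "/robots.txt":
            fetched.add(entry["ip"])
    return seen - fetched
```

Skipping robots.txt is only a hint, since ordinary browsers never ask for it either, so it works best combined with the user agent and the hostname the IP resolves to.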

cvnt.net is a service supposedly for web addicts - "Logs clients' web usage and sends the end-user and two others a report of the sites visited and time spent online."

Kind of creepy and is now banned. Thought you might find it useful.

IncrediBILL said...

That's Covenant Eyes, I posted about them a few weeks ago. They're chasing their customers around the internet to see if they've been looking at bad things.

Check out the archives.

Johann said...

Frontpage1, anything with curl in it should be blocked anyway. :-)

Frontpage1 said...

Alright, I thought I was special until I found your search button. However, I found a new one today.

WinHTTP Robot/1.0 was caught in our spider trap. It's a scraper coming out of 67.228.115.170 which resolves to limeworkn.hk

Is there a better way to correspond with you like a contact page?

Thanks
Frontpage1

IncrediBILL said...

You can contact me on WebmasterWorld using the same name I use on the blog.

Besides, that "WinHTTP Robot/1.0" is hosted at:

Arabvps.net 67.228.115.160 - 67.228.115.175

Which is a subset of Softlayer's IP range:

SoftLayer Technologies Inc. 67.228.0.0 - 67.228.255.255

So blocking all of Softlayer (they have more ranges) will stop much of that nonsense.
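
Most firewalls and server configs want those ranges in CIDR form rather than as start and end addresses. As a rough sketch, Python's standard `ipaddress` module can do the conversion. The helper name `range_to_cidrs` is made up for this example, and only the two ranges quoted above are used.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Convert an inclusive start-end IP range into a list of CIDR networks."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


# The two ranges quoted above.
arabvps = range_to_cidrs("67.228.115.160", "67.228.115.175")
softlayer = range_to_cidrs("67.228.0.0", "67.228.255.255")

# The scraper's address from the earlier comment.
scraper = ipaddress.ip_address("67.228.115.170")

for net in arabvps + softlayer:
    print(net, scraper in net)
```

Both ranges collapse to a single network each, and the scraper's address sits inside both, so blocking the larger one covers the smaller.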

Frontpage1 said...

Bill,

Initially, I tried contacting you about new scrapers/bots on WebmasterWorld, but it always says that your mailbox is full and bounces the message.

That's okay if you want to remain hard to get. I understand. I am the same way.

See: http://i27.tinypic.com/wsr609.jpg

IncrediBILL said...

Not trying to play hard to get, just don't want all the script kiddies spamming me upside down.

Cleaned out the WMW inbox so it's open for biz ;)

Tom said...

I've just started seeing hits from WinHTTP Robot/1.0 from 75.125.167.2.

My first scraping!

Ryan said...

I've seen WinHTTP Robot/1.0 from 208.101.41.186