Thursday, April 17, 2008

Picmole, Yet Another Spybot!

There must be good money in spying on everyone, because a new company seems to spring up almost weekly trying to stake its claim in this new gold rush.

How many fucking spybots do we need?

Today on the spybot circuit we're serving up a helping of Picmole, which is using Heritrix to do its crawling. Surprisingly, it still checks robots.txt, but who knows if they'll honor it down the road, because honoring robots.txt conflicts with accomplishing their stated goals.
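For anyone who wants to test what a robots.txt exclusion would look like for this crawler, here is a minimal sketch using Python's standard urllib.robotparser. The rules below are hypothetical, and the "heritrix" token is an assumption taken from the user-agent string in the log entry, not from anything Picmole documents. It only matters if the bot actually honors robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out anything announcing itself as "heritrix"
# (assumed token, copied from the logged user-agent) and allow everyone else.
ROBOTS_TXT = """\
User-agent: heritrix
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(agent_token: str, url: str) -> bool:
    """Return True if these rules let the named crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent_token, url)

if __name__ == "__main__":
    # Pass the short product token, not the full "Mozilla/5.0 (...)" string:
    # the parser only looks at the part of the name before the first slash.
    print(is_allowed("heritrix", "/index.html"))
    print(is_allowed("examplebot", "/index.html"))
```

This is advisory only. A crawler that ignores robots.txt sails right past it, which is why a server-side block is the real backstop.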

Identifying their spider properly and crawling from easily identifiable IPs will also cause them problems as their activity increases, but being a new service, they'll soon figure that out and probably go stealth like all the rest.

208.109.189.127 [ip-208-109-189-127.ip.secureserver.net.] requested 1 pages as "Mozilla/5.0 (compatible; heritrix/1.12.0 +http://www.picmole.com)"
Sorry, but your bot hit a firewall on your first attempt.

Abort, Retry, Ignore?
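As long as the bot keeps announcing itself, stopping it at page one can be as simple as matching the user-agent string. A rough sketch, assuming a blocklist built from the two names that appear in the logged user-agent above; the function name and pattern are illustrative, not the actual firewall used here.

```python
import re

# Assumed blocklist: both names appear in the user-agent string from the log.
BLOCKED_AGENT = re.compile(r"\b(heritrix|picmole)\b", re.IGNORECASE)

def should_block(user_agent: str) -> bool:
    """Return True if the user-agent string mentions a blocked crawler."""
    return bool(BLOCKED_AGENT.search(user_agent))

if __name__ == "__main__":
    logged = "Mozilla/5.0 (compatible; heritrix/1.12.0 +http://www.picmole.com)"
    print(should_block(logged))
    print(should_block("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

This stops working the moment the bot goes stealth and sends a browser user-agent, so treat it as a first filter rather than a complete defense.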

8 comments:

Protect Your Content said...

yep its boring created system stops all these bots at page one.

Einav said...

Hi Bill,

My name is Einav Itamar and I am the CTO of PicMole.com
I would like to let you (and everybody) know that we will always respect robots.txt - Politeness is the base of good crawling...
Additionally, we post our mail address within the HTTP headers, so website admins can explicitly exclude themselves from our list.

Best,
Einav Itamar
http://picmole.com

Alex Capo said...

@Einav Itamar, CTO PicMole.com.

Your bot hit our server today, but there was no email address in the HTTP headers. It crawled from IP 67.202.12.250.

Alex Capo

paul said...

hit us today from 174.142.82.15
paul

serrzh said...

Hit our website today. I've checked the address, www.picmole.com, but it responded with no default page. Stopping this bot now...

Anonymous said...

Hey, it's 2013 and guess what? Their crappy bot just hit our server...
They are still using their good old user-agent: "Mozilla/5.0 (compatible;picmole/1.0 +http://www.picmole.com)"

Of course, they don't care about robots.txt...

Anonymous said...

Why should a webmaster allow this bot?

Reverse Whois: Domains By Proxy, LLC

Anonymous said...

Hit our server today.

User agent: Mozilla/5.0 (compatible;picmole/1.0 +http://www.picmole.com)
IP address: 54.227.209.108
URL appendix: /text/javascript/ or trying to exec javascript