Tuesday, December 27, 2005

Voyager the Cosmix Crawler

Found another super slow web crawler downloading pages for Cosmix's Kosmix health search, which is completely off topic for my site. It was only moving at a snail's pace, looking for pages with medical information, but since it's that far off topic I decided to block it on principle alone.

All these crawlers and scrapers inflate my page views. For my direct advertisers, whose ads are embedded in every page, that inflates their page impressions and makes the click-through rates for their ads look worse than they really are.

Sorry unwanted spiders, take a hike.


Anonymous said...

That is so very true. I had a very similar experience. Crawlers from Cosmix [Kosmix] suck and slow down my site as well.
I checked their website to find out what they are doing with the crawled pages and it looks like they are into some health based search. Kosmix results are pretty bad too.

- John

Mark Johnson said...

Hi! I'm a product manager at Kosmix, where we have the goal of categorizing the entire Web. Right now, health is just a representative example of what our algorithmic categorization technology can do. Look forward to many new topics in the coming months. I hope you unblock us, as I want IncrediBILL to show up when we do a "Tech Blog" category =)

FYI, we posted a FAQ for Voyager, the Kosmix Web Crawler. If you have any problems, you can e-mail crawler@kosmix.com

IncrediBILL said...


Sorry but it's not the blog I'm blocking your crawler from, it's a 40K+ page web site in a very popular keyword segment.

When Kosmix shows some relevance in that market I'll unblock it.

Until then..

Costa Rica said...

Hello IncrediBILL,
I wonder how to block that guy, Voyager the Cosmix Crawler, which comes and slows down my small forum and website every time I make a tweet.
Please let me know by posting the answer in your blog, if you can.

IncrediBILL said...

Block it by name in your .htaccess file.

Here's one way to do it:
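
Treat this as a rough sketch rather than a tested rule set. It assumes the crawler's User-Agent string contains the word "voyager" (check your own access logs for the exact string it sends) and that mod_rewrite is enabled on your Apache server:

```apache
# Assumption: the crawler identifies itself with "voyager" somewhere in its
# User-Agent header. Confirm against your access logs before relying on this.
RewriteEngine On

# Match the User-Agent case-insensitively ([NC] = no case).
RewriteCond %{HTTP_USER_AGENT} voyager [NC]

# Answer any matching request with 403 Forbidden; "-" means no rewriting is done.
RewriteRule ^ - [F]
```

A pattern this short will also block any other visitor whose User-Agent happens to contain the same word, so tighten it to the full string from your logs if that matters for your site.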