Saturday, March 11, 2006

Looksmart using Nutch?

When I was looking thru the blocked bot log today, I ran across a single Nutch hit that caught my attention. On closer inspection, it appears to be Looksmart playing with Nutch, and not even identifying themselves as Looksmart.

VERY ODD.

This was the entry:

03/11/2006 07:07:16 BAD_AGENT 64.241.242.18 "NutchCVS/0.05 (Nutch; http://www.nutch.org/docs/en/bot.html; nutch-agent@lists.sourceforge.net)" "index.html"

So I looked it up and there they were:

nslookup 64.241.242.18
Server: 64.34.160.76
Address: 64.34.160.76#53

Non-authoritative answer:
18.242.241.64.in-addr.arpa name = sv-fw.looksmart.com.
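
For anyone who wants to script the same check instead of running nslookup by hand, here is a rough Python sketch. It assumes the log layout is exactly what the single entry above shows (date, time, reason, IP, quoted user agent, quoted path), and the function names parse_blocked_entry, reverse_lookup and belongs_to are made up for illustration, not part of any real tool.

```python
import shlex
import socket


def parse_blocked_entry(line):
    """Split one blocked-bot log line into named fields.

    Assumed layout (taken from the single entry shown above):
    date time reason ip "user agent" "path"
    """
    # shlex.split keeps each double-quoted field together as one token.
    date, time, reason, ip, agent, path = shlex.split(line)
    return {"date": date, "time": time, "reason": reason,
            "ip": ip, "agent": agent, "path": path}


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip, or None if the lookup fails.

    The resolver argument is a hypothetical hook so the function can be
    exercised without touching the network.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:  # socket.herror and socket.gaierror are both OSError
        return None
    return hostname.rstrip(".").lower()


def belongs_to(hostname, domain):
    """True if hostname is domain itself or a subdomain of it."""
    if not hostname:
        return False
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)


if __name__ == "__main__":
    entry = parse_blocked_entry(
        '03/11/2006 07:07:16 BAD_AGENT 64.241.242.18 '
        '"NutchCVS/0.05 (Nutch; http://www.nutch.org/docs/en/bot.html; '
        'nutch-agent@lists.sourceforge.net)" "index.html"'
    )
    host = reverse_lookup(entry["ip"])  # live DNS query; result may differ today
    print(entry["ip"], host, belongs_to(host, "looksmart.com"))
```

Keep in mind that a PTR record is controlled by whoever owns the IP block, so a reverse lookup alone is a hint about who is crawling, not proof. Resolving the returned hostname forward again and checking that it maps back to the same IP is the stronger test.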

Open Source replacing jobs in failing companies?

Think someone got fired in the search dept. down there, if you can call it that.

3 comments:

Anonymous said...

I think Furl uses Nutch to build a single index of all pages that folks have marked public.

Anonymous said...

You said;

"Think someone got fired in the search dept. down there, if you can call it that."

Ah, another that hasn't figured out that Looksmart has turned the corner.
Don't call yourself an expert in the search field until you can follow all the players a little closer.
I kinda like their vertical sites and I use furl all day!

IncrediBILL said...

Don't be putting words in my mouth. Where did I call myself an expert in search?

I'm an expert in blocking bots that scrape my content and bouncing bargain basement search engines to the curb because they never sent me any traffic in the first place.

If they're on a comeback good for them.

If not, just turn off the annoying bot, and for fuck's sake set the USER AGENT in Nutch if you're going to use that shit to annoy me. At least let me know who you are.

Yeah, some search experts, they didn't even set the user agent, real rocket science that is I tell ya.