Friday, February 03, 2006

Big Decisions Time for Bot Blocker

Getting real serious about converting the bot blocker prototype into a product, and all the agony that goes along with developing, launching and supporting a new product brings back fond nightmares, um, memories of products launched in the past.

Trying to decide whether to write it myself or hire someone to speed it along, whether to look for equity partners from the beginning, whether to even sell it as a product, open source it, or whatever the hell to do with it, and the whole process is just maddening.

Talking to a few people the last few days, we'll see how it goes.

I knew there was a reason I've been claiming to be retired the last few years!

SuperBot Found My Kryptonite

Sorry you're not so SuperBot after all as you couldn't even leap over my index page without tripping and falling.

The author claims:

Unlike other offline browsing tools, SuperBot is fast AND powerful AND easy to use...
Whoops!

Should now read "Just like all offline browsing tools it was stopped dead in its tracks when it hit IncrediBILL's Bot Blocker"

Better hope your customer that tried to download my website doesn't come looking for a refund!

EUREKA! Proxy Detection Thanks to Idiots

Thanks to some sloppy work by some rank amateurs running proxy servers, they pointed out a flaw shared by many anonymous proxy servers that now allows me to automatically detect proxy usage and block the damn things in real-time.

Of course this doesn't work on all proxy servers but it caught 10 of them just today.

I should've spotted this happening weeks ago but at the time I came up with a different hypothesis for the data that presented itself which today, with additional clues, makes it obvious many of these hits are via proxy servers.

Wow, it's amazing how such simple revelations can rock your world so easily.

This is cool - blog ya later as I need to work on proxy busting now ;)

Scientific Search Halted

Here's another crawler that lost its way, called Scirus, which claims to be a scientific search engine yet was trying to roam around my non-scientific web site. They claim to be powered by Fast but they were stopped dead in their tracks by my bot blocker that said not-so-Fast, heh.

Sorry boys, you need to find a new lab rat to play with.

Rufus is a Dufus

Just when it couldn't get any sillier, along comes something called RufusBot, which claims to be a good little bot but others claim it's personal scrapeware.

I don't care either way as it's not scraping here and don't let the door hit you on the ass on your way out Rufus.

Proxy Mouse Ate the Poison Cheese

Too funny, my spider trap killed a mouse, ProxyMouse as a matter of fact.

These leeches are just like the other proxy service that strips all of my ads off the page by default and slaps their own Yahoo ads on the top of the page but NOT ANYMORE!

What's best is how I caught them thanks to MSN, which somehow attempted to crawl my site thru their proxy. When MSN was crawling, their proxy server just passed whatever user agent string it was given, in this case msnbot, thru to my server. Suddenly my server sees msnbot on an IP address that doesn't belong to Microsoft and SNAP! they are busted.
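For the curious, here's a rough sketch of that kind of check, not the actual bot blocker code. It assumes a reverse DNS lookup followed by a forward lookup is how you decide whether an IP "belongs" to the crawler, and the CRAWLER_DOMAINS table, the ".search.msn.com" suffix and the function names are my own illustrative assumptions.

```python
import socket

# Assumed table: user agent token -> hostname suffixes the real crawler
# is expected to resolve to. Extend it for other crawlers you trust.
CRAWLER_DOMAINS = {
    "msnbot": (".search.msn.com",),
}


def claimed_crawler(user_agent):
    """Return the crawler token found in the user agent string, or None."""
    ua = user_agent.lower()
    for token in CRAWLER_DOMAINS:
        if token in ua:
            return token
    return None


def is_spoofed_crawler(ip, user_agent,
                       reverse=socket.gethostbyaddr,
                       forward=socket.gethostbyname_ex):
    """True if the user agent claims a known crawler but the IP fails the DNS check."""
    token = claimed_crawler(user_agent)
    if token is None:
        return False  # not claiming to be a crawler in our table
    try:
        # Step 1: reverse lookup, the hostname must sit under the crawler's domain.
        hostname = reverse(ip)[0].lower()
        if not hostname.endswith(CRAWLER_DOMAINS[token]):
            return True
        # Step 2: forward lookup, the hostname must map back to the same IP.
        return ip not in forward(hostname)[2]
    except OSError:
        # No DNS record at all is treated as a failed check.
        return True
```

Call is_spoofed_crawler(remote_ip, user_agent) on each request; the reverse and forward parameters are only there so the lookups can be swapped out for testing or caching.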

No more Yahoo income for you off my back, your filtering asses are done.

Buh bye, see ya, wouldn't want to be ya!

Thursday, February 02, 2006

Yahoo Blogs Doing a Content Crawl

Not sure what good old Yahoo is up to exactly as this spider hasn't shown up in my spider trap before but Yahoo Blogs is attempting to crawl the content from my RSS Feeds directly. Not sure if they intend to display the content directly to the user or still redirect to my site so I'm not sure I'm letting them step past the RSS feed just yet.

209.191.83.13 Yahoo-Blogs/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/ysearch/crawling/crawling-02.html )

Official Name: crc4.opn.search.mud.yahoo.com
IP address: 209.191.83.13

Maybe it's harmless and I'll let it pass, something to contemplate over a beer.

Why is WaveFire crawling?

Some Canadian consulting company called WaveFire has a bot trying to crawl for reasons not divulged on their website. Would've been nice if they posted something about what purpose they had in attempting to crawl sites.

64.141.15.109 Wavefire/0.8-dev (Wavefire; http://www.wavefire.com; info@wavefire.com)

Official Name: search-d-02.internal.wavefire.ca
IP address: 64.141.15.109

Sorry, but your fire was put out and your spider was splattered.

Wednesday, February 01, 2006

Expanding Bot Blocker to More Sites

This week I'm going to install my bot blocker on 2 additional websites and see what level of abuse these sites are taking just to see what kind of an impact this technology could have on smaller less active web sites.

Don't get your tits in a twist just yet as it's still not converted to PHP and is still just a prototype nowhere near being a product for release.

On the more amusing side the message my site pops up when scrapers hit telling them they've been stopped has started showing up in the search engines as these automated scraping idiots haven't realized they're showing the world just how stupid they are.

What's more telling is which search engines show which sites with these messages vs. others that don't as I'm getting some insights into what some search engines are blocking as spam sites.

Blasphemers use IncrediBILL in VAIN!

Someone is using my name in vain to describe someone else in some big embittered bullshit battle about some religious horseshit.

Just what I need, now a bunch of idiots [you know who you are] will think I'm that person.

Fucking lovely.

Bend Over When Upgrading WebCeo

Finally decided to upgrade to the latest WebCeo and it did pop up a message about "losing reports" in that all existing reports should be printed before upgrading etc.

OK, big whoop, didn't need the old reports, couldn't care less.

What that little message DIDN'T SAY was you would lose all projects, all profiles, and all that stuff and didn't even bother trying to automatically upgrade the data from the previous version leaving me to think all I was going to lose was the ranking information in the reports.

The new WebCeo version is now installed and pops up completely blank.

FUCK!

WebCeo needs to make those warnings a little more specific:

  • You'll lose all reports
  • You'll lose all projects
  • You'll lose all profiles
  • You'll lose EVERY FUCKING THING

Not that this was a complete crisis. WebCeo doesn't [didn't] have any easy way of exporting my keyword list, so I maintained all the keywords for the ranking reports externally in a set of separate text files, which worked out in the end as I just cut and pasted the whole list of keywords back into the projects as I recreated them.

Back in business but the next time they release a major upgrade I may be tempted to try a different product instead as this was not amusing.

Excruciating Back Pain

Pulled a muscle the other day which resulted in the brief hiatus from the blog as sitting in my desk chair long enough to write a rant was just too painful and I was in too much pain to use the laptop even as looking down pulled the muscle and hurt so I caught up on television watching instead.

BTW, when I'm in pain the level of cursing escalates to a new plateau so the blog may become TV-MA rated over the next few days until my back gets better as I need to release all this pent up hostility somewhere.

You've been warned so brace for impact.

The Ultimate Online Pharmaceutical

This is the topic of a bazillion spams every week which raises the question of which one of you limp dick assholes out there buys pecker pills via these mail order bullshit spams?

They wouldn't keep spamming me with this fucking garbage if one of you limp mother fuckers weren't buying these goddamn pills so just step forward so I can cut your fucking dick off and end my suffering thru these spams as a bloody stump doesn't need viagra, it needs a bandage.

Just ask your doctor for the pills you impotent bastards and leave me the fuck out of this.

Cache Hysteria Pandemic

WARNING - STRONG LANGUAGE ABOUT FUCKING MORONS

Everyone has lost their fucking mind on this topic and the fact that Yahoo and MSN have page cache doesn't matter as the only search engine being bashed over this topic is Google.

The world is not flat and the earth DOES NOT revolve around Google.

The non-stop "Google Google Google Google" mantra is getting old and you're all starting to sound like a bunch of fucking narrow minded cultists so get a life, get a clue, open your damn eyes and look around.

The best nonsense argument I keep hearing is that the less technically savvy authors out there shouldn't have to learn how to disable cache, their pages should be opted out by default, blah blah blah wasting precious air spouting bullshit. OK, in a utopian world perhaps being cached would be opt-in but it isn't so just get past it and quit dwelling in fantasyland you delusional twits! Any author that can figure out how to get his content online and can set up meta tags for description and keywords can insert a fucking line to disable cache!

Holy fuck!

Have you all lost your fucking minds in that you would rather sit and whine for months and years about pages being cached instead of just inserting that one line into your template?

Got thousands of pages?

Maybe someone can get off their lazy ass and whip out a simple server side script that will update your entire httpdocs folder on the server inserting the no cache directive in all web pages.
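Here's a minimal sketch of such a script, under a couple of assumptions: that "the line" is the robots noarchive meta tag (the usual way to ask search engines not to show a cached copy), and that your pages are static .html/.htm files with a normal head tag. The function names and the folder path in the usage note are made up for illustration, so back up the folder before letting anything like this loose on it.

```python
import os
import re

# Assumed "no cache" line: the robots noarchive meta tag.
NOARCHIVE = '<meta name="robots" content="noarchive">'

# Matches <head> or <head ...attributes>, but not other tags starting with "head".
HEAD_RE = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def add_noarchive(html):
    """Return html with the noarchive tag inserted right after the opening head tag."""
    if "noarchive" in html.lower():
        return html  # already opted out, leave the page alone
    match = HEAD_RE.search(html)
    if match is None:
        return html  # no head tag found, nothing safe to do
    return html[:match.end()] + "\n" + NOARCHIVE + html[match.end():]


def update_folder(root, extensions=(".html", ".htm")):
    """Rewrite every matching file under root in place; return the changed paths."""
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # surrogateescape round-trips bytes that aren't valid UTF-8;
            # newline="" keeps the file's original line endings untouched.
            with open(path, encoding="utf-8", errors="surrogateescape", newline="") as f:
                original = f.read()
            updated = add_noarchive(original)
            if updated != original:
                with open(path, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
                    f.write(updated)
                changed.append(path)
    return changed
```

Run it as update_folder("/path/to/httpdocs") and it hands back the list of files it touched. Running it a second time changes nothing, since pages that already contain noarchive are skipped.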

How about fixing the tools used by the legions of technically dim-witted, like Mambo/Joomla, WordPress, FrontPage, DreamWeaver, etc., and making the cache opt-out the default in all the pages they generate?

Or just drop that line in your template, all the templates should have it, you hear me template designers? Take those iPod ear buds out of your waxy ears and listen up - ADD IT TO YOUR FUCKING TEMPLATES!

Lazy whiny assed shitheads, you're really getting on my nerves.

Stop bitching and moaning and just FIX YOUR FUCKING SITES and the cache argument would go away on its own in a few months and be a moot point.

Fucking lamos.

Monday, January 30, 2006

A Banner Exchange - How 90s

Well, I know what the competition was up to now as they just launched a banner exchange network.

Come on people!

Banner exchanges didn't work when they first came out as people with traffic didn't need them and people without traffic couldn't generate the hits to bring in the traffic, so it was always a waste of time.

Lame lame lame.

Should've saved that banner ad revenue and used it for beer.

Sunday, January 29, 2006

CT Scans from Hell

Have you ever had the joy of a CT Scan?

Had one this morning and it's a lovely experience when they strap your ass to a small plank and zip you back and forth thru a nuclear reactor, not to mention all the radioactive dye they inject into your veins to see certain stuff.

I'll bet if I jerked off in the bathroom with the lights off right now my sperm would glow in the dark.

Today was a tale of two arms.

The nitwits in charge couldn't seem to find a vein to inject the dye into today which was probably because I downed a bunch of whiskey last night in my usual 'night-before-CT' anxiety drink-a-thon which dehydrates my dumb ass and thus makes the veins harder to spot.

Oh well, let them earn their pay to torture me.

They recently introduced paper underwear to put on during the scans so those of you that don't wear any crotch covers in the first place don't have your goodies waving in the wind for everyone in the room to see.

About 2 years ago I remember another CT Scan where the nurse yelled at me "close your legs, I don't need to see that" to which I replied "I'm wearing black briefs so all you're seeing is pink leg but if you really want I can whip it out for your viewing pleasure".

She shut the fuck up.

Bet she flunked anatomy and I'm sure she isn't getting laid when she can't tell thigh from ball.

Stupid stupid nurse, get your tubes tied, stop now before you pollute the gene pool.