Saturday, January 07, 2006

Sirius-ly Folks, Stern Rulez

Since Howard Stern left the airwaves, my wife has been having withdrawal like a junkie. She finally realized that not only is Howard missing during her morning commute, but the Best of Stern isn't on either. After listening to Stern for more than 15 years, and since she doesn't drink coffee, she decided she needed her morning fix of Stern.

Today, 2 days before he premieres on Jan 9th, we run around from store to store like assholes trying to buy a last minute Xmas present because everyone is sold out of all the good Sirius models and the home docking stations. Nobody had anything but the crappiest cheap Sirius models my wife didn't want: not Circuit City (more coming Tuesday!), not Best Buy, not even our neighborhood Target, which said they had them when we called. We get there to find out the pimple headed freak didn't know shit from shinola, as all they had in stock was XM.

Whoever said the public is over Stern and nobody would follow him to satellite is out of their fucking minds as these stores told us everyone was making a run on these Sirius radios just to hear Howard next week.

Finally, we find that Fry's Electronics in Palo Alto still has a couple of limited models in stock and only the car model, not the home docking station, so we make a mad dash down 101 to get to Fry's before they're sold out as well and walk out semi-satisfied with a unit for the car.

Now for the fun part!

It's night and quite dark outside when wifey heads out to install/activate her new Sirius unit while I'm sitting here banging away on the keyboard. About 10 minutes later she comes fuming back into the house with the Sirius and all the wires and shit going "This unit is dead, doesn't even work out of the box".

For some reason [self preservation?] it seemed like it was time to get involved so I go over and examine everything she's brought with her back into the house and something odd catches my eye in that there are TWO car adapter jacks and one of them ends with the connector for her cell phone.

Tempting fate "Um, honey, did you try to actually plug this in or were you trying to charge your phone?" and with that "OK, let me try it." grabbing all the unit and cables, sans the phone charger, and heading to my car.

Plugged everything in and it powered up first time!

Amazing how it works when you actually have the right adaptor plugged into the car.

What was even more amazing is the Sirius really did what they claim and worked first time with no problems transmitting over the FM radio in the car or directly via a cassette adapter.

We went out for a quick 30 minute drive up in the hills (mountains for most of you) and down by the bay and the reception was flawless in all terrains we tested in that short jaunt so we were impressed.

Now we'll stay tuned for the Monday broadcast of Stern and see what happens!

Bot Blocking Blacklist vs Whitelist

Since launching my scraper stopper all sorts of little bots have been caught and categorized with the number of them simply staggering and it doesn't seem to stop with new ones popping up daily.

My initial pass at blocking bots was an automated snare that stopped them in real time, then let me review each one and install a permanent IP block if I wanted, or simply add the crawler's user agent string to my blacklist and block any occurrence from any IP, or all of the above.

Then I had an epiphany: this blacklist approach is simply too much work. There are just too many bots out there for any one person to sort out and ban, even with the assistance of automation to stop them in real time.

My new approach, sweet and simple, is just the opposite: a whitelist. Now all bots are banned by default, and only the ones I deem worthy get added to the whitelist after the fact.

Compare the difference in the approach:

  • Previously blacklisted bots: in the hundreds and growing
  • Currently whitelisted bots: fewer than 10

So it seems to me that the best policy is to block all, log the user agent strings requesting access, and only let in the ones you want. Do it the other way around and you'll just be spinning your wheels messing with every new scraper or Beta search engine (and there are a lot) to hit the internet.
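
For the curious, here's a rough sketch of the block all, log, and whitelist policy in Python. This is not my actual scraper stopper code. The function names and the whitelist entries are made up for illustration, and it assumes something upstream (like the snare) has already decided the visitor is a bot and not a person in a browser.

```python
from collections import Counter

# Hypothetical whitelist: lowercase substrings of the user agent strings
# for the handful of crawlers deemed worthy. These entries are examples only.
WHITELIST = ("googlebot", "slurp", "msnbot")


def is_whitelisted(user_agent: str, whitelist=WHITELIST) -> bool:
    """Return True if the user agent contains any whitelisted token."""
    ua = user_agent.lower()
    return any(token in ua for token in whitelist)


def admit_crawler(user_agent: str, denied_log: Counter, whitelist=WHITELIST) -> bool:
    """Default-deny: let in whitelisted crawlers, log and block everything else."""
    if is_whitelisted(user_agent, whitelist):
        return True
    # Count every blocked user agent so it can be reviewed later and,
    # if it turns out to be worthy, added to the whitelist after the fact.
    denied_log[user_agent] += 1
    return False


if __name__ == "__main__":
    denied = Counter()
    for ua in ("Googlebot/2.1", "SomeNewScraper/0.1", "SomeNewScraper/0.1"):
        print(ua, "->", "allowed" if admit_crawler(ua, denied) else "blocked")
    print(denied.most_common())
```

The whole point is that the list you maintain is the short one. Keep in mind that user agent strings are easy to fake, so a check like this only sorts out the bots that identify themselves honestly.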

Friday, January 06, 2006

A Tad Pompous

Here's a little office mind fuck I did about 15 years ago that is still somewhat amusing.

One of the software engineers I'll call "John Jingleheimer" [not his real name] just got his new name placard for his cube.

I walked up while he was sitting at his desk and read it out loud for all to hear "JOHN J. JINGLEHEIMER, PHD" then just to mess with his mind muttered "Well, isn't that just a little pompous" and walked off.

A week or so later I walked by his cube and the name plate just said "JOHN JINGLEHEIMER" and I stopped dead in my tracks, barely able to contain myself, and said "John, why did you change your name plate?"

John says "I overheard your comment and thought you were right that brandishing my PhD on my name plate was a bit pompous"

To which I replied "Hell John, you earned that PhD and should proudly display it. I just thought that middle initial 'J' sounded a bit snooty!" and walked off leaving John gape mouthed realizing that he'd been set up.

Sometimes I'm just a bad boy.

Thursday, January 05, 2006

Massive Search Engine Bounce Back!

Mind-blowing results: my positions and traffic in the search engines seem better than ever since fixing that little BUG that almost de-listed my site entirely. It's very odd seeing traffic rise above its previous levels as my pages rapidly return to all their former positions, plus some new bonus positions.

More importantly, for some reason this morning AdSense seems to be giving me $1/click so it's possible smart pricing was reset when the Google Mediabot couldn't crawl the site for a few days.

Kind of a web based version of a high colonic seems to have cleansed the search engines.

Maybe de-listing a few thousand pages every now and then is good for a web site?

Don't think we'll be testing that theory soon!

Tuesday, January 03, 2006

Competitor Pulls Better Fiasco

Sending blank pages to the Big 3 search engines like I did was admittedly bad, but what I saw on a so-called competitor site tonight is priceless:

Error

Sorry, you have reached either a non-existent site or the site has been suspended (or deactivated) due to Disk Space Violation / Exceeded.

Site Owner: Please refer to this help page for information.

Makes me smile, as this crap-of-the-month-club web site is the only one I've ever complained about to Matt Cutts. I don't mind others being ahead of me in the SERPs, it happens, all's fair in love and war, but this crap site appears to be at the top of the heap only by gaming DMOZ.

Just what I wanted to hear: that someone got to the top by putting spamming ad-ridden garbage at the top of a major keyword using that corrupt so-called authority site.

That's ok, they can't be making nearly as much money or they wouldn't appear to be on shared web hosting with a disabled site.

Looks like I get to do some of the giggling this time ;)

Monday, January 02, 2006

Bug Eats Website, Film @ 11

File this little post under the category "So Embarrassing He Should Keep His Mouth Shut" or maybe more correctly titled "IncrediBILL makes IncrediBLUNDER".

A very minor bug in my scraper stopper just about sent me to the poor house as it was inadvertently blocking Google, Yahoo and MSN because of a very minor programming flaw.

I had cut and pasted some code and it said:

    if( isGoogle() )
        exit();

Instead of:

    if( isGoogle() )
        return;

The exit() killed the whole script on the spot instead of just skipping the scraper checks, meaning that my entire database of pages was being returned BLANK, with no content at all, whenever the search engines crawled. Sure enough, all three search engines have already gobbled up all these blank pages.
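
If the difference isn't obvious, here's a little Python sketch of the same blunder. It's an illustration only, not the real code: is_google(), render_page() and serve() are hypothetical stand-ins, and sys.exit() plays the part of exit().

```python
import sys


def is_google(user_agent: str) -> bool:
    # Hypothetical stand-in for the isGoogle() check in the snippet above.
    return "googlebot" in user_agent.lower()


def scraper_check_buggy(user_agent: str) -> None:
    if is_google(user_agent):
        sys.exit()  # BUG: ends the entire script, so no page is ever built
    # ... the scraper checks for everyone else would go here ...


def scraper_check_fixed(user_agent: str) -> None:
    if is_google(user_agent):
        return  # skip the scraper checks, let the page render as usual
    # ... the scraper checks for everyone else would go here ...


def render_page(user_agent: str, check) -> str:
    check(user_agent)
    return "<html>page content from the database</html>"


def serve(user_agent: str, check) -> str:
    """Return what the visitor receives; a script that exits early sends nothing."""
    try:
        return render_page(user_agent, check)
    except SystemExit:
        return ""


if __name__ == "__main__":
    print(repr(serve("Googlebot/2.1", scraper_check_buggy)))
    print(repr(serve("Googlebot/2.1", scraper_check_fixed)))
```

Regular visitors never hit the exit branch, which is why the site looked perfectly fine to me while the search engines were being fed blanks.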

This would've gone unnoticed for a long time except I went looking for a page reference to compare with something and it was blank, so I checked all page references from the site and all the static pages still had text and all the database pages were blank.

OOOOPS!

Looked at the code, fixed it real quick and we'll see if everything is back to normal in a week after they've all been crawled again.

In the meantime, I bought a box of Depend as a prophylactic measure to save my chair in the event the worst happens.

Dogpile of Shit

Dogpile was never terribly useful in my opinion but I somehow ended up on their site today for some odd reason and just about fell off my chair when their top results included ads as if they were actual search results.

Stunned and shocked, there I sat staring at the top search results, shown with the source listed as "[Found on Ads by Yahoo!]", like these paid results had a damn thing to do with what was actually relevant except that the advertiser targeted that particular keyword.

It's one thing when search engines put ads at the top of the results, as each one is clearly identified as a sponsored link, but when meta-search engines just mix the ads in as a matter-of-fact part of their results, then the concept of meta-search has completely jumped the shark.

Sorry Dogpile, but you've fetched your last bone for me.