Someone posting on Freedom to Tinker as Neo echoed Greg Yardley's post, claiming my bot blocking endeavors are going to stop tinkerers and end innovation on the web, which is patently untrue.
The only thing my bot blocker is going to do is give any webmaster, even a non-technical neophyte, tools to monitor and control access to their site that are easy to understand and administer. No more cryptic crap. The software will show them what's accessing their site so they can make informed decisions about what should crawl and what shouldn't. That's what it's all about: knowledge. Knowledge is power, and it gives the webmaster the upper hand.
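To give a rough idea of what "showing them what's accessing their site" looks like, here's a bare-bones Python sketch, not the actual bot blocker code; the combined log format, the file name and the top-20 cutoff are assumptions for illustration only:

```python
# Bare-bones sketch, not the bot blocker itself: boil a combined-format
# access log down to who is hitting the site, from where, and how often.
# The log path and field layout are assumptions for illustration.
import re
from collections import Counter

LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def summarize(access_log_path):
    """Count requests per (IP, User-Agent) pair from the access log."""
    counts = Counter()
    with open(access_log_path) as log:
        for line in log:
            match = LOG_LINE.match(line)
            if match:
                counts[(match.group("ip"), match.group("ua"))] += 1
    return counts

if __name__ == "__main__":
    # Show the heaviest visitors so the webmaster can make an informed
    # allow-or-deny call instead of guessing.
    for (ip, ua), hits in summarize("access.log").most_common(20):
        print(f"{hits:6d}  {ip:15s}  {ua}")
```

The real software does the bookkeeping for you, but even this much is enough to see who's hammering a site and with what user agent.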
I'm not the only one blocking everything, either: Brett Tabke of WebmasterWorld blocked all crawling for a while just to see what was bouncing off his firewall. What Brett ultimately decided to do was require logins from visitors coming from bad internet neighborhoods. Since most websites don't have logins and subscriptions, my solution is to throw up a captcha when bad behavior happens.
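For anyone curious about the mechanics, the captcha-on-bad-behavior idea boils down to something like this Python sketch. The rate threshold, the one-minute window and the "CAPTCHA"/"PAGE" stand-ins are placeholders I made up for illustration, not settings from any real package:

```python
# Minimal sketch of "captcha when bad behavior happens", assuming a simple
# in-memory per-IP hit counter. Threshold, window and the return values are
# made-up placeholders.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60        # look at the last minute of traffic
MAX_HITS_PER_WINDOW = 120  # more than two pages a second looks like a bot

hits = defaultdict(deque)  # ip -> timestamps of recent requests

def behaving_badly(ip):
    """Record this hit and report whether the IP has exceeded the rate limit."""
    now = time.time()
    recent = hits[ip]
    recent.append(now)
    while recent and now - recent[0] > WINDOW_SECONDS:
        recent.popleft()
    return len(recent) > MAX_HITS_PER_WINDOW

def handle_request(ip):
    if behaving_badly(ip):
        return "CAPTCHA"   # challenge the visitor before serving anything else
    return "PAGE"          # normal traffic passes straight through

if __name__ == "__main__":
    # Simulate a scraper hammering the site: the first requests sail through,
    # then the captcha kicks in once the threshold is crossed.
    result = None
    for _ in range(130):
        result = handle_request("203.0.113.7")
    print(result)  # -> CAPTCHA
```

Normal browsing rarely trips a threshold like that, so humans barely notice it; a scraper ripping hundreds of pages a minute hits the wall almost immediately.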
Yes, I'll admit I'm on a tear and block everything under the sun, but there's a real purpose to my madness: feeding bread crumbs to the rest of the creepy crawlers hitting my site so I know who they are, where they came from, and where the content turns up when it gets indexed by search engines.
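In case the bread crumb trick sounds mysterious, here's a simplified sketch under my own assumptions (the salt, the hashing scheme and the JSON file are illustrative only): derive a unique token per crawler visit, remember who got it, and tuck it into the page you serve them. If that token ever shows up in a search engine's index, you know exactly which crawler carried your content there.

```python
# Simplified bread-crumb sketch; the salt, the hashing scheme and the JSON
# file are illustrative only.
import hashlib
import json

SECRET = "replace-with-a-private-salt"  # kept server-side, never in the page

def breadcrumb(ip, user_agent, date):
    """Derive a short, stable token identifying one crawler visit."""
    digest = hashlib.sha1(f"{SECRET}|{ip}|{user_agent}|{date}".encode()).hexdigest()
    return "bc" + digest[:10]

def record(ip, user_agent, date, path="breadcrumbs.json"):
    """Remember which token went to which visitor so it can be looked up later."""
    token = breadcrumb(ip, user_agent, date)
    try:
        with open(path) as f:
            seen = json.load(f)
    except FileNotFoundError:
        seen = {}
    seen[token] = {"ip": ip, "ua": user_agent, "date": date}
    with open(path, "w") as f:
        json.dump(seen, f, indent=2)
    return token

if __name__ == "__main__":
    # Tuck the printed token into the page served to that visitor, e.g. as an
    # innocuous word in the footer, then search for it in the engines later.
    print(record("203.0.113.7", "Java/1.5.0_03", "2006-10-10"))
```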
However, I don't intend to enforce my particular brand of blocking on everyone who decides to use my bot blocker, as one size doesn't fit all. The software has lots of options the webmaster can set and, assuming the webmaster checks the control panel now and then, it shows what new things are crawling the web and lets them grant or deny access.
I don't foresee my bot blocker bringing about Neo's or Yardley's apocalyptic vision of the web whatsoever, but I do foresee the following changes:
- New bots and tinkerers might just have to ask the network of bot blockers for permission first to get access, which is not a big deal and easily done.
- Sloppy bots will go away or get fixed once they're stopped from doing dumb things.
- User agents will be unique per site or per piece of software, no more generic Java/1.5.0_03; bot operators can either learn how to set the UA (see the sketch after this list) or stay off the net.
- Good scrapers, the ones that scrape for directories and actually provide real links back to sites, will need to identify themselves or go away.
- Bad scrapers will be in serious jeopardy as the scraping noose closes.
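About that user agent point: setting a real UA is trivial, so there's no excuse for the default Java/1.5.0_03. Here's a quick example using Python's standard urllib; the bot name and the info URL are invented for illustration:

```python
# Setting a descriptive User-Agent with Python's standard urllib. The bot
# name and the info URL are invented for illustration.
import urllib.request

request = urllib.request.Request(
    "http://www.example.com/",
    headers={
        # Name the bot, give it a version, and point to a page explaining it.
        "User-Agent": "ExampleDirectoryBot/1.0 (+http://www.example.com/bot-info)"
    },
)
with urllib.request.urlopen(request) as response:
    html = response.read()
```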
Of all of these, it's the bottom-feeding scrapers and spammers that will be in the most serious trouble, and we may see botnets emerge to do the bidding of the nastiest of the crawlers.
OOOPS!
Too late, botnets already exist and other groups are actively fighting them.
So what am I missing? What else will bot blocking technology bring about?
Oh yes, the return of MANNERS, COURTESY and RESPECT FOR COPYRIGHT, which means asking permission and being OPT-IN, not just taking what you want regardless of the webmaster's wishes.
When you ask to crawl my site it's a business arrangement: you want to build a business, so you ask MY PERMISSION before including my site in it.
This is how it works in the real world.
If you want to do business with someone, you have to ask first.
It would appear that many think respect and courtesy aren't part of the Internet, and that sense of entitlement to content just because it's on a PUBLIC NETWORK is flat wrong.
Walmart is technically a public place; anyone can just walk in the door. But if you walked into Walmart and did what most scrapers do on the web, they would call the cops and haul your ass off to jail. Before you respond that Walmart is a private company: even the public library frowns on people doing what scrapers do, with signs posted above the copying machines warning you about copyright and limiting you to copying small quantities for personal use only.
I'm just giving webmasters the same control Walmart has:
NO SHIRT. NO SHOES. NO SERVICE.
Pretty simple.
Webmasters will be able to control their sites as much as technology allows. If we get to the point Neo suggests, where every visitor has to enter a captcha before accessing any website, I suspect some legislation will follow making crawling without permission an offense. The Australians are already working on legislation along those lines; it's flawed, but they're heading in that direction.
I'm just making the tool, not telling people how to implement it.
How this all plays out is up to the internet, webmasters and politicians, not me.