My first post about the emergence of Attributor was about a year ago and I thought it was time to review and see what we've learned since then.
Here's where we've spotted them crawling from:
184.108.40.206 "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Proxy VIA=1.1 ind27.attributor.com:3128 FORWARD=10.50.40.74
220.127.116.11 "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Proxy VIA=1.1 ind25.attributor.com:3128 FORWARD=10.50.40.74
18.104.22.168 "Mozilla/5.0 (compatible; dejan/1.13.2 +http://www.attributor.com)"
22.214.171.124 "Mozilla/5.0 (compatible; dejan/1.13.2 +http://www.attributor.com)"
Now the amusing part is the IP 126.96.36.199, as it's a proxy, and it appears they aren't any smarter than the rest of the bots: 340+ crawls have come via that IP this year, including msnbot, Googlebot, Twiceler, Gigabot, Snapbot and some others. So you're in fine company with the other stupid crawlers out there.
What's curious is that 188.8.131.52 is one of Gigablast's IPs. Some of the others may be as well, since they resolve to Level 3 blocks just as other Gigablast IPs do, but I didn't look hard enough to confirm. Lazy, I guess.
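For anyone who wants to spot this kind of traffic in their own logs, here's a rough sketch in Python of the two tells visible in the entries above: the attributor.com URL in the user agent, and an attributor.com hostname in the proxy's Via value. The function names and the regular expression are my own illustration, not anything Attributor publishes, and it assumes the Via value looks like "1.1 host:port" as it does in my logs.

```python
import re

# Assumed shape of one Via hop: "<protocol-version> <host>[:port]",
# e.g. "1.1 ind27.attributor.com:3128" as seen in the log lines above.
VIA_HOP = re.compile(r"^\s*\S+\s+([A-Za-z0-9.-]+)(?::\d+)?")


def via_hosts(via_header):
    """Return the lowercased hostnames named in a Via header value."""
    hosts = []
    for hop in via_header.split(","):
        match = VIA_HOP.match(hop)
        if match:
            hosts.append(match.group(1).lower())
    return hosts


def looks_like_attributor(user_agent, via_header=""):
    """Hypothetical check: True if the UA or the proxy chain names attributor.com."""
    if "attributor.com" in user_agent.lower():
        return True
    return any(
        host == "attributor.com" or host.endswith(".attributor.com")
        for host in via_hosts(via_header)
    )
```

Note that the first two log entries above only give themselves away through the proxy hostname, since the user agent is a plain MSIE string. A crawler that hides both would sail right past a check like this, which is why it's only one signal among several.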
Now let's examine one of my favorite statements on their website:
"...you will no longer have to hold back top content or impose technical barriers on its viewing; instead, quality content can be made more easily available to a larger number of consumers."
Excuse me?
My technical barriers [not used on this blog] stop the problem in the first place, just so I don't need to pay anyone to go chasing my content around the billions of pages on the web. As a matter of fact, my technical barriers are what trapped your crawl attempts above and identified which IPs your bots were using. That means your technology can't get past my technology, so you'll never know if I'm stealing anyone's material, but I'm pretty sure you aren't stealing my bandwidth finding out.
So now you have to ask yourself which method is the easier way to stop content theft: blocking data centers and bulk downloaders on the fly, or scanning billions of web pages looking for theft after the cows have already left the barn?
Bot blocking wins hands down as it's more cost effective without a doubt.
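To give a feel for what "blocking data centers and bulk downloaders on the fly" can look like, here's a minimal sketch in Python. It is not my actual bot blocker; the class name, the thresholds, and the network ranges (reserved documentation ranges standing in for real data center blocks) are all made up for illustration.

```python
import ipaddress
import time
from collections import defaultdict, deque


class BotBlocker:
    """Illustrative only: refuse listed networks and quarantine bulk downloaders."""

    def __init__(self, blocked_networks, max_hits=30, window_seconds=60.0):
        # blocked_networks: CIDR strings for hosting/data center ranges (assumed input)
        self.networks = [ipaddress.ip_network(n) for n in blocked_networks]
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)   # ip -> timestamps of recent requests
        self.quarantined = set()         # ips that exceeded the rate limit

    def allow(self, ip, now=None):
        """Return True if the request should be served, False if blocked."""
        now = time.monotonic() if now is None else now
        addr = ipaddress.ip_address(ip)
        if ip in self.quarantined or any(addr in net for net in self.networks):
            return False
        recent = self.hits[ip]
        recent.append(now)
        # Drop timestamps that have aged out of the sliding window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) > self.max_hits:
            self.quarantined.add(ip)     # stays blocked until someone reviews it
            return False
        return True
```

Usage would be a single call per request, for example `BotBlocker(["192.0.2.0/24"]).allow(remote_ip)`, before any content is served. The point is that the decision costs a set lookup and a few comparisons at request time, instead of a crawl of the whole web after the fact.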
The best part is that if someone wants to license your content, you'll get 100% of the profits and won't have to share them with some company that wants to chase around the vast wasteland of the web looking for violators.
Maybe Attributor has some other places they crawl from without the user agent identifying the source, but that just means the bot blocker will stop and quarantine some anonymous IP address and we may never know it's really them.
Doesn't matter. I'm still banking on proactive content theft prevention technology and not reactive technology, as it's easier to keep your cows at home when the fences are all closed and patrolled in the first place than to try to round 'em up later.