Wednesday, November 05, 2008

Temporarily Block HotLinking To Find Copyright Abusers

Blocking hotlinks is usually considered a method used to conserve bandwidth and stop leeching of images off your server. However, you can also use hotlink blocking to quickly and easily find all those sites using your content.

The most common solution for Linux servers running Apache is to add the following hotlink blocking code to your .htaccess file.

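RewriteEngine On
# Let requests with an empty referrer through (direct requests, clients that send none)
RewriteCond %{HTTP_REFERER} !^$
# Let requests referred by your own domain or any of its subdomains through
RewriteCond %{HTTP_REFERER} !^http(s)?://(.*\.)?yourserver\.com [NC]
# Answer every other image request with 403 Forbidden
RewriteRule \.(jpeg|jpg|gif|png)$ - [F]

Change yourserver.com to your own domain name before adding this to your .htaccess file, keeping the backslash before the dot so it matches a literal dot.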

Once you've added this code, the fun begins: sit back for a few hours and wait for the "403 Forbidden" responses to start filling up your access log file.

A simple grep on your log file will then generate a nice list of sites, shown in the referrer field, that are hotlinking your images or, as is often the case, doing something much worse.

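grep "\.jpg" access_log | grep " 403 "

grep "\.gif" access_log | grep " 403 "

In each command, the first grep picks out the lines that mention that file type (".jpg" or ".gif"), then the second grep keeps only the lines with a " 403 " forbidden status. Repeat with "\.jpeg" and "\.png" to cover the other extensions in the rewrite rule.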
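
If you'd rather not eyeball raw log lines, here is a rough sketch that rolls the filtering above into one pass and counts the blocked image requests per referring host. It assumes your access log uses Apache's "combined" format, where the referrer is the second quoted field on each line. The function name hotlink_referrers is made up for illustration, so treat this as a starting point and not a finished tool.

```shell
#!/bin/sh
# Sketch: list hosts whose pages triggered 403s on image requests.
# ASSUMPTION: Apache "combined" log format. Splitting a line on double
# quotes then gives: $2 = request line, $3 = " status bytes ", $4 = referrer.
# hotlink_referrers is a hypothetical helper name, not a standard tool.
hotlink_referrers() {
    log_file="${1:-access_log}"
    awk -F'"' '
        # keep only image requests that were answered with 403
        $2 ~ /\.(jpe?g|gif|png)([?][^ ]*)? HTTP/ && $3 ~ /^ 403 / {
            # reduce the referrer URL (scheme://host/path) to its host name
            split($4, parts, "/")
            if (parts[3] != "") count[tolower(parts[3])]++
        }
        END {
            for (host in count) print count[host], host
        }
    ' "$log_file" | sort -rn
}
```

Calling hotlink_referrers access_log prints one line per referring host, each prefixed with its number of blocked requests, busiest first. To see the offending pages and not just the hosts, go back to the matching raw log lines.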

After a day or two you'll have a nice list of sites to send C&Ds, DMCA notices, and all sorts of fun stuff.

Now disable your hotlink blocking script or remove it from your .htaccess file.

Why disable hotlink blocking?

Because hotlink blocking encourages people to actually download your images, which makes finding stolen images way more difficult. A temporary hotlink block therefore exposes everyone who is hotlinking just long enough for you to take corrective measures; then you leave your site wide open again and wait for the next batch of idiots to start hotlinking.

Hope a few of you find this little tip handy!

Monday, November 03, 2008

Pubcon '08 and Other Announcements

I'll be presenting at PubCon '08 on the topic of Competitive Intelligence. The only difference is the other panelists will be discussing how to find competitive intelligence while I'm telling people how to protect themselves from such research.

Also, keep your eye on this space:

http://twitter.com/CrawlWall

All upcoming announcements will be made via Twitter, and there's a bunch coming soon.

It's what you've been waiting for...