Thursday, June 21, 2007

JavaScript Cloaked Spam Pages Baffle Search Engines

Recently I ran across a large series of scraper sites that are the ultimate in cloaking out in the open. The pages I see when I view the source are the same pages cached by the search engines. Nothing special there, so a search engine crawling from outside its known IP range to check for cloaking would see the same page.

However, access those pages with JavaScript enabled and you are instantly redirected to a wide variety of affiliate pages. The trick is that each of these pages has a single embedded link to a heavily obfuscated JavaScript file, and that script is what redirects you to the affiliate pages.

The scraping to build these cloaked pages came from 216.75.15.26, which is in the cari.net IP range:

OrgName: California Regional Intranet, Inc.
NetRange: 216.75.0.0 - 216.75.63.255

It just goes to show that traditional cloaking is a thing of the past, as the war has escalated into obfuscated JavaScript. The only way I see the search engines winning this war is to actually execute that JavaScript and check whether the result takes the visitor away from the page.
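
As a rough sketch of that check, assume a crawler can already load a page twice, once as plain HTML and once in a browser that runs scripts, and record where each load ends up. The comparison itself is then simple. The function and parameter names below are hypothetical, and comparing hostnames is only one possible heuristic, not a description of how any search engine actually works.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def looks_js_cloaked(requested_url: str, url_without_js: str, url_with_js: str) -> bool:
    """Flag a page that stays put without scripts but leaves the site with them.

    requested_url  -- the URL the crawler asked for
    url_without_js -- where a plain HTML fetch ended up (assumed supplied by the crawler)
    url_with_js    -- where a script-executing browser ended up (assumed supplied)
    """
    start = host_of(requested_url)
    static_stays = host_of(url_without_js) == start
    scripted_leaves = host_of(url_with_js) != start
    # A redirect that every client sees is ordinary; one that only
    # script-enabled visitors see is the pattern described above.
    return static_stays and scripted_leaves
```

For example, `looks_js_cloaked("http://spam.example/page.html", "http://spam.example/page.html", "http://affiliate.example/offer")` returns `True`, while a page that ends up at the same host both ways returns `False`. A real check would need more care, since legitimate sites also navigate with scripts.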

It also goes to show that the people who recently claimed in comments here that "Stealth crawling is necessary to keep honest webmasters honest" are out of their league and don't really know what the score is on the web. These sites aren't honest even in plain sight: no stealth needed, they simply worked around it.

Wonder what they'll think up next?