Saturday, September 23, 2006

Search Engine Spammers Extraordinaire

OK, these idiots made the classic mistake of scraping one of MY pages, so they're about to get outed in a massive way. Unfortunately, in this case I didn't get an IP address, and my content was already missing from their site thanks to the slow crawl and index of MSN, but a little research proved this was a HUGE operation of mind-blowing proportions.

I got bored checking all the domains: some are hosted in the same place, some aren't, and there are too many to look at, but it's all spam. Perhaps it's the same person, or perhaps a bunch of idiots running some automatic website-generating tools.

The sites tend to come in 3 flavors: AdSense-monetized articles, AdSense-monetized scraper sites (scroll WAY down), and AdSense + Shareasale sites.

Just search for the phrase "When we had a difficult think about this project" in Google, Yahoo or MSN and you'll see a shitload of pages from these search engine spammers.

Also, try a search for the phrase "Foraging for the best file on" in Google, Yahoo and MSN and see more shitloads of pages.

You can see all sorts of key phrases these sites repeat, and bust more and more of them with searches like "Everyones path is incomparable and everyone" on Google or Yahoo.

And even more shit like "If you've worked with a portal" on Yahoo.

Someone noticed their terms were hijacked in these bullshit pages and blogged about their suspicions on what's going on.

Seriously though, I bet I could write a script to identify and locate all the bullshit spammers from their common phrases, since they're so easy to spot once you have a data sample like this to analyze.
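A rough sketch of what such a script might look like, in Python. The phrase list is just the four phrases quoted in this post; the function names and the one-hit threshold are made up for illustration, and fetching the pages is left out entirely.

```python
import re

# Boilerplate phrases quoted in the post. The list and the default
# threshold below are illustrative assumptions, not a tuned detector.
SPAM_PHRASES = [
    "When we had a difficult think about this project",
    "Foraging for the best file on",
    "Everyones path is incomparable and everyone",
    "If you've worked with a portal",
]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks don't hide a phrase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_spam_phrases(page_text: str) -> list[str]:
    """Return the known boilerplate phrases that occur in page_text."""
    haystack = normalize(page_text)
    return [p for p in SPAM_PHRASES if normalize(p) in haystack]


def looks_like_spam(page_text: str, min_hits: int = 1) -> bool:
    """Flag a page once it contains at least min_hits known phrases."""
    return len(find_spam_phrases(page_text)) >= min_hits


if __name__ == "__main__":
    sample = "Foraging for the best\nfile on cheap widgets? Look no further."
    print(find_spam_phrases(sample))
    print(looks_like_spam("A perfectly ordinary page about gardening."))
```

Exact substring matching is the crudest option that could work here; it only catches phrases already collected by hand, so the list would have to grow as new boilerplate turns up.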

Spam, spam, fucking spam, and not so smart fucking spammers.

2 comments:

Anonymous said...

The big question is: why aren't Google, MSN, etc. blocking this stuff? It ain't hard to find.

I presume you have reported a few of these webspam instances to Google via their Webspam form so they can get to work.

It is easy to build up a reasonable map of sites connected to these webspam networks, via whois and other tools. Team Google should have no trouble.

IMHO, some of the domain registrars involved also are at fault here. They should shut these spam domains down, because the domain contact information is clearly bogus for most of them.
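The kind of map this comment describes could be sketched roughly as below, assuming the whois and hosting details have already been collected by hand or with other tools. The `cluster_domains` name, the `ip:`/`email:` attribute labels and all the sample records are hypothetical; the domains and addresses are reserved documentation values, not real spam sites.

```python
from collections import defaultdict


def cluster_domains(records):
    """Group domains that share any attribute value (IP, registrant email, ...).

    records: dict mapping domain -> set of attribute strings.
    Returns a sorted list of sorted domain lists, one per connected group.
    """
    parent = {d: d for d in records}

    def find(d):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_seen = {}  # attribute value -> first domain seen with it
    for domain, attrs in records.items():
        for attr in attrs:
            if attr in first_seen:
                union(first_seen[attr], domain)
            else:
                first_seen[attr] = domain

    groups = defaultdict(list)
    for d in records:
        groups[find(d)].append(d)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = {
        "spam-a.example": {"ip:192.0.2.10", "email:owner@bogus.example"},
        "spam-b.example": {"ip:192.0.2.10"},
        "spam-c.example": {"email:owner@bogus.example", "ip:198.51.100.7"},
        "unrelated.example": {"ip:203.0.113.5"},
    }
    for group in cluster_domains(sample):
        print(group)
```

Shared hosting means a common IP is weak evidence on its own, so in practice the attributes fed in would need some filtering before the groups mean much.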

nanyo said...

Hi,

I found this great blog some days ago. Looking through your posts, I found a lot of useful news :)


I manage a site (under construction but working) where there are many bots, spam, unknown spam and many IPs (http://nanyo.titanchip.com).

In the past week my forum has been under a massive spam attack (viagra, cialis, muscle, etc.).

The strategy is always the same... they create a new user with an ICQ, MSN or web account linked to some other site. If the account is enabled by email, they go into the forum and post a lot of spam on the same topics, or a little message like "nice" or "good work". In this way they propagate the link to their site. If the account is enabled manually by the admin, the new user is still always visible and the spiders can capture it.

Searching my logs, I found the most frequent spam addresses are:

88.212.31.146
203.113.13.3
203.113.13.4
203.113.13.5
209.11.51.35
211.216.247.51
212.45.14.11
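A list like the one above can be pulled out of an access log with a few lines of Python. This is only a sketch: it assumes the client IP is the first whitespace-separated field on each line (as in Apache's common and combined log formats), and the `top_client_ips` name and the sample lines are made up for illustration.

```python
from collections import Counter


def top_client_ips(log_lines, limit=10):
    """Count requests per client IP and return the busiest ones.

    Assumes the IP is the first whitespace-separated field of each line.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)


if __name__ == "__main__":
    # Invented log lines using reserved documentation addresses.
    sample = [
        '192.0.2.1 - - [23/Sep/2006:10:00:00 +0000] "POST /forum/post HTTP/1.1" 200 512',
        '192.0.2.1 - - [23/Sep/2006:10:00:05 +0000] "POST /forum/post HTTP/1.1" 200 512',
        '198.51.100.2 - - [23/Sep/2006:10:01:00 +0000] "GET / HTTP/1.1" 200 1024',
        '192.0.2.1 - - [23/Sep/2006:10:02:00 +0000] "POST /forum/register HTTP/1.1" 200 256',
    ]
    for ip, hits in top_client_ips(sample):
        print(ip, hits)
```

A high request count alone does not prove an address is a spammer, so the output is a shortlist to check by hand rather than a ban list.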

Now they are in my .htaccess :P
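For anyone wanting to do the same, here is a small Python sketch that turns an address list into .htaccess lines. It assumes the older Apache `Order`/`Allow`/`Deny` access directives; newer Apache 2.4 setups use `Require` directives instead. The `htaccess_deny_rules` name is made up, and the addresses are simply the ones listed in this comment.

```python
import ipaddress

# Addresses copied from the list in this comment.
BLOCKED = [
    "88.212.31.146",
    "203.113.13.3",
    "203.113.13.4",
    "203.113.13.5",
    "209.11.51.35",
    "211.216.247.51",
    "212.45.14.11",
]


def htaccess_deny_rules(addresses):
    """Build Order/Allow/Deny lines that block the given IP addresses.

    Each entry is validated first, so a typo raises ValueError instead of
    silently ending up in the config.
    """
    lines = ["Order Allow,Deny", "Allow from all"]
    for addr in addresses:
        ip = ipaddress.ip_address(addr)  # raises ValueError if malformed
        lines.append(f"Deny from {ip}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(htaccess_deny_rules(BLOCKED), end="")
```

Blocking single addresses only holds until the spammer switches to another one, so this is a stopgap rather than a cure.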

P.S. Sorry for my poor English.