Sunday, November 26, 2006

Huge Made for AdSense Scraper and Spammer Operation Unveiled

The downside of scraping the wrong webmaster is that your websites now contain breadcrumbs that let that webmaster unravel a big chunk of the network of sites you've been using to scrape and spam.

I'm not going to even go into the list of domains I found my scrapings on as it's a huge list and the specific sites I found were all hosted on theplanet.com and 800hosting.net.

Besides, if I expose the list this MFA scraper spammer might figure out how I unraveled his system and we wouldn't want that, now would we?

I'm not even going to bother with the IP they were scraping from or the user agent since it was a spoofed browser UA of course, and the IPs doing the scraping were all from the same hosting companies listed below.

Instead, let's start at the tip of the iceberg with their statistics pages, which list 400-500 sites per page and in total link to roughly 6,500 individual scraper sites, and I'm sure we're just scratching the surface here.

http://www.badhood.info/
http://www.browserbytes.com/
http://www.csprovisions.com/
http://www.inbounders.com/
http://www.jewelrydns.info/
http://www.landingdns.info/
http://www.link-magic.com/
http://www.link-pros.com/
http://www.multithreedns.info/
http://www.multitwodns.info/
http://www.sfte.info/
http://www.terrificdns.com/
http://www.trafficsupply.com/
http://www.virtual-domains.com/

So where do these sites host?

badhood.info 70.87.137.2 -> 2.89.5746.static.theplanet.com

browserbytes.com 74.52.26.194 -> c2.1a.344a.static.theplanet.com

csprovisions.com 74.52.29.2 -> 2.1d.344a.static.theplanet.com

inbounders.com 66.98.156.98 -> evolution.cia.sk

jewelrydns.info 69.41.183.122 -> (800hosting.net)

landingdns.info 66.98.132.73 -> ev1s-66-98-132-73.ev1servers.net

link-magic.com 64.246.60.95 -> rs-64-246-60-95.ev1.net

link-pros.com 64.246.60.50 -> ns1.s810.net

multithreedns.info 74.52.225.194 -> c2.e1.344a.static.theplanet.com

multitwodns.info 74.52.126.130 -> 82.7e.344a.static.theplanet.com

sfte.info 70.87.216.194 -> c2.d8.5746.static.theplanet.com

terrificdns.com 66.98.132.68 -> damon.screaminghost.com

trafficsupply.com 66.98.250.34 -> (Everyones Internet)

virtual-domains.com 66.98.198.44 -> ev1s-66-98-198-44.ev1servers.net

There you go, it could've been a long spew of data, but there's really nothing you need to know except to BLOCK access from data centers and you'll be a bit more secure, which I've been preaching for quite some time.
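
For anyone wondering what blocking a data center looks like in practice, here is a minimal Python sketch. The /24 networks are an assumption made purely for illustration: they are simply the /24s that contain a few of the IPs listed above, not the hosting companies' real netblocks, which are far larger and should be looked up in WHOIS. The names BLOCKED_NETWORKS and is_blocked are hypothetical.

```python
import ipaddress

# Illustrative assumption: the /24 networks containing a few of the IPs
# listed in this post. Real data center netblocks are much larger and
# should be taken from WHOIS data, not from this list.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("70.87.137.0/24"),
    ipaddress.ip_network("74.52.26.0/24"),
    ipaddress.ip_network("66.98.132.0/24"),
    ipaddress.ip_network("69.41.183.0/24"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for candidate in ("70.87.137.2", "66.98.132.73", "192.0.2.10"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

A check like this would normally run before any page is served, or better yet be pushed down to the firewall so the request never reaches the web server at all.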

Now, let's look at a specific site like fashionmenclothingjackets.info and you'll see how they really spam the search engines with 3-digit subdomains. All of their sites are like this and there are literally hundreds of thousands, if not millions, of junk pages associated with this one group of domains.
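
If you want to flag that 3-digit subdomain pattern in your own logs or referrer lists, a rough Python sketch follows. The function name has_three_digit_subdomain is hypothetical, and the test hostname 123.fashionmenclothingjackets.info is an assumed example of the pattern, not a URL taken from the network. It also naively treats the last two labels as the registered domain, which is wrong for suffixes like co.uk.

```python
import re
from urllib.parse import urlsplit

# A label made of exactly three digits, e.g. "123".
THREE_DIGITS = re.compile(r"\d{3}")


def has_three_digit_subdomain(url: str) -> bool:
    """Return True if any subdomain label of the URL is exactly 3 digits."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Naive assumption: the last two labels are the registered domain
    # and TLD, so everything before them counts as a subdomain.
    subdomain_labels = labels[:-2]
    return any(THREE_DIGITS.fullmatch(label) for label in subdomain_labels)


if __name__ == "__main__":
    # Hypothetical URLs used only to exercise the check.
    print(has_three_digit_subdomain("http://123.fashionmenclothingjackets.info/"))
    print(has_three_digit_subdomain("http://www.fashionmenclothingjackets.info/"))
```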

And we'll take a peek at another of these sites, like fiftiesteenagefashion.info, to see how they promote themselves with blog and forum spam for traffic.

There you have it: scraping, search engine spam, and blog and forum spam all tied up in one neat little package.

Enjoy.

P.S. Did we piss on someone's cornflakes?

Getting a ton of hits to this post via a forum on http://www.pginsider.com/, which makes you go Hmmmm... It's amazing how they out themselves once you post something.

5 comments:

Anonymous said...

Thanks Bill. Good work.

I already had all those EV1, The Planet etc netblocks blocked at my firewall, but it is always useful to highlight to folks (and the search engines) the scale of the scraping and web-spamming problem.

It's just the tip of the iceberg, alas.

IncrediBILL said...

This is a prime example of why I send a 'soft error page' instead of a 403 or 404 so I can see where the page content lands and unravel the scrapers.

tmaster said...

I have been blocking EV1 and ThePlanet for some time. The only real traffic that might get blocked is websites trying to pull your RSS feeds. But they can just use FeedBurner.

IncrediBILL said...

You can also make allowances for specific IP addresses or User Agents and just block the rest.

It's really not as complicated as the naysayer (probably one of the scrapers) would like to have us believe.

They love to spread the FUD (fear, uncertainty and doubt) so everyone doesn't lock out their little scraping scripts.
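
As a rough illustration of "make allowances and block the rest", here is a minimal Python sketch. Everything specific in it is an assumption: the addresses come from the reserved documentation ranges, the "FeedBurner" substring is only a guess at how that service identifies itself, and the names ALLOWED_IPS, ALLOWED_UA_SUBSTRINGS and allow_request are hypothetical. Keep in mind that a user agent is trivially spoofed, so an IP allowance is the stronger of the two.

```python
import ipaddress

# Hypothetical allowances: one trusted IP and one trusted user agent
# substring. Both values are placeholders, not real service details.
ALLOWED_IPS = {ipaddress.ip_address("203.0.113.7")}
ALLOWED_UA_SUBSTRINGS = ("FeedBurner",)

# Stand-in for a data center netblock (a reserved documentation range).
DATA_CENTER_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def allow_request(ip: str, user_agent: str) -> bool:
    """Allow listed exceptions first, then block the data center ranges."""
    addr = ipaddress.ip_address(ip)
    if addr in ALLOWED_IPS:
        return True
    if any(token in user_agent for token in ALLOWED_UA_SUBSTRINGS):
        return True
    # Everything else is allowed only if it is outside the blocked ranges.
    return not any(addr in net for net in DATA_CENTER_NETWORKS)


if __name__ == "__main__":
    print(allow_request("203.0.113.7", "Mozilla/5.0"))      # allowed IP
    print(allow_request("203.0.113.50", "FeedBurner/1.0"))  # allowed UA
    print(allow_request("203.0.113.50", "Mozilla/5.0"))     # data center
    print(allow_request("198.51.100.1", "Mozilla/5.0"))     # ordinary visitor
```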

IncrediBILL said...

People that post "get a life" need a life.

And those claiming to be better programmers than I should just step off before I kick your ass. You don't know who or what you're dealing with, but an agency with a TLA might have a clue.

Peace out.