Saturday, September 30, 2006

ShoeMoney's Blog Spam Stopping Primer

The day after my battle cry to Rally the Anti-Spammers, here comes ShoeMoney with some great suggestions for stopping blog spam. Everything ShoeMoney posted is very solid advice, but some spammers have already evolved past some of those patches, which is why I use my draconian anti-spam methods. Basically, ShoeMoney's advice will stop the majority of your garden variety spammers, but not all of them: they are constantly adapting, so as you improve your defenses they improve their ability to bypass them.

Remember, security is built in layers, and the more layers you pile on, the more the spammers will chip away at them. Building the better spamtrap just results in smarter spammers, and they're already here, which I'll address with examples below.

Let's examine ShoeMoney's anti-spam advice, see what some state of the art spammers are already doing, and add a few more tricks here and there for even better security.

Starting with the first item he listed:
5) Deny Access to No Referrer Requests

The approach does work on most spammers but I had about 10 requests today where it would've failed. Not that you shouldn't implement this, it's a good trick to stop a lot of spam, just be aware it won't stop everything.

Example:

My bounced spam log shows the following:

IP: 84.110.248.226
User Agent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Subject: "Viagra"
URL: http://anol.webhosting.gs/viagrageneric.html#viagra
Take a look at what's in my server log:
84.110.248.226 - "POST /formsubmit.html HTTP/1.0" 200 11918 "http://www.mysite.com/formsubmit.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Yup, that's right, a referrer, and I had about 10 of those and they were all from spambots.

Stopping the poorly coded spambots is easy, but they won't be vulnerable for long, as the patch to add the domain name being spammed into the referrer is trivial. I expect this anti-spam advantage to be short-lived, but I use it too and you should still do this.
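
A minimal sketch of the referrer check in Python, assuming the request headers arrive as a plain dict. The names `has_local_referrer` and `MY_HOST` are made up for illustration. Note that the spambot from the log above passes this test, which is exactly why it can only be one layer.

```python
from urllib.parse import urlparse

MY_HOST = "www.mysite.com"  # assumption: the host name your forms live on


def has_local_referrer(headers, host=MY_HOST):
    """Return True only if the Referer header points back at our own host."""
    referrer = headers.get("Referer", "")
    if not referrer or referrer == "-":
        return False  # no referrer at all: the poorly coded bots stop here
    return urlparse(referrer).hostname == host


# A request with no referrer is bounced...
print(has_local_referrer({}))
# ...but a bot that fills in the spammed domain gets through.
print(has_local_referrer({"Referer": "http://www.mysite.com/formsubmit.html"}))
```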

Now, let's tackle the next item, which is VERY good advice:
4) Kill tor anonymous proxies

I block many proxies on my servers, which does stop a lot of spam, but don't think that all spammers use known proxies. This is the reason I also block dedicated server hosting facilities because a series of $2 webhosting accounts can be used to effectively spam and bypass the proxy lists.

Example of 4 sample spams (out of many) today that all had referrers mentioned above and came from some ISP/Host called bezeqint.net:
09/29/2006 84.110.248.226
"Viagra" http://anol.webhosting.gs/viagrageneric.html#viagra

09/29/2006 84.110.244.240
"Viagra" http://gerda.forospace.com/#viagra

09/29/2006 84.110.243.107
"Cialis" http://borea.forospace.com/#cialis

09/29/2006 84.110.241.163
"Cialis" http://kaizer.webhosting.gs/cialisbuy.html#cialis
Use this with caution:
2) Blacklist Repeat Offenders:

First off, blacklist on the FIRST offense so there is no second time. However, you really need to know what you're doing and look up who the IP address belongs to so you aren't blocking IP addresses from places like the AOL IP pool (reused every 15 minutes or so) or any other shared proxy dial-up IP pools. Those IP assignments are very temporary and the next access is probably a different visitor, not a spammer, so be very careful with this.
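
One way to sketch "blacklist on the first offense, but spare the shared pools" in Python. The pool list is an assumption you have to fill in yourself after looking up the owners; the range shown is a reserved documentation range, not a real ISP pool, and `record_spam` is a hypothetical name.

```python
import ipaddress

# Assumption: ranges you have looked up and confirmed to be shared pools.
# 192.0.2.0/24 is a documentation-only range used here as a stand-in.
SHARED_POOLS = [ipaddress.ip_network("192.0.2.0/24")]

blacklist = set()


def record_spam(ip):
    """Blacklist on the first offense unless the IP sits in a shared pool."""
    addr = ipaddress.ip_address(ip)
    if any(addr in pool for pool in SHARED_POOLS):
        return False  # the next visitor on this IP is probably someone else
    blacklist.add(ip)
    return True
```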

This is a gem and we can make it better:
1) Rename your comment file

Excellent advice as I've done that on some websites but don't be shocked when it's short-lived as spammers also have crawlers looking for these comment pages and the fact that you're still linking it under the keyword "comments" is a dead giveaway.

If you're going to change the file name, also change the word that links to the file name to "discussion", "verbal intercourse", or "rants", anything but "comments" to throw them off.

Additionally, move the actual FORM into obfuscated javascript document writes. How this works is the spambot scanning your website can't even find the webform to submit comments as most bots don't use javascript, so only an actual visitor would see an actual webform written into the web page via javascript.
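
A rough sketch of how the obfuscated form could be generated, assuming a Python script builds your pages. `obfuscated_form_script` is a hypothetical helper, and percent-encoding is only light obfuscation: it hides the markup from bots that scan raw HTML, not from anything that runs javascript.

```python
from urllib.parse import quote


def obfuscated_form_script(form_html):
    """Wrap form markup in a script tag that writes it out in the browser."""
    encoded = quote(form_html, safe="")  # percent-encode every special character
    return f'<script>document.write(decodeURIComponent("{encoded}"));</script>'


form = '<form action="/rants.php" method="post"><textarea name="body"></textarea></form>'
print(obfuscated_form_script(form))
```

The page source then contains no literal `<form` tag for a crawler to find; the browser rebuilds it when the script runs.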

Don't forget the CAPTCHA!

Now, the one thing ShoeMoney didn't mention which works wonders is a simple CAPTCHA and it's keeping a few of my sites spam free without ANY other work involved. Yes, there are ways to bypass a captcha but it's not easy for the spammer. So far most captcha protected sites are safe with such simple protection, but I expect that situation to escalate soon.

Kudos to ShoeMoney for spreading the word, we need more anti-spam information spreading and more people jumping on the anti-spam bandwagon so we can rid the 'net of this scourge as soon as possible and move on to more productive activity.

Thursday, September 28, 2006

Virginia Tech's Computer Science: Wiki Spam 101

My website stops spam posts cold, and logs them, so I can glance over the list of bounced spams now and then just to see what was caught, and this one was priceless:

09/28/2006
200.88.223.98
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Subject: "Viagra"
URL: http://research.cs.vt.edu/advance/tiki/
tiki-directory_redirect.php?siteId=3284#viagra

I looked and thought, "Viagra spam linking to VT.EDU? Could their server be hacked like SpamHuntress is posting about?" So I click the link and of course it uses VT.EDU's server to redirect me to some viagra sales site just like the URL would make you think it would, no surprise there.

So I trimmed the URL to see what in the heck this site was and it's ADVANCE, FOR THE ADVANCEMENT OF WOMEN IN ACADEMIC SCIENCE AND ENGINEERING CAREERS and it's full of advances for MEN such as viagra, cialis and levitra spam plus a whole bunch more.

Well, the IT dept. and professors in charge of the VT computer science program should probably start quaking in their boots as I would be VERY UNHAPPY if I was the Dean.

This is completely unacceptable when the IT guys and CS Profs aren't using even rudimentary anti-spam technology like, oh, maybe a simple CAPTCHA to stop this shit.

I want my tuition refunded.

BTW, whoever these spammers are, they've been VERY BUSY little beavers.

Time to Rally the Anti-Spammers

After the demise of Blue Security and this recent meaningless default judgement against SpamHaus, the spammers are getting braver and bolder by the day. Now, one of the most vocal anti-spammers around, SpamHuntress, has recently come under attack after exposing a few people that really didn't want to be exposed.

Even one self-professed blackhat SEO web spammer has the audacity to tell SpamHuntress to "get a life" because she must be cutting into his livelihood. Maybe I'm just too lazy, but who would've ever thought of registering for a bunch of forums and never posting as an SEO tactic? Using his DISY registration spamming script probably sped it up and he's busy making friends [scroll to bottom] as well.

OK, so now the phpBB people will need to be alerted to add NOFOLLOW to all those links in the registration page to stop this SEO vulnerability, but I digress, will rant about that later.

Unlike email spam, which is a real pain in the ass to stop, there is absolutely no reason we have blog, forum or guestbook spam whatsoever except for shitty programmers writing the stuff and people using it that either:

  • have abandoned their websites or forgotten that old guestbook or blog now littered with junk
  • aren't aware there is a problem as many spambots post on older threads
  • don't know there are solutions to these problems
  • aren't capable of installing the patches even if they are aware of the solutions
I've posted before how I stop spam on all my pages that have forms for submitting a variety of things, WITHOUT the use of CAPTCHAs, and although it's a pretty draconian approach to the problem it's also highly effective. My solution was to simply reject any posts with embedded HTML and URLs, just bounce them with an error about the content, and it works 100% against real spam. Maybe it's a tad extreme but when this type of spam is dead maybe I'll open up my sites again to more robust content posts, you never know.
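
A minimal sketch of that kind of bounce filter in Python. The patterns, the function name `bounce_reason` and the error wording are all assumptions for illustration, not the code running on my sites.

```python
import re

# Anything shaped like an HTML tag, e.g. <a href=...> or </b>
HTML_TAG = re.compile(r"<\s*/?\s*[a-zA-Z][^>]*>")
# Anything shaped like a URL
URL = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def bounce_reason(post_text):
    """Return an error message if the post should be bounced, else None."""
    if HTML_TAG.search(post_text):
        return "Sorry, HTML is not allowed in posts."
    if URL.search(post_text):
        return "Sorry, links are not allowed in posts."
    return None
```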

However, for those that like to continue to do things the hard way, here's a list of software you can install to stop the spammers:
I'm going to ask that people reading here help the cause and start educating everyone you run across with a blog or forum being overrun by spam.

Please point them to a resource to solve the problem or offer to help them add the plug-ins pro-bono or for a nominal fee if they don't understand how, or if all else fails alert the host to help sites overflowing with spam and see if they'll be of any assistance.

Don't forget, the purpose of these spammers is to drive direct traffic and also get results in Google so when you stumble upon these sites in Google, make sure you file a Google Spam Report while you're there to get them whacked from the search results.

We can stop this in the next year or two, as long as people quit being complacent and just install the upgrades, patches, captchas and other anti-spam tools.

Spread the word, let's just get this done so we can stop talking about it already!

Wednesday, September 27, 2006

MySpace: Porn Networking Spam Machine?

The other day I signed up for MySpace while researching the members with "Click The Ads" on their pages encouraging others to commit click fraud to fund their various lame causes.

Unfortunately, signing up for MySpace immediately resulted in a couple of porn spams sent to my Inbox which really pissed me off.

So I get some shit that looks like this:

FROM: MySpace Events
SUBJECT: .. has invited you to: I seen you online

Hi ,

.. has invited you to an event on MySpace:

Click the link below to view the event details:
http://events.myspace.com/index.cfm?fuseaction=PORNSPAM

Now below this, there is some bullshit message from MySpace:
At MySpace we care about your privacy. We have sent you this notification to facilitate your use as a member of the MySpace.com service. If you don't want to receive emails like this to your external email account in the future, change your Account Settings to "Do not send me notification emails."
Really, you care so much about my privacy you let goddamn porn spammers send me fucking email?

I'm touched, a tear comes to my eye ...

... yes a tear, because I realize I can't reach out and smack whoever let this shit happen upside the head!

Anyway, here's the website on MySpace linked from the spam:



Here's the first site's link to a girl with a webcam:




And here's the second spam site's girl with a webcam:





I'm wondering if people under 18 get these spams too?

I think I'll just cancel the account because MySpace is no place I want to be associated with.

Monday, September 25, 2006

MySpace: A Click Fraud Social Network?

Maybe it's just Web 2.0, or Web Welfare 2.0, but it appears that stealing from advertisers is now something that is accepted in social networks. Let's look at what we find on sites like MySpace and others which are a good place to build up a nice list of friends to click your ads, especially the Google ads, because we all know that friends click friends' ads, especially if you want your friends banned from AdSense.

Even on YouTube where people can't put up their own ads they beg people to come to their website and click the ads to support them putting up more videos!



The most shocking is Blogger, which is owned by Google, the creator of AdWords and AdSense, which hosts sites that encourage people to "Click the Ads" to defraud the very advertisers they rely on for their massive income.

How difficult would it be to have a single employee out of the entire Googleplex devoted just to keeping click fraud off their own property?

You know the answer, I know the answer, yet a simple search reveals that it's not being done, or not done adequately at any rate or there would be no sites returning results from Blogger on this topic if they were on top of the problem.

The technology for these sites to deploy an automated process to locate pages within their sites that contain calls to "click the ads" or "click Google ads" or any combination and eliminate this fraud on a daily basis is so trivial and rudimentary that beginning programmers could do it.
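
To show how rudimentary it is, here is a sketch in Python, assuming the site can hand over its pages as a mapping of URL to page text. The pattern and the name `flag_pages` are illustrative guesses at what such a sweep might look for.

```python
import re

# Matches "click the ads", "click Google ads", "click on my ad" and similar
CLICK_BEGGING = re.compile(
    r"\bclick\s+(on\s+)?(the\s+|my\s+|our\s+)?(google\s+)?ads?\b",
    re.IGNORECASE,
)


def flag_pages(pages):
    """pages maps URL -> page text; return the URLs that beg for ad clicks."""
    return [url for url, text in pages.items() if CLICK_BEGGING.search(text)]
```

Run something like that daily over new and changed pages and a human only has to review the flagged list.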

Bottom line is there's absolutely no excuse for this type of call for advertiser click fraud to be allowed unchecked on these sites, not in MySpace, YouTube, Blogger, Google, Yahoo, MSN or anywhere else and why Click Fraud 2.0 continues to perpetuate on the web when it's so easy to thwart frankly boggles the mind.

Flickr Member Requests Click Frauding Advertisers for the Children

Well, I've seen all sorts of excuses to advocate click fraud but the plea on Flickr to commit a crime for the children is a new one and more despicable than any I've seen before. Think about the precedent that this sets in impressionable young minds that it's "OK TO STEAL FOR A CAUSE" when crime is never OK. Sadly, all of the good this person has possibly done for these children was wiped away with one call to arms to defraud people for a cause.

If you want to save the children, set up a Paypal account and teach the children that they can be helped by the generosity of others, not by others committing FRAUD!

Here's the screen shot from Flickr:



And the site it lands on in Blogger:



Come on buddy, just ask for donations and keep it legal as we all love the children but this is over the top.

Saturday, September 23, 2006

Search Engine Spammers Extraordinaire

OK, these idiots made the classic mistake of scraping one of MY pages so they're about to get outed in a massive way. Unfortunately, in this case I didn't get an IP address and my content was already missing from their site thanks to the slow crawl and index of MSN, but a little research proved this was a HUGE operation of mind blowing proportions.

I got bored checking all the domains as some are hosted in the same place, some aren't, too many to look at but it's all spam. Perhaps the same person, or perhaps a bunch of idiots running some automatic website generating tools.

The sites tend to come in 3 flavors: AdSense monetized articles, AdSense monetized scraper sites (scroll WAY down) and AdSense + Shareasale sites.

Just search for the phrase "When we had a difficult think about this project" in Google, Yahoo or MSN and you'll see a shitload of pages from these search engine spammers.

Also, try a search for the phrase "Foraging for the best file on" in Google, Yahoo and MSN and see more shitloads of pages.

You can see all sorts of key phrases these sites repeat and bust more and more of them like this "Everyones path is incomparable and everyone" one on Google or Yahoo.

And even more shit like "If you've worked with a portal" on Yahoo.

Someone noticed their terms were hijacked in these bullshit pages and blogged about their suspicion on what's going on.

Seriously though, I bet I could write a script to identify and locate all the bullshit spammers using this data with all their common phrases as it's so easy to spot once you have a data sample like these to analyze.
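
A first cut at such a script might look like this in Python. The phrase list comes straight from the searches above; the scoring function `spam_score` is a hypothetical sketch, and a real version would need a much bigger phrase sample.

```python
# Template phrases seen repeated across the spam sites (lower-cased)
SPAM_PHRASES = [
    "when we had a difficult think about this project",
    "foraging for the best file on",
    "everyones path is incomparable and everyone",
    "if you've worked with a portal",
]


def spam_score(page_text):
    """Count how many known template phrases appear in the page."""
    text = " ".join(page_text.lower().split())  # normalize case and whitespace
    return sum(phrase in text for phrase in SPAM_PHRASES)
```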

Spam, spam, fucking spam, and not so smart fucking spammers.

Whitelist OPT-IN htaccess file

People are always asking me how to build an OPT-IN .htaccess file, which I advocate, as opposed to the traditional blacklist methods.

The problem with OPT-IN is it's VERY unforgiving and you really need to check your visitor stats and make sure you're letting in all the crawlers that are sending you traffic.

Below is a bare bones sample of how it works. Anything not in the list gets a 403 Forbidden error, so you'll probably need to add more items and refine this for your particular website.

Sample .htaccess file for Apache 2.0:

#allow just search engines we like, we're OPT-IN only

#a catch-all for Google
BrowserMatchNoCase Googlebot good_pass
BrowserMatchNoCase Mediapartners-Google good_pass

#a couple for Yahoo
BrowserMatchNoCase Slurp good_pass
BrowserMatchNoCase Yahoo-MMCrawler good_pass

#looks like all MSN starts with MSN or Sand
BrowserMatchNoCase ^msnbot good_pass
BrowserMatchNoCase SandCrawler good_pass

#don't forget ASK/Teoma
BrowserMatchNoCase Teoma good_pass
BrowserMatchNoCase Jeeves good_pass

#allow Firefox, MSIE, Opera etc., will punt Lynx, cell phones and PDAs, don't care
BrowserMatchNoCase ^Mozilla good_pass
BrowserMatchNoCase ^Opera good_pass

#Let just the good guys in, punt everyone else to the curb
#which includes blank user agents as well

<Limit GET POST PUT HEAD>
order deny,allow
deny from all
allow from env=good_pass
</Limit>

Just save the above as a file named ".htaccess" in your httpdocs or root web folder in your hosting account and all the crazy bots abusing your site will get bounced from now on.

Remember, anything not listed will no longer have access so be careful and make sure everything your site needs allowed is in the list.
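
Before going live, it can help to replay the user agents from your visitor stats against the same list. The Python sketch below mirrors the patterns from the sample file; it assumes Python's regex matching behaves like Apache's for patterns this simple, and `would_pass` is a made-up name.

```python
import re

# Same patterns as the BrowserMatchNoCase lines in the sample .htaccess
GOOD_PASS = [
    "Googlebot", "Mediapartners-Google", "Slurp", "Yahoo-MMCrawler",
    "^msnbot", "SandCrawler", "Teoma", "Jeeves", "^Mozilla", "^Opera",
]
PATTERNS = [re.compile(p, re.IGNORECASE) for p in GOOD_PASS]


def would_pass(user_agent):
    """True if the whitelist would let this user agent in, False for a 403."""
    return any(p.search(user_agent) for p in PATTERNS)


for agent in ["Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "Java/1.4.1_04", ""]:
    print(repr(agent), "allowed" if would_pass(agent) else "403")
```

Keep in mind that anything claiming to be Mozilla gets in, so this list only stops bots that are honest or sloppy about their user agent.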

Enjoy.

Googlebot Validation

Google has finally completed a DNS project that will allow us to use a simple reverse and forward DNS check to verify it's really, truly, honestly Googlebot and not a cheap imitation, or Google crawling thru a proxy, or anything else you can imagine.
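
The reverse and forward check can be sketched in Python like this. The `.googlebot.com` suffix is an assumption here, so confirm it against Google's official post linked below; the resolver arguments are only there so the function can be exercised without touching DNS.

```python
import socket


def is_real_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    try:
        host = reverse(ip)[0].rstrip(".").lower()
        if not host.endswith(".googlebot.com"):  # assumption: Google's crawl domain
            return False
        # The name must resolve back to the very same IP, or it's a fake PTR record
        return ip in forward(host)[2]
    except OSError:
        return False  # lookup failed: treat it as unverified
```

The forward lookup matters because anyone who controls their own reverse DNS can make an IP claim to be a googlebot.com host.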

I'm so sick of explaining why you might need this and what it solves you'll just have to follow a few links and read the threads at these various places.

Here's the official How To Verify Googlebot post on Google's blog.

Then you can check out what's been said about How To Verify Googlebot on Matt's blog.

Then a couple of threads on WMW about Verifying Googlebot that should answer any other questions on this topic.

Thanks again to Matt for getting this project finished!

Tuesday, September 19, 2006

How Important Are Plurals

Many people ignore plurals when they optimize their website and miss a lot of opportunity for additional search engine traffic.

Here are a few trend examples:

Take a look at plumbing, plumber and plumbers and you'll note that the plural is just as often the search term as the singular plumber.

How about teaching, teacher and teachers where all 3 run very close and teachers appears to dominate the search trend by a thin margin.

Last but not least, something closer to home with blogger and bloggers, where blogger clearly stands out as the dominant term but bloggers is statistically significant enough to merit ranking for the plural.

So don't forget to rank for your keyword plurals or someone else will rank there instead of you and they will KICK YOUR S!

Request from India

Just when I thought it was going to be a boring day I got a link-exchange spam from one of those wonderful Indian SEO's that wouldn't know how to promote a website to save his own life.

I'm actually shocked this email didn't include the usual threat that "you have 24 hours to confirm a reciprocal link before we remove yours from our site".

Boy, doesn't this shit look familiar:

Dear Webmaster
Greetings from India

Happened to visit your Webpages : [FILL IN BLANK OF SPAM RECIPIENT HERE] & liked it very much.
Would like to request you to have a look on our site :: [FILL IN BLANK OF SITE BEING SPAMMED HERE]

Hope you'll like this site. We are trying our best to spam the shit out of everyone in the name of India, You can help us by just adding our link on your wonderful website. And these exchanging link with good quality websites is beneficial for both the site to get a good ranking in search engines & that will help both of us in driving Traffic.

So We request you to add our link at your Website

Here is the Link Information of our Sites ::

URL : [LINK TO OFF TOPIC SHIT GOES HERE]
Link Text : We Spamma U Ass
Desc. : That's Right, This is Spam, its no more a dream!!

Just do let us know if this acceptable for you.
Hope to have quick & positive response.
Thanks in Advance

Best Regards
Sendjay Sumspam
Spamming-Our-Ass-Off.com

BTW, if you're the Indian fuckhead sending this shit, FUCK NO I WON'T LINK TO YOUR SITE you goddamn moron.

Just a lovely way to start the day.

Say it with spam.

Thursday, September 07, 2006

Counting Scrapers on your Abacus

Had a couple of persistent little fuckers hosting with Abacus that just keep trying and trying to download a boatload of pages that I've been monitoring for months now.

The specific IPs of these boxes are:

206.225.82.155 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"

206.225.91.164 "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)"

206.225.83.179 "Evaal/0.7.2 (Evaal search engine; http://evaal.coml; bot@evaal.com)"

216.55.161.38 "Java/1.4.1_04"

216.55.142.118 "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"

216.55.162.3 "PEAR HTTP_Request class ( http://pear.php.net/ )"

216.55.147.80 "sna-0.0.1 mikeelliott@hotmail.com"
Toss in a couple of proxies:
206.225.85.127
206.225.86.86
And some other miscellaneous bullshit not worth mentioning.

Here's what to block:
OrgName: Abacus America Inc.
OrgID: ABAC
NetRange: 206.225.80.0 - 206.225.95.255

OrgName: Abacus America Inc.
OrgID: ABAC
NetRange: 216.55.128.0 - 216.55.191.255
Now you've been COMPLETELY BLOCKED so count THAT on your Abacus!
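
Most firewalls and config files want CIDR notation rather than a NetRange. A small Python sketch using the standard `ipaddress` module does the conversion; `netrange_to_cidrs` is a hypothetical helper name.

```python
import ipaddress


def netrange_to_cidrs(first, last):
    """Convert a whois NetRange (first and last address) into CIDR blocks."""
    nets = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in nets]


print(netrange_to_cidrs("206.225.80.0", "206.225.95.255"))  # ['206.225.80.0/20']
print(netrange_to_cidrs("216.55.128.0", "216.55.191.255"))  # ['216.55.128.0/18']
```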

More Evolving Scrapers

Like I've been reporting, they're all going stealth.

I keep seeing user agent change from this:

62.163.33.234 "Java/1.4.1_04"
To this:
62.163.33.234 "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
Soon the usual blocking methods won't work whatsoever.

Wake up and smell the COPY before it's too late!
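
One way to catch this kind of switch is to remember every user agent each IP has used. A minimal Python sketch, assuming your log has already been parsed into (ip, user agent) pairs; `ips_with_changing_agents` is a made-up name.

```python
from collections import defaultdict


def ips_with_changing_agents(hits):
    """hits is an iterable of (ip, user_agent); return IPs seen with more than one agent."""
    seen = defaultdict(set)
    for ip, agent in hits:
        seen[ip].add(agent)
    return {ip: sorted(agents) for ip, agents in seen.items() if len(agents) > 1}


hits = [
    ("62.163.33.234", "Java/1.4.1_04"),
    ("62.163.33.234", "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"),
]
print(ips_with_changing_agents(hits))
```

Shared proxies will also show several agents per IP, so treat a hit as a reason to look closer, not as proof.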

Block the Bots Tonight

Time for a little lunacy break for people feeling blue battling the bad bots.

Sing along boys and girls...

Sung to the tune of "Rock Around the Clock"
with apologies to Bill Haley and the Comets.

One, two, three bots, four bots, blocked.
Five, six, seven bots, eight bots, blocked,
Nine, ten, eleven bots, twelve bots, blocked,
We're gonna block all the bots tonight.

Put your firewall on and lock em out,
We'll have some fun when they scream and shout,
We're gonna block all the bots tonight,
We're gonna block, block, block, their scraping blight.
We're gonna block, gonna block, all the bots tonight.

When the block strikes two, three and four,
If the scrapers slow down we'll yell for more,
We're gonna block all the bots tonight,
We're gonna block, block, block, their scraping blight.
We're gonna block, gonna block, all the bots tonight.

When the server dings five, six and seven,
We'll be right in bot blocker heaven.
We're gonna block all the bots tonight,
We're gonna block, block, block, their scraping blight.
We're gonna block, gonna block, all the bots tonight.

When it's eight, nine, ten, eleven too,
I'll be blocking bots and so will you.
We're gonna block all the bots tonight,
We're gonna block, block, block, their scraping blight.
We're gonna block, gonna block, all the bots tonight.

When the counts hit twelve, we'll laugh and yell,
As a dozen bad bots have just went to hell!
We're gonna block all the bots tonight,
We're gonna block, block, block, their scraping blight.
We're gonna block, gonna block, all the bots tonight.

University of Toronto Goes Bat Shit for VPI

Something coming from the University of Toronto keeps making periodic pitstops at my server and only requests _vpi.xml like I give a shit about this file.

142.150.4.114 [kahuna.erin.utoronto.ca.] "Firebat 2.5.22" "/_vpi.xml"
Looks like a bunch of bullshit to me as I tried to weed through the ramblings about Jabber groupchat protocol, since I've never had anything remotely related on my server, which brings up the million dollar question: why is this little fucker looking for it?

Dunno what the motives are but they didn't get far, back to class asshole.

Tuesday, September 05, 2006

Firefox Memory Leaks

Leaving Firefox 1.5 up and running too long without ever closing it for days always seems to eventually cause issues like the swap drive running non-stop or something.

Anyway, I decided to keep the Windows Task Manager up and running all the time so I can monitor Firefox performance and it appears there are some serious memory leaks and issues with closed or stopped downloads that may not be stopping the thread reading the data in the background.

A couple of easily reproduced problems involve stopping the download of a very large page, we're talking thousands and thousands of lines of text: it appears to keep loading into memory even after it's no longer visible, pushing the memory footprint up to 200MB+ with only a couple of tabs open.

Sure hope they do some better testing on the 2.0 code, as I may switch back to IE 7 if it's substantially better. One thing Microsoft does know how to do is keep their code from leaking memory and leaving zombie threads running in the background.

Sunday, September 03, 2006

Scrapers4U.de

Today I noticed another hit from this same server farm in Germany with something pretending to be a Windows browser:

62.75.218.82 [elbe016.server4you.de.] requested 16 pages as "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x4.90)"
So I checked my archives and sure enough it's been here a time or two before attempting to get inside, and there were some hits from other associated IPs in their range.

Who hosts this mess appears to be intergenia.de:
netnum: 62.75.128.0 - 62.75.255.255
org: ORG-iGCK1-RIPE
netname: DE-INTERGENIA-20010727
descr: intergenia AG
Which also owns plusserver.de, server4you.de, server4you.com, netfabrik.de, and some end user services whose IPs may be a part of intergenia.de's range, no clue.

plusserver.de, server4you.de and netfabrik.de all appear to use this range:
inetnum: 217.172.167.0 - 217.172.169.255
netname: PLUSSERVER-1
descr: PlusServer - Dedicated Premium Serverhosting
descr: http://www.plusserver.de
The server4you.com seems to have this block:
OrgName: Server4You Inc.
NetRange: 69.64.32.0 - 69.64.63.255
Comment: http://www.server4you.com
Which means the crawler that started this search still can't be pinned down to a specific hosting block for server4you, other than that the reverse DNS claims it's server4you.de. I poked around doing a few nslookups in that range and they return either static-ip-62-75-*-*.inaddr.intergenia.de or someserver.server4you.de, so I'm a little hesitant just to block the whole intergenia.de range.

So it looks like I'll block the obvious hosting ranges by IP and server4you.de by reverse DNS for now.
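
That two-part rule could be sketched in Python as follows. The ranges are the hosting blocks listed above; the function name `is_blocked` and the idea of passing in an already-resolved reverse DNS name are assumptions for illustration.

```python
import ipaddress

# Hosting ranges from the whois output above (first address, last address)
BLOCKED_RANGES = [
    ("217.172.167.0", "217.172.169.255"),  # PLUSSERVER-1
    ("69.64.32.0", "69.64.63.255"),        # Server4You Inc.
]
BLOCKED_NETS = [
    net
    for first, last in BLOCKED_RANGES
    for net in ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
]
BLOCKED_RDNS_SUFFIXES = (".server4you.de",)


def is_blocked(ip, rdns_name=""):
    """Block by hosting range first, then by reverse DNS suffix."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in BLOCKED_NETS):
        return True
    return rdns_name.rstrip(".").lower().endswith(BLOCKED_RDNS_SUFFIXES)
```

With this, 62.75.218.82 is bounced because of its server4you.de name, while an intergenia.de end user in the same 62.75.x.x range still gets in.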

Bots from ServerDeli at Mediopia

Something came crawling from ServerDeli hosted at Mediopia, and it was the typical bot with an invalid user agent (notice the space between "compatible" and ";") that never asks for robots.txt, just pages.

Here's the crawler info:

209.125.47.35 [win1.serverdeli.com.] requested 26 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
Sorry, but my site isn't a deli snack for whatever bullshit you're running.
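
That stray space is easy to test for. A tiny Python sketch, with `looks_forged` as a hypothetical name; it only catches this one tell, so it's a supplement to other checks, not a replacement.

```python
import re

# Real MSIE user agents use "(compatible;" with no space before the semicolon
SPACE_BEFORE_SEMICOLON = re.compile(r"\(compatible\s+;")


def looks_forged(user_agent):
    """True if the user agent has the telltale space before the semicolon."""
    return bool(SPACE_BEFORE_SEMICOLON.search(user_agent))
```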

It always gives me pause when I see a webhosting company using HOTMAIL addresses for their contact information:
OrgName: MEDIOPIA TECHNOLOGIES (IMA'D W/ 69998
OrgID: MTIW6
Address: 9507 34TH AVE
City: JACKSON HEIGHTS
StateProv: NY
PostalCode: 11372
Country: US

NetRange: 209.125.47.0 - 209.125.47.255
CIDR: 209.125.47.0/24
NetName: ATWORK-65024-55156
NetHandle: NET-209-125-47-0-1
Parent: NET-209-125-0-0-1
NetType: Reassigned
Comment:
RegDate: 2005-05-02
Updated: 2005-05-02

OrgTechHandle: ACH48-ARIN
OrgTechName: CHICO, ALFREDO
OrgTechPhone: +1-718-476-0313
OrgTechEmail: MYMEDIOPIA@hotmail.com
So it looks like blocking 209.125.47.* wouldn't hurt anything.

Core-Project Hijacks an IP

Saw these idiots again today looking for FrontPage on my server:

207.226.161.69 - "POST /_vti_bin/_vti_aut/author.dll HTTP/1.1" 404 1176 "-" "core-project/1.0"
207.226.161.69 - "HEAD / HTTP/1.0" 200 - "-" "-"
207.226.161.69 - "POST /_vti_bin/_vti_aut/author.dll HTTP/1.1" 404 1176 "-" "core-project/1.0"
The IP appears to be dedicated to a single customer hosted on Rackco.com:
cigar-review.com
cigarreview.com
Sadly, Rackco has shared and dedicated hosting so I was unable to easily pin down if this was a compromised server or some little script monkey running in a different account on a shared server.

I guess the only thing I'm amused with is how would some random script in shared hosting, if that is indeed the case, crawl out using a different IP than the server default.

Traceroute has a few clues:
ge6-14.colo02.ash01.pccwbtn.net (206.223.115.48)
ge13-1.br01.ash01.pccwbtn.net (63.218.44.125)
209-8-237-222.rackco.net (209.8.161.222)
mike.rackco.com (209.8.238.194)
cigar-review.com (207.226.161.69)
Still nothing pointing out more than one IP to block.

Ah well, either way, can't seem to narrow down the IP range assigned to Rackco because rwhois.cais.net isn't responding and ARIN just shows the major block assigned to PCCW formerly "Beyond The Network America, Inc.".

We'll keep an eye on this one.

Monday, August 28, 2006

Google Utilized in Phishing Exploits

Maybe the title is a little bit of link bait but it's also accurate as I received a WellsFargo phishing email today with a redirect link through Google.

Some of you may remember how I've complained a time or two about being abused via various Google proxy servers and sure enough they have something else that's vulnerable to being used by abusers.

The link to the phishing site used Google to redirect victims:

http://www.google.com/url?sa=t&ct=res&cd=7
&url=http%3Awebtracpro.valleyvistamortgage.com/wellsfargo/Update.html
How's that for Google's war on phishing?

Yes, I know that's a cheap shot but they really need to fix some vulnerabilities over there and maybe after enough cheap shots someone will pay attention, who knows.
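
If you want to see where a redirect link like that really goes without clicking it, the destination can be pulled out of the `url` parameter. A minimal Python sketch; `redirect_target` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlparse


def redirect_target(link):
    """Return the decoded value of the url= parameter, or None if absent."""
    query = parse_qs(urlparse(link).query)
    return query.get("url", [None])[0]


link = ("http://www.google.com/url?sa=t&ct=res&cd=7"
        "&url=http%3Awebtracpro.valleyvistamortgage.com/wellsfargo/Update.html")
print(redirect_target(link))
```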

Onward with our phishing expedition!

Here's a screenshot of the email sent by the Wells Fargo "Safehaebor Department" which is amusing that they didn't even bother spell checking their phish but most people are illiterate and wouldn't notice such details.



Here's a screenshot of the actual "Update Sistem" (typo in the title) phishing page itself on the compromised server:



And the form sends the data to some place in The Czech Republic:
http://mailform.cz/
The only amazing part is that I notified the people with the compromised server a couple of hours ago and the phish site is still live as I write this, supposedly after their IT dept. was going to handle it ASAP.

So there you have it, another exciting episode of Gone Phishing.

Until next time...

Thursday, August 24, 2006

Inhoster Spammer Hits My Unprotected Contact Form

To allow visitors to let me know that my bot blocker MIGHT be making a mistake, which has happened now and then as it evolved, I had to leave one email contact form unprotected and wide open to potential bot abuse.

This was never a problem for a long time, and suddenly some jerk hosted on Inhoster started fucking with me, which has actually been quite interesting.

85.255.117.253 [85.255.117.253-xbox.dedi.inhoster.com]
"User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Of course my page requires a POST method and isn't abused by the simple GETs, and for my own reasons I didn't think a CAPTCHA was appropriate on this page as I wanted feedback without making it too hard for people.

I was breaking my own anti-spam rules on this page just because I didn't want to reject any legit posts by accident as I was trying to collect all the information I could, but now I'm implementing a few of the filters.

The first thing I did after the spambot started messing with the form was to simply start rejecting all posts with specific HTML tags. To further filter the spam, I'm rejecting any post that is nothing more than a pile of links as they were dumping a bunch of links per post, but still allowing people to send me a link or two as long as it falls within my framework of what legit content looks like.

This seems to be bouncing them at the moment and I'm not sure what the purpose would be for them to continue to spam my form if I don't allow them to dump links, but we'll see what happens.

One added benefit discovered when I was testing was it even bounced a couple of those spammy "link request" emails because they have too many links in them.

Sweet.
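For anyone who wants to try the same kind of filter, here's a rough sketch of the two checks described above. The tag list and the two-link limit are my own placeholder values, not the actual rules running on the form, so tune them to what legit feedback looks like on your own site.

```python
import re

# Hypothetical settings: the post only says specific tags are rejected
# and that "a link or two" is still allowed.
MAX_LINKS = 2
BLOCKED_TAGS = ("a", "script", "iframe", "img")

TAG_RE = re.compile(r"<\s*/?\s*(%s)\b" % "|".join(BLOCKED_TAGS), re.IGNORECASE)
LINK_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def reject_reason(message):
    """Return a short reason if the post should be bounced, else None."""
    if TAG_RE.search(message):
        return "html tag"
    links = LINK_RE.findall(message)
    if len(links) > MAX_LINKS:
        return "too many links"
    # A post that is nothing but links carries no real feedback.
    if links and not LINK_RE.sub("", message).strip():
        return "links only"
    return None
```

A message with a sentence and one URL passes, while a bare URL or a pile of them gets bounced with a reason you can write to the spam log.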

Try the javascript trick...

A really cute trick to play on spammers is to make the form submit activate javascript that includes additional data fields that wouldn't be submitted unless they run the javascript as another way to verify human vs. bot without using a CAPTCHA.

The only drawback to this trick, which is inconsequential IMO, is that the Google and Yahoo translation proxies bust this all to hell as they replace all of your links with links back to their translation proxy, which of course doesn't send the data through the proxy properly.
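Here's one possible way to do the server side of that trick, as a sketch only. It assumes the page's script copies a signed token into a hidden field named js_token at submit time; the field name, the secret and the one-hour lifetime are all made up for illustration. A bot that never runs the script never sends the field, although a determined spammer could still dig the token out of the page source.

```python
import hashlib
import hmac
import time

# Hypothetical secret and lifetime; neither comes from the post.
SECRET = b"change-me"
MAX_AGE_SECONDS = 3600


def make_token(now=None):
    """Token the page's script would copy into a hidden field on submit."""
    issued = str(int(time.time() if now is None else now))
    sig = hmac.new(SECRET, issued.encode(), hashlib.sha256).hexdigest()
    return issued + ":" + sig


def js_field_ok(form, now=None):
    """True only if the script-added field is present, authentic and fresh."""
    token = form.get("js_token", "")
    issued, _, sig = token.partition(":")
    if not issued.isdigit() or not sig:
        return False
    expected = hmac.new(SECRET, issued.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(issued) <= MAX_AGE_SECONDS
```

Posts where js_field_ok() returns False can be bounced or just logged, depending on how many translation-proxy visitors you're willing to lose.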

SCRAPER BUSTED #3 - UPDATE Cloaker Surfaces on Netfirms

The same cloaking bullshit artist I wrote about before has surfaced on a Netfirms server.

Details:

IP Address: 80.77.80.103
User Agent: "" [blank]
Where scraping content and redirect appear:
rbmusicartist.netfirms.com/artistic-family-portrait.html
Which redirects to some Ukrainian or Russian bullshit artist's site:
Domain Name: DEVAMATRI.COM

Registrant:
Oleg Povaljaev
Oleg Povaljaev (anandasat@narod.ru)
Tereshkovoj
Odessa
null,65072
UA
Tel. +380.482648166
Guess what?

They host it on ThePlanet.com, you could knock me over with a feather, I'm so surprised.

DEVAMATRI.COM (70.87.136.118)
OrgName: ThePlanet.com Internet Services, Inc.
OrgID: TPCM
Address: 1333 North Stemmons Freeway
Address: Suite 110
City: Dallas
StateProv: TX
PostalCode: 75207
Country: US
NetRange: 70.84.0.0 - 70.87.255.255
Guess we should drop Netfirms in our blocked list too just to be safe:

rbmusicartist.netfirms.com (64.34.66.18)
Netfirms Inc PEER1-NETFIRMS-02 (NET-64-34-66-0-1)
64.34.66.0 - 64.34.66.255
Well, it's not much, but a little blocking each day will keep the scrapers away.
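If you keep your block list as first/last address pairs straight out of whois, a few lines of Python can turn them into networks and test a visitor against them. This is just a sketch using the two ranges quoted above; the function names are mine, and how you wire it into your server is up to you.

```python
import ipaddress

# Ranges copied from the whois output quoted above.
BLOCKED_RANGES = [
    ("70.84.0.0", "70.87.255.255"),   # ThePlanet.com
    ("64.34.66.0", "64.34.66.255"),   # Netfirms
]


def to_networks(ranges):
    """Convert first/last address pairs into CIDR networks."""
    nets = []
    for first, last in ranges:
        nets.extend(ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)))
    return nets


BLOCKED_NETS = to_networks(BLOCKED_RANGES)


def is_blocked(ip):
    """True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETS)
```

With these two ranges, BLOCKED_NETS works out to 70.84.0.0/14 and 64.34.66.0/24, so both the cloaker's host and the redirect target get caught.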

Now, here comes the real fun...

I was curious what else was on the server with DEVAMATRI.COM (70.87.136.118) and found a shitload of cloaking spam sites:
derrdek1234.info
devamatri.com
fred00med.info
fredodermok2.info
goramon.com
greddertrniko.info
koljazzza.info
nikkasder4ee.info
nikkrongz.info
niko0lwerty.info
nikolannsw12.info
nikolansedd.info
nikolas1qqq4.info
nikolas1qwe.info
nikolazqwii.info
nikolfdsaz.info
ringvvv.info
vvvorgs.org
vwwvcom.info
wvvver54.info
xkoljazzzao.info
Note: The sites are indexed in both Yahoo and MSN but they aren't in Google.

Probably not the last of the sites from this slimeball, most likely the tip of the iceberg, but it's definitely a start to unearthing his network of crap.

SCRAPER BUSTED #11 - Inhoster Scraper Indexed by Yahoo

A couple of weeks back I posted about blocking Inhoster, which was oozing with spambots with one scraper in their midst, and that scraper has finally surfaced.

The scraper's ID is:

IP Address: 85.255.116.178
User Agent: Snoopy v1.2
Which showed up on a page buried on this domain:
index-se.com (85.255.116.182)
What a concept, 2 IPs in Inhoster for one scraper.

Now let's dig for some dirt!

A reverse-IP lookup reveals the scraping IP address 85.255.116.178 is also the IP for FINDALLBEST.COM which looks just like index-se.com.

85.255.116.178: FINDALLBEST.COM
Domain Name: FINDALLBEST.COM
Registrant:
N/A
Nekto (nekto@utopia.com)
Jamaica 17
Cuba
null,12476
CU
Tel. +543.56576767
The info for index-se.com claims to be from the US:
Domain Name: INDEX-SE.COM
Registrant:
Index SE
Index SE (admin@index-se.com)
67 Mt. Auburn St.
Cambridge
,02138
US
Tel. +617.4959659
85.255.116.182: SEARCHADULTSEX.COM:
Domain Name: SEARCHADULTSEX.COM
Registrant:
N/A
Nekto (nekto@utopia.com)
Jamaica 17
Cuba
null,12476
CU
Tel. +543.56576767
So I got curious what else was between 85.255.116.178 - 182 and it was all the same crap:

85.255.116.179: right-pharmacy.com

Different registrant but domain redirects to buy-soma-online.findallbest.com, there's a shock:
Registrant:
N/A
Alexei Aniskevich (alex@coolsearch.biz)
Sopruse pst 15
Tallinn
Harjumsa,50707
EE
Tel. +372.715713
85.255.116.180: wagemax.com

This one is just a Plesk domain placeholder page at this time and another registrant.
Domain Name: WAGEMAX.COM
Registrant:
N/A
Alexei Aniskevich (alex@coolsearch.biz)
Sopruse pst 15
Tallinn
Harjumsa,50707
EE
Tel. +372.715713
85.255.116.180: search-paga.com

Yes, same registrant and site looks like all the rest of the crap.
Domain Name: SEARCH-PAGA.COM
Registrant:
N/A
Alexei Aniskevich (alex@coolsearch.biz)
Sopruse pst 15
Tallinn
Harjumsa,50707
EE
Tel. +372.715713
85.255.116.181: coolsearch.biz

Pay dirt! We found the domain linked to the other domains on 85.255.116.180
Domain Name: COOLSEARCH.BIZ
Domain ID: D6614592-BIZ
Sponsoring Registrar: ESTDOMAINS INC
Sponsoring Registrar IANA ID: 832
Domain Status: ok
Registrant ID: DI_2271261
Registrant Name: Alexei Aniskevich
Registrant Organization: N/A
Registrant Address1: Moisavahe 64-1
Registrant City: Tartu
Registrant State/Province: Tartumsa
Registrant Postal Code: 50707
Registrant Country: Estonia
Registrant Country Code: EE
Registrant Phone Number: +372.715713
Registrant Email: alex@coolsearch.biz
When you go to coolsearch.biz it automatically takes you to: www.gigasearch.biz
Domain Name: GIGASEARCH.BIZ
Domain ID: D7182275-BIZ
Sponsoring Registrar: ESTDOMAINS INC
Sponsoring Registrar IANA ID: 832
Domain Status: clientTransferProhibited
Registrant ID: DI_2191316
Registrant Name: Alexei Aniskevich
Registrant Organization: N/A
Registrant Address1: Sopruse pst 15
Registrant City: Tallinn
Registrant State/Province: Harjumsa
Registrant Postal Code: 50707
Registrant Country: Estonia
Registrant Country Code: EE
Registrant Phone Number: +372.715713
Registrant Email: alex@coolsearch.biz
85.255.116.181: your-searcher.com
Domain Name: YOUR-SEARCHER.COM

Registrant:
N/A
Alexei Aniskevich (alex@coolsearch.biz)
Sopruse pst 15
Tallinn
Harjumsa,50707
EE
Tel. +372.715713
Let us continue with more of this puzzle...

Let's explore gigasearch.biz a bit more:

69.50.163.9: gigasearch.biz

We did find some similar scraping in this range:
69.50.190.242 "Snoopy v1.2"
Actually, the range 69.50.*.* has a ton of scraping, so seeing a link to this scraper and the Snoopy user agent yet again was no surprise.

GigaSearch.biz is hosted on our old friends Intercage, which hosted Scraper #4 and Scraper #6. I think they may all be the same scraper, as everything keeps linking them together from host to host: similar IP ranges and the same user agent. Nothing concrete, but the circumstantial evidence that they're somehow related is overwhelming.

Most amusing is all the links on gigasearch.biz redirect to find.fm, and this relationship could be interesting but I'm getting sick of chasing this scraper / spammer at this point.

The host of our busted scraping pals #4, #6 and #11:
OrgName: InterCage, Inc.
OrgID: INTER-359
Address: 1955 Monument Blvd.
Address: #236
City: Concord
StateProv: CA
PostalCode: 94520
Country: US
NetRange: 69.50.160.0 - 69.50.191.255
Let's see what else is on the Gigasearch.biz server:

69.50.163.9: blanksearch.biz

This domain is NSFW with raw porn all over it.
Domain Name: BLANKSEARCH.BIZ
Domain ID: D6761115-BIZ
Sponsoring Registrar: ESTDOMAINS INC
Sponsoring Registrar IANA ID: 832
Domain Status: ok
Registrant ID: DI_3009123
Registrant Name: Ivars Kaupers
Registrant Organization: No
Registrant Address1: Skirgailos 15
Registrant City: Kaunas
Registrant Postal Code: 75128
Registrant Country: Lithuania
Registrant Country Code: LT
Registrant Phone Number: +370.571689
Registrant Email: ivars@blanksearch.biz

69.50.163.9: tgp-porno.net

This site brings up another of the same old porn links again.
Domain Name: TGP-PORNO.NET
Registrant:
N/A
Alexei Aniskevich (alex@coolsearch.biz)
Moisavahe 64-1
Tartu
Tartumsa,50707
EE
Tel. +372.715713
Last but not least, the server with find.fm hosts a few other garbage domains with the same links about pills and porn on them all, with "find.fm" on the bottom of the page which was a big shocker as well:

Domains on 64.111.196.119 (Find.fm)

adultwebfind.com
carwebsearch.com
cashwebsearch.com
dmns4sale.com
gamblingwebsearch.com
pharmacywebsearch.com
travelwebsearch.com
your-needs.info

Well, that's all for now.

Needless to say, they can't hide for long as they leave a slimy trail that can be followed.

Scrape me again assholes, let's unravel the rest of your bullshit sites.

Wednesday, August 23, 2006

Slow Blog Week

Sorry if you aren't getting your daily dose of bad bots and the usual run-on ranting sentences packed full of expletives but I've been busy the last few days catching up on some accounting and doing some work on software and websites.

If you really need your 'fix' you can catch up on the latest of the Nutchies that are still harassing the shit out of me from a link from Doug Cutting's site.

As you can tell from the final comment posted, and a few previous comments, I'm losing patience with this bunch of crawling-the-web-is-our-right cultists.

Saturday, August 19, 2006

Pyrex Detonates Dinner

Tonight we had to go out to dinner because the Pyrex dish in the microwave detonated like a small bomb. It did NOT just fall apart like they claim, I heard the noise 3 rooms away when the dish went >BANG!< in the microwave and rocked it.

OK, I'll grant you that this occurrence was possibly a rapid temperature change caused by cooking in the microwave.

However, a couple of years ago we had a Pyrex 11"x17" 6-month old baking dish sitting in a cabinet that hadn't been used in many days and it just exploded with such force that the door to the cabinet was blown open, shards of glass flew across the room and there was glass even on the shelf ABOVE where this dish was sitting.

The fine people of World Kitchen (scroll down to Pyrex Responds) would like to have you believe that these dishes don't explode, so perhaps they would like to explain why I had to empty a shelf above the dish that didn't explode to clean up the shards of glass, or how glass landed in the carpet almost 8' away, or sprayed across our kitchen.

For my part, I'm done with Pyrex as 2 exploding dishes in a couple of years is just too dangerous to deal with.

We'll be replacing them with metal for baking or some alternative for the microwave in every case possible.

In the meantime, we're immediately moving all of our Pyrex to lower shelves by the floor as a safety precaution as I'd rather have glass in my foot instead of glass in my face from this volatile cookware.

[Update]

After doing some research we found this tidbit:

"Pyrex kitchen products produced by World Kitchen are no longer made from borosilicate glass, and their packaging indicates that they must never be used over a flame, on stove tops, under a broiler, or in a toaster oven. Hence the exploding glassware. Pre-1998 Pyrex tends to crack into large pieces rather than shattering. Keep your pre-1998 Pyrex and treat it nicely so it will last. Anything you buy now may blow up in your face! "
The 11x17 baking dish was post-1998 but obviously because of its size wouldn't fit in the toaster oven or even the microwave, and we NEVER use these for stove top cooking.

The smaller dish that cracked up in the microwave last night we've had since about '91 so it was the old-style Pyrex and did crack into larger pieces, but did so quite loudly.

I'm just glad it broke in the closed microwave and not when my wife opened the door or was removing it from the microwave, that would've been a bad situation.

No more Pyrex for us, it's gone.

SCRAPER BUSTED #10 - Fashion Designs Made for AdSense

We have some fuckers scraping from, come on say it - your favorite source of scraping, Everyones Internet or ev1.net for those of you new to this shit.

Some real dumb fuckers too that put "User-Agent:" in front of the user agent just in case we weren't smart enough to know the user agent was a user agent which was reason enough to block all the assholes that do that shit in the first place.

Here's the stealth crawler:

66.98.132.68 "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3"
Here's a snippet from MSN:
Spring Fashion 2003 Check Out The Backpacker Watercolor Palette Or The ...
...by political and economic decentralisation, especially in countries with mixed and command economies. spring fashion 2003 Your IP Address: 66.98.132.68 User Agent:
devoll.roswellspringcatalog.info/spring-fashion-2003.html 8/18/2006

Stupid fuckers, did you not think we would catch your made for adsense scraper bullshit?
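Catching that particular mistake takes almost no code. Here's a minimal sketch; the function name and the reason strings are just for illustration, and it only covers the two giveaways mentioned on this blog: a blank user agent and the header name pasted into the header value.

```python
def suspicious_user_agent(user_agent):
    """Return a reason if the user agent value looks bot-made, else None."""
    ua = user_agent.strip()
    if ua in ("", "-"):
        return "blank"
    # The header name pasted into the header value, as in the log line above.
    if ua.lower().startswith("user-agent:"):
        return "header name in value"
    return None
```

Run against the log line above it returns "header name in value", while the same Firefox string without the prefix comes back clean.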

Friday, August 18, 2006

Stealth Crawler from HP

Something stealth came crawling from HP, perhaps it's part of that PlanetLab fiasco, asked for 11 pages and 3 images, very bizarre behavior.

Asked for the 'about' page 3 times, robots.txt 2 times, index 2 times, and the privacy page, so definitely not a human and not a terribly clever bot either.

192.6.19.203 - "GET /robots.txt " "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.6.19.203 - "GET / " "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
Reverse DNS says:
nslookup 192.6.19.203
name = cache2.nlanr.hpl.hp.com
What's up HP, mind sharing your intentions with this activity?

Thursday, August 17, 2006

Found ACEZ in Placez, What is it?

Whatever ACEZ is, it always precedes what appears to be an actual page view so perhaps it's a filtering program of some sort. Additionally, it seems to only come via a proxy server.

Doesn't appear to be a bot but it's very strange.

Here's a few sightings:

220.227.148.74 "ACEZ" VIA=1.0 localzeesports.localzs.com:8080 (squid/2.5.STABLE1) FORWARD=192.168.10.178

212.138.113.12 "ACEZ" VIA=1.1 proxy2 (NetCache NetApp/5.3.1R4), 1.0 cache2.ruh FORWARD=213.165.59.253

212.138.113.13 "ACEZ" VIA=1.1 proxy2 (NetCache NetApp/5.3.1R4), 1.0 cache3.ruh FORWARD=213.165.59.253

212.138.47.17 "ACEZ" VIA=1.1 proxy2 (NetCache NetApp/5.3.1R4), 1.0 cache7.ruh FORWARD=213.165.59.253

212.138.47.14 "ACEZ" VIA=1.1 proxy2 (NetCache NetApp/5.3.1R4), 1.0 cache4.ruh FORWARD=213.165.59.253

Here's a sample of how it appears in a log file before a page is read:
212.138.113.13 - "GET /page1.html" "-" "ACEZ"
212.138.113.13 - "GET /page1.html" "" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
Any additional info would be nice if anyone knows anything about it.

Evolving Stealth Bots

Just in the last few weeks I've been seeing some really odd hits on robots.txt from things claiming to be browsers, loading images like a browser, the whole nine yards.

Bot #1 - Post-crawl Robots.txt Reader

What I'm seeing is that instead of looking at robots.txt upfront, which is a trigger to shut down a bot, robots.txt gets read after one or two pages are read. That way, they can snoop my robots.txt file without asking for it first, avoiding being stopped while collecting a safe page or two in order to find out what my pages are for future crawls.

That's my theory and I wouldn't have considered this the case except I've seen the exact same behavior multiple times.

Time to start setting some new traps and see who crawls with the information gathered from these probes.
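A trap for this pattern could be sketched roughly like this, assuming you've already parsed your log into (ip, path) pairs in the order they arrived. The function name is made up, and anything that isn't robots.txt counts as a "page" here, images included, so treat the result as a list of suspects rather than a verdict.

```python
def late_robots_readers(requests):
    """requests: iterable of (ip, path) in log order.

    Returns the set of IPs whose first robots.txt request came
    only after they had already fetched something else.
    """
    pages_seen = {}
    robots_seen = set()
    flagged = set()
    for ip, path in requests:
        if path == "/robots.txt":
            if ip not in robots_seen and pages_seen.get(ip, 0) > 0:
                flagged.add(ip)
            robots_seen.add(ip)
        else:
            pages_seen[ip] = pages_seen.get(ip, 0) + 1
    return flagged
```

A well-behaved crawler that asks for robots.txt first never lands in the set, and neither does a human who never asks for it at all.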

Bot #2 - 3 Phase Crawler

Next on the list is a stealth bot that looks like it's either taking a screen shot on the first page or downloading images just to try and trick my software into thinking it's human.

This beast does the following:

  1. Reads robots.txt with a blank user agent string
  2. Loads the home page as Linux Firefox and downloads all associated images which appears to be taking a screen shot
  3. Crawls the rest of the pages on the site disguised as Internet Explorer
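The three phases above all come from one IP under different user agents, so one way to spot this kind of crawler is to collect the distinct user agents seen per IP. This is only a sketch with made-up function names; shared proxies and NAT gateways will legitimately show several agents per address, so it flags candidates for a closer look and nothing more.

```python
from collections import defaultdict


def agents_by_ip(requests):
    """requests: iterable of (ip, user_agent) pairs in log order.

    Returns {ip: [distinct user agents in first-seen order]}.
    """
    seen = defaultdict(list)
    for ip, ua in requests:
        if ua not in seen[ip]:
            seen[ip].append(ua)
    return dict(seen)


def agent_switchers(requests, min_agents=2):
    """IPs that presented at least min_agents different user agents."""
    return {ip: uas for ip, uas in agents_by_ip(requests).items()
            if len(uas) >= min_agents}
```

Setting min_agents=3 matches the blank, Firefox, Internet Explorer sequence described in the list.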
Bot #3 - Blank User Agent Probe

Here's something amusing with what appears to be a Ukrainian spider that downloaded a linked image to my website as Internet Explorer and 4 seconds later hit robots.txt as an anonymous user agent.
82.207.93.90 - [12:20:47] "GET /banner.gif" "http://www.someotherwebsite.com" "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 4.0)"

82.207.93.90 - [12:20:51] "GET /robots.txt" "-" "-"
This may be related to Bot #2 above, not sure, but I've seen a few hits like this where they follow the link and peek to see what's allowed and don't go any further.

Very odd.

Tuesday, August 15, 2006

Link Checkers Don't Understand

Having a few conversations that are going nowhere with some link checker sites.

ME: "Sorry but I had to block your link checker as you're never going to find what you want as I can't allow any of you to crawl 40K pages. Would you mind just telling me what you want to find and I can tell you exactly where it is?"

Link Checkers: "Just point us to your links page with robots.txt"

ME: "The whole site is links, it's a directory, and robots.txt is EXCLUSION only, not INCLUSION, so I can't tell you where to crawl only where NOT to crawl which is impractical with 40K pages anyway."

Link Checkers: "We stop after X pages anyway."

ME: "You're still wasting my bandwidth as the odds of finding what you're looking for in the top level pages is real slim. How about telling me who you want in the referrer field and I'll just redirect your crawler to the exact page you need."

Link Checkers: "Error, does not compute, too logical, error, error, erroooooooorrrrrr...."

So there you have my current state of impasse with the link checking community.

As soon as they can come up with a compromise I'll unblock them, but until then NADA PAGE!

FIRST LOOK - GenericBot-ax 0.85 at SurfControl

It's always cool to have an EXCLUSIVE on a new bot caught fresh in the traps this morning.

This little beast was crawling from SurfControl's IP range:

195.244.16.1 "GenericBot-ax 0.85"
Here's the 411 on the IP address:
inetnum: 195.244.16.0 - 195.244.17.255
netname: SURFCONTROL
descr: SurfControl PLC
country: GB
e-mail: karl.jones@surfcontrol.com
Didn't ask for robots.txt and asked for the home page 3 times in a row, about a minute apart.

What they didn't expect was their SurfControl met MY surf control and they got a swift kick in the ass.

NO DATA FOR YOU!

Buh bye.

Multiple Scrape Attempts from Google IPs?

OK, if anyone can shed any light on this it would be nice. Web accelerator, maybe?

Had a batch of "Avant Browser" requests, none got answered because of this SNAFU request early on that tripped the bot trap, yet they just kept coming:

64.233.173.89 - "GET /#top" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Avant Browser; Avant Browser; .NET CLR 1.0.3705)"
Google didn't even respond properly to reverse DNS, sloppy shit:
nslookup 64.233.173.89

** server can't find 89.173.233.64.in-addr.arpa: NXDOMAIN
But it's certainly a Google IP:
whois 64.233.173.89

OrgName: Google Inc.
OrgID: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US

NetRange: 64.233.160.0 - 64.233.191.255
Then look at THIS one also from Google, what the hell?
72.14.194.19 - "GET /robots.txt" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.6) Gecko/20060728 Firefox/1.5.0.6"
Same reverse DNS problem:
nslookup 72.14.194.19

Non-authoritative answer:

*** Can't find 19.194.14.72.in-addr.arpa.: No answer
Just to make sure it wasn't my servers, I checked DNSSTUFF.com, same result.

Yet, it's Google:
whois 72.14.194.19

OrgName: Google Inc.
OrgID: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US

NetRange: 72.14.192.0 - 72.14.255.255
OK, someone from Google got a clue what in the hell is going on?

Anyone?

This is unacceptable whatever it is!
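For anyone repeating these checks, the two pieces of bookkeeping are easy to script: the in-addr.arpa name that nslookup was asking for, and whether an address sits inside a whois NetRange. Here's a small sketch using Python's standard ipaddress module; the function names are mine.

```python
import ipaddress


def ptr_name(ip):
    """The reverse DNS name that gets queried for an address."""
    return ipaddress.ip_address(ip).reverse_pointer


def in_net_range(ip, first, last):
    """True if ip lies within the first-last range reported by whois."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last)
```

Both IPs above fall inside the Google ranges shown, even though neither has a PTR record. For a live lookup, socket.gethostbyaddr() does the query itself and raises socket.herror when there's no answer, which is what happened here.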

Monday, August 14, 2006

Another Yahoo Proxy Hijacking

Since our old buddy John thinks I have a bad attitude about proxy sites and that they shouldn't be blocked, we'll use him as an example and replace the actual data found in Yahoo with John's website.

John, how would you like your site being Hijacked in Yahoo like this?

  1. ... Yahoo has crawled via proxy IP 74.52.14.138 to hijack your site John, deal with it!
    Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) ...
    gizliweb.com/g/o.web/010010A/http:/www.johnon.com
I doubt this will change his mind about proxies but those Google Ads on the top of his page sure look pretty!

Spam Gilad, It's All His Fault

As the website proclaims:

Gilad, this is all your fault!!!

Technically, just leaving the unattended guestbook online full of nothing but spam was Eli's fault.

This wouldn't even be so amusing except the guestbook is full of spam links pointing to our favorite scraper malware sites.

Take a peek in Yahoo to see the scope of the spamming just for one of the domains such as xanax-shop.info.

I let Yahoo know about the list of dirty scraper malware dogs last week so it will be amusing to see how long they remain in the index, especially since many of the domains in the list try to do harm to surfers.

Thanks to Olliver for pointing out this link titled GooglePray over at Spamhuntress' site.

Saturday, August 12, 2006

China's iaskspider evolution and related crawling

Here's the latest on this little bullshit bot iaskspider from China.

Previously it crawled from 2 d-blocks using a simple name:

219.142.118.56 "iaskspider"
219.142.118.57 "iaskspider"
219.142.78.60 "iaskspider"
219.142.78.85 "iaskspider"
Now it claims to be Internet Explorer:
219.142.118.66 - "Mozilla/5.0 (compatible; iaskspider/1.0; MSIE 6.0)"
And appears to be checking anonymously for spider traps or some shit:
219.142.118.75 - "GET /robots.txt HTTP/1.0" 200 146 "-" "-"
219.142.118.70 - "GET /robots.txt HTTP/1.0" 200 146 "-" "-"
219.142.118.69 - "GET /robots.txt HTTP/1.0" 200 146 "-" "-"
219.142.118.54 - "GET /robots.txt HTTP/1.0" 200 146 "-" "-"
Don't ask me, I only block them, no clue what the hell is going on.
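When a bot keeps hopping around like this, grouping the sightings by d-block makes the pattern obvious. Here's a sketch that assumes a d-block means the first three octets, i.e. a /24; the function name is made up.

```python
import ipaddress
from collections import defaultdict


def group_by_d_block(ips):
    """Group addresses by their /24 network (first three octets)."""
    blocks = defaultdict(set)
    for ip in ips:
        net = ipaddress.ip_network(ip + "/24", strict=False)
        blocks[str(net)].add(ip)
    return {block: sorted(members) for block, members in blocks.items()}
```

Feeding it the iaskspider addresses above gives two groups, 219.142.118.0/24 and 219.142.78.0/24, and the anonymous robots.txt probes all land in the first one.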

Friday, August 11, 2006

EDI Edacious Bullshit and Yeti from Korea

Don't know much about this piece of shit crawling from Korea except it looks at robots.txt and operates from 2 d-blocks:

222.231.50.166 "EDI/1.2.0 (Edacious & Intelligent Web Crawler)"
222.231.50.161 "EDI/1.2.0 (Edacious & Intelligent Web Crawler)"
222.231.42.10 "EDI/1.2.0 (Edacious & Intelligent Web Crawler)"
222.231.42.14 "EDI/0.9.3 (Edacious & Intelligent Web Crawler)"
Something in the next d-block in this bad neighborhood:
222.231.21.62 "Yeti"
222.231.21.55 "Yeti"
222.231.21.122 "Yeti"
222.231.21.122 "Yeti"
Bullshit of a feather flocks together.

No Cookies for Decommissioned Junction

I've been dumping affiliate programs lately because of the abysmal rate of cookie tracking and ever decreasing affiliate income vs. the PPC steady income rate.

This isn't terribly accurate as my script doesn't know if a cookie was accepted until the second page view. However, out of 3489 visitors in a sample I just took, that looked at 2 or more pages, the cookies were disabled by 22% of the returning visitors.

5926 Visitors
3489 Visitors > 1 Page View
2710 Cookies Enabled
779 Cookies Rejected
It doesn't take a rocket scientist to figure out that 22% of disabled cookies make affiliate programs relatively unattractive as almost 1/4 of the returning visitors wouldn't give me credit for anything they buy.

Out of the total visitors it's only 13%. We don't know how many of the 2437 visitors that only viewed a single page had cookies enabled, but my suspicion is the real figure is closer to 20%.
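For anyone checking the math, here are the percentages worked out from the sample counts above. The variable names are just labels for those four numbers.

```python
total_visitors = 5926
multi_page = 3489        # visitors with more than one page view
cookies_enabled = 2710
cookies_rejected = 779

# Visitors who left after one page, so their cookie status is unknown.
single_page = total_visitors - multi_page

# Share of returning (multi-page) visitors who rejected cookies.
rejected_of_multi = round(100 * cookies_rejected / multi_page, 1)

# Same count measured against everyone, which is only a lower bound
# because none of the single-page visitors are counted as rejecting.
rejected_of_total = round(100 * cookies_rejected / total_visitors, 1)

print(single_page, rejected_of_multi, rejected_of_total)
```

That comes out to 2437 single-page visitors, 22.3% of multi-page visitors rejecting cookies, and 13.1% of all visitors as the floor.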

Taking into account that these numbers are POST bot filtering, so all of the bots that were blocked or banned didn't get included, this leads to two possible conclusions:

1) a lot more people aren't accepting cookies than previously thought or
2) there's a lot more low impact stealth bot activity than even I suspected.

Which is the right answer?

I'm sure the truth lies somewhere in the middle.

Thursday, August 10, 2006

Telemarketing SEO Assholes

I was sitting here minding my own business today and the phone rang.

Normally, I would let "UNKNOWN CALLER" roll to voicemail but today I lost my mind and answered the damn phone.

ME: "Hello?"

SLIMEBALL: "Hi, do you own domain XYZ.COM?"

ME: "Um, yes I do, why do you ask?"

SLIMEBALL: "We have been looking at XYZ.COM and it's a strong website but doesn't have very good presence in the search engines and we'd like to offer our help."

About now the hair stands up on the back of my neck...

ME: "Excuse me? I rank very well in search engines and have a ton of top 10 longtail keywords"

This alone should be a tip that I know something about this shit...

SLIMEBALL: "Well, our research report shows you're lacking in many major keywords and we could help..."

ME: "Are you out of your mind? I get 500,000 visitors a month, how in the hell is that lacking?"

SLIMEBALL: "Um, well, we don't show you on the main..."

>CLICK!<

I just didn't have the heart to start yelling and screaming profanity at this slimeball as it was just too early in the morning.

My suspicion is they would probably sink my site so low in the SE's that I'd have to get a real job as my days of webmaster welfare would be over.

Fuck it, I'll stick with my "lacking listings" thank you very much.

Wednesday, August 09, 2006

Kudos on the Google Dance, Stellar AdSense Support, and my Google Gift

Sometimes we complain about AdSense support being slow and unresponsive but I have to give kudos on a same day response yesterday, and that was in the midst of Google preparing for the Google Dance party.

Now the Google Dance party was off the hook, the band kicked ass, it was rock'n baby!

Enjoyed the various snacks, they were free and I never complain about free food.

The only thing I didn't try which looked good were the Pavlov's Dogs as I was snacked out and needed to save room for beer.

Google gets more love from me as their beer selections this year were far superior to last year, less cat piss and more beer for real beer drinkers. Last year I had to literally scour the place to find one lone tap with something that wasn't clear yellow cat piss that was tucked inside a building but this year the good beer was everywhere.

Just in case you haven't figured it out, I'm in love with Google at the moment.

I still use all the Google gifts I got last Christmas as the Google wireless mouse and USB expansion port are permanent fixtures on my laptop.

Last week when I went to get a new set of business cards printed I used my Google memory stick/keychain to take them to Kinkos and get them printed and cut while I waited, that was way cool too, no floppies, nothing.

My wife was poking fun at me and my giddy behavior with the memory stick: "Have you never encountered technology before? Is this your first time?" Well, technically it was my first time handing someone my keychain to get something printed as opposed to original copies or a floppy disk, and it just struck me as being cool.

I felt like Jack Bauer from 24 running into Kinkos:

"Quick Chloe, download the encrypted data off this chip that was just recovered in a covert sting and use our blowfish decryption algorithms to extract these business cards..."

Ah well, been there, done that, now it's old hat.

Besides, who can say anything bad about Google?

They give me free money every month, they give me free web traffic, they give me free gifts and then invite me into their building and give me free food, booze and entertainment.

It's almost like being a rich kid living off the family AdSense trust fund ;)

The only complaint I had about last night's Google Dance party is my feet hurt like hell by the time it was over!

Wait, I almost forgot, they were giving away t-shirts but the sizes were limited to LARGE, SMALL and WOMEN's. C'mon Google, did you take a serious look at how many 2X and 3X people you had waddling around the 'Plex last night?

What in the heck would I do with a LARGE t-shirt, dust my house with it?

Other than that one minor glitch, good job Google, loved it!

SCRAPER BUSTED #9 - Umax is baaaack

This is déjà vu day in the scraper busting dept. as Umax is back with a new virulent website.

BTW, if you want to read some funny misguided shit, this guy wants people to boycott UMAX, the scanner company, because of something unrelated, like this spamming virus site maker that's the topic of this post.

What a screwball, sheesh.

WARNING - DO NOT GO TO THIS SITE!

IT WILL ATTEMPT TO INFECT YOUR PC WITH AN EXPLOIT!

Remember, I'm a trained professional, so don't try this site at home as this is some nasty shit.

However, if you're stupid enough [and most of you are] to attempt to access this site then use some goddamn common sense and disable your javascript and maybe java in your browser first or you might end up in a world of hurt.

For those of you real dumb fuckers, I mean the dumb as a pet rock variety, you'll get Trojan.ByteVerify installed on your machine if you visit these sites [see list at bottom] without proper precaution so don't blame me as YOU HAVE BEEN WARNED!

Crawler Info:
IP Address: 209.172.60.19 [ip-209-172-60-19.reverse.privatedns.com]
User Agent: lwp-trivial/1.41
Site info:
umax-ppc.net (66.199.247.42)
This is on the same server and host as the last reported site, but just in case you're too fucking lazy to click the link above and look it up for yourself it's repeated below.

Not sure this is even real information about this asshole, as other registrations say Russia, there's a shock, but they all seem to have FREEYAHO LLC in common.

American asshole information:
Registrant:
Sid Wongvorakul
979 Rutland Dr
Memphis, Tennessee 78243
United States

Registered through: FREEYAHO LLC.
Domain Name: UMAX-PPC.NET
Created on: 15-Dec-04
Expires on: 15-Dec-07
Last Updated on: 12-Jul-06

Administrative Contact:
Wongvorakul, Sid sidfeehit@yahoo.com
979 Rutland Dr
Memphis, Tennessee 78243
United States


Technical Contact:
Wongvorakul, Sid sidfeehit@yahoo.com
979 Rutland Dr
Memphis, Tennessee 78243
United States


Domain servers in listed order:
NS1.NEED-SITE.COM
NS2.NEED-SITE.COM
Russian asshole information:
Domain Name: SEHUNTRESS.BIZ
Domain ID: D10559406-BIZ
Sponsoring Registrar: WILD WEST DOMAINS, INC.
Sponsoring Registrar IANA ID: 440
Domain Status: clientDeleteProhibited
Domain Status: clientRenewProhibited
Domain Status: clientTransferProhibited
Domain Status: clientUpdateProhibited
Registrant ID: GODA-013273608
Registrant Name: DMITRIY SOLDATENKO
Registrant Organization: Freeyaho LLC.
Registrant Address1: a-n 262
Registrant City: Ulan-Ude
Registrant State/Province: Ru
Registrant Postal Code: 670042
Registrant Country: Russian Federation
Registrant Country Code: RU
Registrant Phone Number: +790.25651263
Registrant Email: soldde@mail.ru
Host information:
OrgName: EZZI.NET
OrgID: EZZIN
Address: AccessIT - Hosting Services
Address: 75 Broad Street, Suite 1902
City: New York
StateProv: NY
PostalCode: 10004
Country: US

ReferralServer: rwhois://rwhois.s2.ezzi.net:4321
NetRange: 66.199.224.0 - 66.199.255.255
The rest of this prolific virus spamming asshole's domains hosted on the same box:
1day.us
adsadult.com
adscom.us
adsname.com
alprazolam-xanax.com
art-xxx.com
baikal-guide.com
baikal-hotel.com
baikal-hotel.info
baikal-hotel.net
baikal-info.com
baikal-shop.com
baikal-tour.biz
baikal-travel.info
baikalguide.com
baikalhotel.com
baikalhotel.info
baikalhotel.net
baikalshop.info
baikalsk.com
baikalsk.info
baikalsk.net
bbsporn.com
board-online.com
board-online.net
dimattic.com
dsdomain.com
forum-online.biz
free-hit.com
free-virgin-pic.com
freeyaho.com
hotel-baikal.com
hotel-baikal.info
hotel-shop.info
hotelbaikal.com
hotelbaikal.info
hotelbaikal.net
info-baikal.com
lake-baikal.info
lakebaikal.info
need-site.com
nude-teacher.com
online-info.info
payday-loan-top.com
pharmacy-affiliate-program.com
popular-screen-savers.com
porn-samples.com
porn-teacher.com
porn-teen-pic.com
porno-sample.com
ppc-se.biz
ppc-se.com
ppc-se.info
ppc-se.net
qoclick.com
qoclick.net
reseller-porn.com
sampleclip.net
sehuntress.biz
sehuntress.com
sehuntress.info
sehuntress.net
seohuntress.com
sex--free.com
sex--x.com
sexy-teacher.net
showavailable.com
solo-teens.com
specific911.biz
specific911.com
specific911.info
specific911.net
specific911.org
top-10-shop.com
top-new-affiliate-programs.com
umax-forum.com
umax-ppc.com
umax-ppc.net
umax-se.biz
umax-se.com
umax-se.info
umax-se.net
umax-se.org
umax-search-ppc-se-board.com
umax-search-ppc-se.com
umax-search-ppc.com
umax-search-se.com
umax-search-search-engine.com
umax-search.biz
umax-search.com
umax-search.info
umax-search.net
umax.org
umaxforum-umax-forum.com
umaxppc.com
umaxppc.net
umaxppcsearch.com
umaxse.biz
umaxse.com
umaxse.info
umaxse.net
umaxse.org
umaxsearch-ppc-se.com
umaxsearch-ppc.com
umaxsearch-se.com
umaxsearch-search-engine.com
virgin-sexy.com
webmasterdiscuss.com
weekly-pay-ppc-se.com
weekly-pay.com
weekly-teens.com
work-at-home-top.com
xanax-shop.info
yula.us
arshan.info
If you think I have a bad attitude in this post, you're very perceptive. This fucker really pisses me off more than the usual garden variety scraper, and hosting companies that allow this shit on their premises make my blood boil.

I'm trying to resist calling the whole lot of them a bunch of cocksucking assholes, but I think I'm losing that battle...




SCRAPER BUSTED #8 - Categorico Strikes Again from Canada

This is the same bunch of fucknuts I busted previously as Vipse Corp and Categorico, with a new twist: this domain is ShopsNews.com and claims to be registered to some fucker in Canada, not Italy, but it carries the same Adsense account: "Advertise on www.categorico.com".

Scraping data:

IP Address: 66.240.172.2 [www130.mediaserve.net.]
User Agent: InetURL/1.0
Site data:
shopsnews.com (66.240.172.29)

Domain Name: SHOPSNEWS.COM
Registrant:
Logan Vernissa
306, 809-890 Crowfoot Cres.
Calgary, Alberta T7G 7T4
CA
507-454-0941
Fax:101-787-4348
The scraper and the server sit in the same D-block, hosted here:
OrgName: Broadspire Inc.
OrgID: BRSP
Address: 10200 Sepulveda Blvd. Suite 160
City: Mission Hills
StateProv: CA
PostalCode: 91345
Country: US

NetRange: 66.240.128.0 - 66.240.191.255
You know what to do, block these fuckers and cut them off at the knees.
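
Whois hands you a start-end NetRange, but most firewalls and server configs want CIDR notation. Here's a minimal sketch of the conversion in Python using only the standard library's ipaddress module. The function name netrange_to_cidrs is made up for illustration; the range is the one listed above.

```python
import ipaddress

def netrange_to_cidrs(first, last):
    """Convert a whois NetRange (first and last address) into CIDR blocks."""
    start = ipaddress.ip_address(first)
    end = ipaddress.ip_address(last)
    # summarize_address_range yields the networks that exactly cover the range
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]

# The Broadspire range from the whois data above
print(netrange_to_cidrs("66.240.128.0", "66.240.191.255"))
```

A range that doesn't line up on a clean boundary comes back as several CIDR blocks instead of one, so treat the result as a list.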

Robot MKDB From Oxford

No clue what the fuck this is but the reverse DNS suggests that this shit escaped from an Oxford computer science lab.

129.67.94.182 [marina.robots.ox.ac.uk.] requested 1 pages as "mkdb"
Didn't ask for robots.txt whatever it was.

Blah

Yahoo-Test/4.0 fails pop quiz

It wasn't Slurp so they got an error message, test failed, sorry Yahoo!

216.145.49.15 - "GET /robots.txt HTTP/1.0" 200 146 "-" "Yahoo-Test/4.0"
216.145.49.15 - "GET / HTTP/1.0" 200 1173 "-" "Yahoo-Test/4.0"
Study harder next time.
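
The idea of only passing known crawler names can be sketched as a simple allowlist check. This is a rough Python sketch, not the actual filter running on the site: the token list and the function name is_allowed_crawler are assumptions for illustration, and it only covers things that announce themselves as bots, not browsers.

```python
# Hypothetical allowlist of crawler name tokens (illustrative, not a real production list)
ALLOWED_CRAWLER_TOKENS = ("Googlebot", "Slurp", "msnbot")

def is_allowed_crawler(user_agent):
    """Return True if the user agent contains one of the allowed crawler tokens."""
    ua = user_agent.lower()
    # Case-insensitive substring match against each allowed token
    return any(token.lower() in ua for token in ALLOWED_CRAWLER_TOKENS)

# The agent from the log above fails because it never says "Slurp"
print(is_allowed_crawler("Yahoo-Test/4.0"))
```

A user agent string is trivial to fake, so a check like this is only one layer and says nothing about whether the request really came from the named search engine.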

Tuesday, August 08, 2006

Adsense Scraper with CACHE pages

In a new twist, here's a scraper with CACHE pages pretending he's Google.

Easy target for a flood of DMCA notices...

IP: 66.246.252.172
User Agent: ""
Here's the fuckhead's information:
Registrant:
Dragulescu Radu
Victoriei, bl.7,
sc. D, ap. 3
Timisoara, Timis 01900
Romania

Registered through: GoDaddy.com, Inc. (http://www.godaddy.com)
Domain Name: PHOTOIDEAS.NET
Created on: 18-Dec-05
Expires on: 18-Dec-07
Last Updated on: 01-Aug-06

Administrative Contact:
Radu, Dragulescu office@2x.ro
Victoriei, bl.7,
sc. D, ap. 3
Timisoara, Timis 01900
Romania
40726367488

Technical Contact:
Radu, Dragulescu office@2x.ro
Victoriei, bl.7,
sc. D, ap. 3
Timisoara, Timis 01900
Romania
40726367488

Domain servers in listed order:
NS1-FRANKLIN.NSWEBHOST.COM
NS2-FRANKLIN.NSWEBHOST.COM
The hosting appears to be thru nac.net:
OrgName: Net Access Corporation
OrgID: NAC
City: Parsippany
StateProv: NJ
PostalCode: 07054
Country: US
I think they're gonna get a letter about this asshole...

Inhoster Blog Spam Haven Servers Blocked

Inhoster is just filthy with blog spammers, which is bizarre as I usually find a mix of activity on dedicated servers, but this place seems to be overflowing with nothing but spammers and just one scraper, Snoopy.

I'm positive they are all spammers as every IP address listed below, except Snoopy, ONLY accessed my post form on a specific server, nothing else.

They host some of the usual garden variety bullshit spammers and Snoopy the scraper:

85.255.116.178 "Snoopy v1.2" "/"
85.255.117.218 "PussyCat 1.0, Murzillo compatible"
85.255.117.222 "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4"
85.255.117.226 ""
85.255.118.106 "PussyCat 1.0, Murzillo compatible"
85.255.118.114 "PussyCat 1.0, Murzillo compatible"
Then they have a few of the amazing changing user agent spammers from this IP sorted by user agent for your viewing pleasure:
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 4.0; MSN 2.6; Windows 95; Gateway2000)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; USA On-Site)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98; 981)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 98; QXW0332q)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; DT)"
85.255.117.250 "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.9) Gecko/20020311"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0rc1) Gecko/20020417"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0rc2) Gecko/20020510"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0rc3) Gecko/20020523"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1a) Gecko/20020611"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1b) Gecko/20020721"
85.255.117.250 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2a) Gecko/20020910"
85.255.117.250 "Opera/6.01 (Windows 98; U) [en]"
85.255.117.250 "Opera/6.04 (Windows 2000; U) [en]"
85.255.117.250 "Opera/6.04 (Windows 98; U) [en]"
85.255.117.250 "Opera/6.04 (Windows XP; U) [en]"
85.255.117.250 "Opera/7.0 (Windows 2000; U) [en]"
85.255.117.250 "Opera/7.0 (Windows NT 5.0; U) [en]"
85.255.117.250 "Opera/7.02 Bork-edition (Windows NT 5.0; U) [en]"
Another of the same rotating user agent shit on a different IP:
85.255.117.251 "Mozilla/4.0 (compatible; MSIE 5.01; Windows 95; USA On-Site)"
85.255.117.251 "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
85.255.117.251 "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
85.255.117.251 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7) Gecko/20011221"
85.255.117.251 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.0) Gecko/20020530"
85.255.117.251 "Opera/7.02 Bork-edition (Windows NT 5.0; U) [en]"
And YET another that didn't hit as often:
85.255.117.253 "Mozilla/4.0 (compatible; MSIE 4.0; MSN 2.6; Windows 95; Gateway2000)"
85.255.117.253 "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7) Gecko/20011221"
85.255.117.253 "Opera/6.04 (Windows 2000; U) [en]"
For the grand finale, a D-block of Firefox Linux spammers:
85.255.118.82 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.83 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.84 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.85 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.86 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.130 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.132 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.133 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
85.255.118.134 "Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.8.0.3) Gecko/20060425 SUSE/1.5.0.3-7 Firefox/1.5.0.3"
Block block block block...
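
Spotting the rotating user agent trick in a log boils down to counting distinct user agents per IP. Here's a minimal Python sketch, assuming the log has already been parsed into (ip, user_agent) pairs. The function name find_rotating_agents and the threshold of 3 are made up for illustration, and a busy proxy or NAT gateway can legitimately show several agents, so treat a hit as a lead and not a verdict.

```python
from collections import defaultdict

def find_rotating_agents(hits, threshold=3):
    """Return {ip: sorted user agents} for IPs seen with at least `threshold` distinct agents.

    `hits` is an iterable of (ip, user_agent) pairs parsed from an access log.
    """
    agents_by_ip = defaultdict(set)
    for ip, user_agent in hits:
        agents_by_ip[ip].add(user_agent)
    return {
        ip: sorted(agents)
        for ip, agents in agents_by_ip.items()
        if len(agents) >= threshold
    }

# A few entries taken from the lists above
sample_hits = [
    ("85.255.117.253", "Mozilla/4.0 (compatible; MSIE 4.0; MSN 2.6; Windows 95; Gateway2000)"),
    ("85.255.117.253", "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7) Gecko/20011221"),
    ("85.255.117.253", "Opera/6.04 (Windows 2000; U) [en]"),
    ("85.255.117.218", "PussyCat 1.0, Murzillo compatible"),
    ("85.255.117.218", "PussyCat 1.0, Murzillo compatible"),
]
for ip, agents in find_rotating_agents(sample_hits).items():
    print(ip, len(agents))
```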

Here's the range of troublemaker IPs to block:
netname: INHOSTER
inetnum: 85.255.112.0 - 85.255.127.255
They also have this range but I don't have any activity that has been tracked from here:
netname: INHOSTER
inetnum: 195.95.218.0 - 195.95.219.255
Enjoy the silence with the fucking spammers gone.
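
If you'd rather do the blocking in your own script than in the firewall, the range test is short. A minimal Python sketch, assuming the two INHOSTER ranges above written as CIDR (85.255.112.0/20 and 195.95.218.0/23); the names BLOCKED_RANGES and is_blocked are hypothetical.

```python
import ipaddress

# The two INHOSTER ranges listed above, expressed as CIDR blocks
BLOCKED_RANGES = [
    ipaddress.ip_network("85.255.112.0/20"),
    ipaddress.ip_network("195.95.218.0/23"),
]

def is_blocked(ip):
    """Return True if the address falls inside any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)

print(is_blocked("85.255.117.250"))
```

The linear scan is fine for a handful of ranges; a long blocklist would want a smarter lookup structure.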

Taiwan Scraping from C and D-Blocks

I didn't check the archive file to see if this was more widespread, as this was a single instance today of a coordinated scrape attempt from multiple IPs at the same time.

The D-block scraping attempt from "61.66.36" was nothing new as small blocks of scraping IPs turn up all the time.

However, the C-block scraping from "218.162." at the same time has implications, as this normally would've been harder to identify in small 1-4 page bursts.

The scraping D-block:

61.66.36.185 [adsl-61-66-36-185.TC.sparqnet.net.] requested 2 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
61.66.36.186 [adsl-61-66-36-186.TC.sparqnet.net.] requested 3 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
61.66.36.187 [adsl-61-66-36-187.TC.sparqnet.net.] requested 3 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
The scraping C-block:
218.162.169.65 [218-162-169-65.dynamic.hinet.net.] requested 1 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
218.162.170.209 [218-162-170-209.dynamic.hinet.net.] requested 3 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
218.162.172.171 [218-162-172-171.dynamic.hinet.net.] requested 4 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
218.162.175.60 [218-162-175-60.dynamic.hinet.net.] requested 3 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
218.162.179.74 [218-162-179-74.dynamic.hinet.net.] requested 1 pages as "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
Looks like they're getting smarter and your average webmaster will never spot this kind of activity.
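
One way to catch this kind of spread-out scrape is to group hits from the same short time window by IP prefix plus user agent, then flag any group with several distinct IPs. Here's a rough Python sketch, assuming the hits are already filtered to one time window and parsed into (ip, user_agent) pairs. The function names and the min_ips value of 3 are made up for illustration.

```python
from collections import defaultdict

def group_by_prefix(hits, octets):
    """Group hits by the first `octets` octets of the IP plus the user agent."""
    groups = defaultdict(set)
    for ip, user_agent in hits:
        prefix = ".".join(ip.split(".")[:octets])
        groups[(prefix, user_agent)].add(ip)
    return groups

def suspicious_groups(hits, octets, min_ips=3):
    """Return groups where at least `min_ips` distinct IPs share a prefix and user agent."""
    return {
        key: sorted(ips)
        for key, ips in group_by_prefix(hits, octets).items()
        if len(ips) >= min_ips
    }

UA = "Mozilla/4.0 (compatible ; MSIE 6.0; Windows NT 5.1)"
taiwan_hits = [
    ("61.66.36.185", UA), ("61.66.36.186", UA), ("61.66.36.187", UA),
    ("218.162.169.65", UA), ("218.162.170.209", UA), ("218.162.172.171", UA),
    ("218.162.175.60", UA), ("218.162.179.74", UA),
]
# Three octets catches IPs that differ only in the last octet;
# two octets also catches the ones spread across the third octet.
print(sorted(key[0] for key in suspicious_groups(taiwan_hits, 3)))
print(sorted(key[0] for key in suspicious_groups(taiwan_hits, 2)))
```

Grouping on two octets casts a wide net on a big dynamic pool, so pairing it with an identical user agent and a tight time window keeps the false positives down.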

Time to block Taiwan entirely?

Sunday, August 06, 2006

SES San Jose 7-10 2006

Anyone I know going to be there this week?

I'm heading down to the speakers party tonight (oops, did I let out a surprise) so if anyone I know is there tonight maybe we'll pound a brew or two together.

Bet you can't figure out which session I'll be speaking at...

Google Crawls Thru Yahoo Japan

This odd Google crawling thru Yahoo Japan occurrence must be via some sort of proxy or translation server, no clue, but this shit is weird.

211.14.9.244 requested 2 pages as "Mediapartners-Google/2.1"

211.14.9.240/28
YAHOO-NET
Yahoo Japan Corp.
Makes you scratch your head doesn't it?

Perhaps you have lice or dandruff, stay away from me...