Back in November I wrote an open letter to the Amazon AWS group asking them to stop using the default user agent "Java/1.5.0_09".
Today I noticed that they gave me a clear response to my open request:
216.182.228.223 [domU-12-31-33-00-02-01.usma1.compute.amazonaws.com.]
Oh yes, prefixing "Java/1.5.0_09" with an MSIE 6.0 user agent is MUCH better.... NOT!
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461) Java/1.5.0_09"
They must've been getting blocked by too many sites that reject the default Java UA.
Nice try guys, but that's really fucking lame.
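For anyone who blocks the default Java UA, here's a rough sketch of why the MSIE prefix doesn't help them: look for the "Java/" token anywhere in the string instead of only at the start. The names `JAVA_TOKEN` and `has_java_token` are made up for illustration, and the version pattern is just my guess at what those tokens look like.

```python
import re

# Hypothetical pattern: matches a default Java HTTP client token such as
# "Java/1.5.0_09" anywhere in the User-Agent string, not just at the start.
JAVA_TOKEN = re.compile(r"\bJava/\d+(?:\.\d+)*(?:_\d+)?")


def has_java_token(user_agent: str) -> bool:
    """Return True if the User-Agent contains a Java/<version> token."""
    return JAVA_TOKEN.search(user_agent) is not None


# The two user agents from this post, plus an ordinary browser UA.
plain = "Java/1.5.0_09"
disguised = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461) Java/1.5.0_09"
browser = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.8) Gecko/20061025 Firefox/1.5.0.8"

for ua in (plain, disguised, browser):
    print(has_java_token(ua), ua)
```

Both the plain and the disguised user agent get flagged, while the Firefox string does not.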
1 comment:
I kicked amazonaws.com to the curb a while back.
Here's one:
ec2-72-44-51-181.z-1.compute-1.amazonaws.com
and its user agent is this:
Agent: LeapTag (compatible; Mozilla 4.0; MSIE 5.5; http://beta.leaptag.com/?p=linux2&v=0.8.4.trunk.r4295)
It used that to pull the robots.txt file and then used this to "try" to pull an article:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.8) Gecko/20061025 Firefox/1.5.0.8
So... I don't believe it's a legit bot/user or whatever coming from amazonaws.com.
It changes its user agent as it sees fit to scrape whatever it's looking for.
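A quick sketch of how you could spot that kind of switching in your own logs: group requests by host and flag any host that shows up with more than one user agent. This assumes you've already parsed your log into (host, user agent) pairs; the function name `hosts_with_multiple_agents` is made up for the example.

```python
from collections import defaultdict


def hosts_with_multiple_agents(requests):
    """Map each host that used more than one User-Agent to the set it used.

    `requests` is assumed to be an iterable of (host, user_agent) pairs
    already extracted from an access log.
    """
    agents_by_host = defaultdict(set)
    for host, user_agent in requests:
        agents_by_host[host].add(user_agent)
    return {
        host: agents
        for host, agents in agents_by_host.items()
        if len(agents) > 1
    }


# The two requests described above, both from the same EC2 host.
requests = [
    ("ec2-72-44-51-181.z-1.compute-1.amazonaws.com",
     "LeapTag (compatible; Mozilla 4.0; MSIE 5.5; http://beta.leaptag.com/?p=linux2&v=0.8.4.trunk.r4295)"),
    ("ec2-72-44-51-181.z-1.compute-1.amazonaws.com",
     "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.8) Gecko/20061025 Firefox/1.5.0.8"),
]

for host, agents in hosts_with_multiple_agents(requests).items():
    print(host, "used", len(agents), "different user agents")
```

Shared proxies can trip this too, so treat a hit as a reason to look closer, not as proof.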