Here's a big shocker: two days after I built my new spider trap, Google ignored my robots.txt entry and snared itself in the trap.
Simple robots.txt, nothing hard about this test:
User-agent: *
Disallow: /rogue_bots.html
Sure enough my spider trap log for that page shows:
SPIDER IP=66.249.65.70
SPIDER AGENT=Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Bad, bad Google, tsk tsk.
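For anyone who wants to double-check that the two-line robots.txt really does forbid that page for the user agent in the log, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_allowed and the /index.html path are my own illustrations, not part of the trap itself, and this only shows what a compliant parser concludes from the rule, not how Googlebot actually reads it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from this test, copied verbatim.
ROBOTS_TXT = """\
User-agent: *
Disallow: /rogue_bots.html
"""

# The user agent string recorded in the spider trap log.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch path.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The trap page is disallowed for every agent, Googlebot included.
    print(is_allowed(GOOGLEBOT_UA, "/rogue_bots.html"))  # False
    # Any other page (hypothetical path) stays crawlable.
    print(is_allowed(GOOGLEBOT_UA, "/index.html"))  # True
```

So by the standard parser's reading, a well-behaved crawler had no business requesting /rogue_bots.html.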