A few weeks ago I read Jem’s blog post “Live.com Fake SERP Referrals Are Pissing Me Off”, but at the time I hadn’t run into the issue myself. Well, shortly after her post, the Live.com spambot started hitting a couple of my domains with one-keyword search referrals. 100% of the Live.com “traffic” has been from spoofed referrals, and it’s not stopping.
I really don’t care to contact Microsoft about the issue. I’m just going to go all out and block the bot from accessing those websites. I’ll end up setting up a robots.txt file, but apparently that might not be enough, as some users are still reporting that the bot ignores robots.txt. Time to dig up the IP addresses being used so I can deny those too.
If I honestly had some real results coming from Live.com’s search, I’d be apprehensive about blogging anything associated with it, but so far I’ve yet to receive a single legitimate search referral from it. I check my Mint statistics more often than I should so I can see what keywords I need to work on, and I just can’t afford to waste my time with fake hits. Stop skewing up my shit!
I found a nice little tutorial on blocking bots at various levels, ranging from an .htaccess redirect that sends them back to their own site to blocking their IP address, plus a couple of other variations. An anonymous comment on another entry, “Live.com’s Referrer Spam Has Left Me In Despair”, provides a snippet you can add to your .htaccess file to block the bot as well, given that it seems to ignore any robots.txt settings.
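# Flag any request whose User-Agent starts with "msnbot" (case-insensitive)
SetEnvIfNoCase User-Agent "^msnbot" bad_bot
# Apply the access rules to GET and POST requests
<Limit GET POST>
# Evaluate Allow rules first, then Deny; a matching Deny wins
Order Allow,Deny
Allow from all
# Refuse requests flagged above
Deny from env=bad_bot
</Limit>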
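Since the plan is also to deny by IP, and the fake hits announce themselves through the referrer, here is a rough variant of the same snippet that covers both. Treat it as a sketch under a couple of assumptions, not a tested fix: the `search\.live\.com` referrer pattern is my guess at what the spoofed referrals look like, and `203.0.113.0/24` is a reserved documentation range standing in for whatever addresses actually turn up in the logs.

```apache
# Sketch only: a standalone variant of the snippet above, not a tested rule set.
# Flag requests whose User-Agent starts with "msnbot" (case-insensitive).
SetEnvIfNoCase User-Agent "^msnbot" bad_bot
# ASSUMPTION: the spoofed referrals claim to come from search.live.com.
SetEnvIfNoCase Referer "search\.live\.com" bad_bot
<Limit GET POST>
Order Allow,Deny
Allow from all
# Refuse anything flagged by either rule above.
Deny from env=bad_bot
# PLACEHOLDER: documentation-only range; swap in the addresses found in your own logs.
Deny from 203.0.113.0/24
</Limit>
```

Keep in mind that matching on the referrer would also turn away any genuine Live.com search visitor, which only makes sense if, like me, you aren't getting any. The `Order`/`Allow`/`Deny` directives are the Apache 2.2 style; on Apache 2.4 they only work through mod_access_compat.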