If you've never seen Webaroo before, it's a service where the concept of copyright has obviously been completely glossed over.
Here's what it says on their website:
Webaroo servers crawl the web, analyze web pages and automatically select the subset of pages with the greatest diversity and quality in the least storage size. These pages are then packaged into topic-specific "Web Packs" that can be downloaded by users onto their devices. Once downloaded, users can search and browse that content on the go.
Here's an English to English translation:
At Webaroo, we take whatever copyrighted content of yours we want and repackage it for our customers without permission. Of course we do it without permission, because nobody knows about Webaroo or its many bot names in the first place, so nobody will stop us. Isn't it cool how we're going to steal your shit and pack it up so others can download it, and now they don't even need to bother visiting your website? Wicked!
Look at the total number of bot names coming from their crawler's IP address.
184.108.40.206 "WebarooBot (Webaroo Bot; http://220.127.116.11/feedback.html)"
18.104.22.168 "PiyushBot (Piyush Web Miner; http://piyush.com/feedback.html)"
22.214.171.124 "RufusBot (Rufus Web Miner; http://www.webaroo.com/rooSiteOwners.html)"
126.96.36.199 "RufusBot (Rufus Web Miner; http://188.8.131.52/feedback.html)"
184.108.40.206 "SumeetBot (Sumeet Bot; http://220.127.116.11/feedback.html)"
18.104.22.168 "PsBot (PsBot; http://22.214.171.124/feedback.html)"
126.96.36.199 "pulseBot (pulse Web Miner)"
Hell, if you were trying to stop them using robots.txt it's a lost cause as the bot names seem to get changed faster than a baby's diaper.
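To make the robots.txt problem concrete, here's a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `example.com` URL are made up for illustration; the bot names are the ones from the log lines. It assumes you only ever banned the one name you knew about, and it says nothing about whether these bots actually honor robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that bans only the one bot name you happened to know.
ROBOTS_TXT = """\
User-agent: WebarooBot
Disallow: /
"""

# Bot names taken from the log lines in this post.
BOT_NAMES = ["WebarooBot", "PiyushBot", "RufusBot", "SumeetBot", "PsBot", "pulseBot"]


def allowed_by_name(names, url="http://example.com/page.html"):
    """Return {bot name: True if robots.txt would let that name fetch url}."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return {name: parser.can_fetch(name, url) for name in names}


if __name__ == "__main__":
    for name, allowed in allowed_by_name(BOT_NAMES).items():
        print(f"{name}: {'allowed' if allowed else 'blocked'}")
```

Only the name you listed gets turned away; every renamed bot falls through to the default, which is to allow.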
I would just block their range of IPs; it's more convenient.
Webaroo MFN-B843-64-124-122-224-27 (NET-64-124-122-224-1)
188.8.131.52 - 184.108.40.206
That's how you stop name changing bots, the firewall way.
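If you can't touch the firewall, the same idea can be sketched at the application level with Python's standard `ipaddress` module. This is a rough sketch, not a drop-in: the CIDR block below is my reading of the net handle (64.124.122.224 with a /27 mask), which is an assumption, and `is_blocked` is a hypothetical helper name. Verify the range against whois before blocking anything.

```python
import ipaddress

# ASSUMPTION: the net handle NET-64-124-122-224-1 / MFN-B843-64-124-122-224-27
# is read here as the block 64.124.122.224/27. Check whois before relying on it.
BLOCKED_NETWORKS = [ipaddress.ip_network("64.124.122.224/27")]


def is_blocked(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("64.124.122.230", "64.124.122.200", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "ok")
```

The check never looks at the user-agent string, so the bot can call itself whatever it likes this week.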
Package THAT into a topic-specific "Web Pack" and download it.