adscanner
UA string: Mozilla/5.0 (compatible; adscanner/)/1.0 (http://seocompany.store; spider@seocompany.store)
Website: seocompany.store
FFS! Website is insecure, served over http! It serves no visible content. It appears to be connected to GoDaddy, the domain registrar; it tries to load JavaScript from godaddy.com.
AlphaBot/3.2
UA string: Mozilla/5.0 (compatible; AlphaBot/3.2; +http://alphaseobot.com/bot.html)
Website: alphaseobot.com
Doesn't ask for robots.txt, so can't possibly observe it!
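A note on method: whether a bot ever "asks for robots.txt" can be checked straight from the access log, by filtering requests on the bot's UA substring and seeing whether /robots.txt appears among them. A rough Python sketch, assuming a standard combined-format log (the log path and UA substring are placeholders):

    # Rough sketch: does a given bot ever request /robots.txt?
    # Assumes a combined-format access log; path and UA substring are placeholders.
    import re

    LOG_PATH = "/var/log/apache2/access.log"   # placeholder
    UA_SUBSTRING = "AlphaBot"                  # bot to check

    hits = 0
    asked_for_robots = False
    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            if UA_SUBSTRING not in line:
                continue
            hits += 1
            # The request line looks like: "GET /robots.txt HTTP/1.1"
            m = re.search(r'"(?:GET|HEAD|POST) (\S+)', line)
            if m and m.group(1).split("?")[0] == "/robots.txt":
                asked_for_robots = True

    print(f"{hits} request(s) from {UA_SUBSTRING}; robots.txt requested: {asked_for_robots}")
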
Applebot/0.1
Example UA string: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)
Website: www.apple.com/go/applebot
Asks for robots.txt, and appears to observe it. However, it's another bot that apparently thinks it can abuse the robots.txt protocol by claiming the right to piggyback on rules for Googlebot!
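For the record: under standard robots.txt matching, a crawler is bound only by the group that names its own token, or failing that by the '*' group; rules written for Googlebot simply don't apply to anyone else, which is why "we'll follow Googlebot's rules if there are no Applebot rules" is a rewrite of the protocol rather than part of it. A small Python illustration using urllib.robotparser (the rules are invented for the example):

    # Per-agent groups in robots.txt bind only the agent they name.
    # The rules below are invented purely for illustration.
    import urllib.robotparser

    ROBOTS_LINES = [
        "User-agent: Googlebot",
        "Disallow: /private/",
        "",
        "User-agent: *",
        "Disallow:",
    ]

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(ROBOTS_LINES)

    print(rp.can_fetch("Googlebot", "/private/page.html"))  # False: the named group applies
    print(rp.can_fetch("Applebot", "/private/page.html"))   # True: only the '*' group applies to Applebot
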
AwarioSmartBot/1.0
UA string: AwarioSmartBot/1.0 (+https://awario.com/bots.html; bots@awario.com)
Website: awario.com/bots.html
Doesn't ask for robots.txt, so can't possibly observe it!
Baiduspider-render/2.0
UA string: Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)
Website: www.baidu.com/search/spider.html
Specified website does not appear to exist.
Doesn't ask for robots.txt, so can't possibly observe it!
Baiduspider/2.0
UA string: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
Website: www.baidu.com/search/spider.html
Specified website does not appear to exist.
Doesn't ask for robots.txt, so can't possibly observe it!
BingPreview/1.0b
UA string: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b
Website: www.bing.com/webmasters/help/which-crawlers-does-bing-use-8c184ec0
Doesn't ask for robots.txt, so can't possibly observe it!
BLEXBot/1.0
UA string: Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
Website: webmeup-crawler.com
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
BorneoBot/0.7.1
UA string: BorneoBot/0.7.1 (crawlcheck123@gmail.com)
Website: None
Asks for robots.txt, but fails to observe it!
coccocbot-image/1.0
UA string: Mozilla/5.0 (compatible; coccocbot-image/1.0; +http://help.coccoc.com/searchengine)
Website: http://help.coccoc.com/searchengine
Asks for robots.txt, but fails to observe it!
Cocolyzebot/1.0
UA string: Mozilla/5.0 (compatible; Cocolyzebot/1.0; https://cocolyze.com/bot)
Website: cocolyze.com/bot
Doesn't ask for robots.txt, so can't possibly observe it!
com.tinyspeck.chatlyio
UA string: com.tinyspeck.chatlyio/20.04.20 (iPhone; iOS 13.4.1; Scale/3.00)
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
DataForSeoBot/1.0
Example UA string: Mozilla/5.0 (compatible; DataForSeoBot/1.0; +https://dataforseo.com/dataforseo-bot)
Website: dataforseo.com/dataforseo-bot
Doesn't ask for robots.txt, so can't possibly observe it!
DuckDuckBot-Https/1.1
UA string: Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)
Website: duckduckgo.com/duckduckbot
Doesn't ask for robots.txt, so can't possibly observe it!
e.ventures Investment Crawler
UA string: e.ventures Investment Crawler (eventures.vc)
Website: None.
Doesn't ask for robots.txt, so can't possibly observe it!
Expanse
UA string: Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com
Website assumed to be: expanse.co
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: I'm not one of their customers, so why is it all over my websites like a rash? It stays unwanted until it provides an explicit website address!
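Bots marked "Unwanted" in this list end up blocked on their User-Agent string; exactly how depends on the server, but the idea is trivial. A hypothetical sketch as Python WSGI middleware (the substrings are examples drawn from this list, not a recommendation):

    # Hypothetical sketch: refuse requests whose User-Agent contains an unwanted substring.
    # WSGI middleware; the substrings are examples taken from this list.
    UNWANTED_UA_SUBSTRINGS = ("expanse", "masscan", "adscanner")

    class BlockUnwantedBots:
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            ua = environ.get("HTTP_USER_AGENT", "").lower()
            if any(s in ua for s in UNWANTED_UA_SUBSTRINGS):
                start_response("403 Forbidden", [("Content-Type", "text/plain")])
                return [b"Forbidden\n"]
            return self.app(environ, start_response)

    def app(environ, start_response):
        # Trivial application to wrap, purely for illustration.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello\n"]

    application = BlockUnwantedBots(app)
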
GarlikCrawler/1.2
UA string: GarlikCrawler/1.2 (http://garlik.com/, crawler@garlik.com)
Website: garlik.com
Doesn't ask for robots.txt, so can't possibly observe it!
Gather Analyze Provide
UA string: https://gdnplus.com:Gather Analyze Provide.
Website: gdnplus.com
Doesn't ask for robots.txt, so can't possibly observe it!
Gluten Free Crawler/1.0
UA string: Mozilla/5.0 (compatible; Gluten Free Crawler/1.0; +http://glutenfreepleasure.com/)
Website: glutenfreepleasure.com
Doesn't ask for robots.txt, so can't possibly observe it!
Googlebot-Image/1.0
UA string: Googlebot-Image/1.0
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
HealthCheckBot/0.2
UA string: HealthCheckBot/0.2
Website: None specified, though a search reveals pypi.org/project/healthcheckbot as probably relevant.
Doesn't ask for robots.txt, so can't possibly observe it!
It seems to have no business crawling third-party sites.
HTTP Banner Detection
UA string: HTTP Banner Detection (https://security.ipip.net)
Website: security.ipip.net
Website address appears invalid.
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: until it provides a valid website address!
hypestat/1.0
UA string: Mozilla/5.0 (compatible; hypestat/1.0; +https://hypestat.com/bot)
Website: hypestat.com/bot
Doesn't ask for robots.txt, so can't possibly observe it!
Iframely/1.3.1
Example UA string: Iframely/1.3.1 (+https://iframely.com/docs/about)
Website: iframely.com/docs/about
Doesn't ask for robots.txt, so can't possibly observe it!
NB: In effect it's a bot, but uses weasel words to try to claim it's not!
Internet-structure-research-project-bot
UA string: Internet-structure-research-project-bot
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted until it provides more information about its purpose.
iodc; odysseus
UA string: Mozilla/5.0 (iodc; odysseus 3352-131-011119113358-349; +https://iodc.co.uk)
Website: iodc.co.uk
Doesn't ask for robots.txt, so can't possibly observe it!
IonCrawl
Example UA string: IonCrawl (https://www.ionos.de/terms-gtc/faq-crawler-en/)
Website: www.ionos.de/terms-gtc/faq-crawler-en
Doesn't ask for robots.txt, so can't possibly observe it!
ips-agent
UA string: Mozilla/5.0 (compatible; ips-agent)
Website: None
Apparently used by Verisign (who run the .com and .net domain name servers) to assess traffic on domains they know to be expiring; the data helps them sell potentially valuable 'busy' domains to bulk buyers at other registrars.
Asks for robots.txt, but fails to observe it!
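Checking "asks for robots.txt, but fails to observe it" is the same log exercise taken one step further: parse the site's own robots.txt and see whether the bot subsequently requested any path it disallows. A rough Python sketch (log path, robots.txt location and UA substring are placeholders):

    # Rough sketch: did a bot that fetched robots.txt then request disallowed paths?
    # Log path, robots.txt location and UA substring are placeholders.
    import re
    import urllib.robotparser

    LOG_PATH = "/var/log/apache2/access.log"
    UA_SUBSTRING = "ips-agent"

    rp = urllib.robotparser.RobotFileParser()
    with open("robots.txt", encoding="utf-8") as f:      # the site's own robots.txt
        rp.parse(f.read().splitlines())

    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            if UA_SUBSTRING not in line:
                continue
            m = re.search(r'"(?:GET|HEAD|POST) (\S+)', line)
            if m and not rp.can_fetch(UA_SUBSTRING, m.group(1)):
                print("Disallowed path requested:", m.group(1))
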
Keybot
Example UA string: Keybot Translation-Search-Machine
Website: (Not declared) keybot.tools
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: A tool for cheats and copyright bypassers/stealers.
LightspeedSystemsCrawler
UA string: LightspeedSystemsCrawler Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
linkdexbot/2.0
UA string: Mozilla/5.0 (compatible; linkdexbot/2.0; +http://www.linkdex.com/bots/)
Website: www.linkdex.com/bots/
Doesn't ask for robots.txt, so can't possibly observe it!
linkdexbot/2.2
UA string: Mozilla/5.0 (compatible; linkdexbot/2.2; +http://www.linkdex.com/bots/)
Website: www.linkdex.com/bots
Asks for robots.txt, but fails to observe it!
LivelapBot/0.2
Example UA string: LivelapBot/0.2 (http://site.livelap.com/crawler)
Website: site.livelap.com/crawler
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
ltx71
UA string: ltx71 - (http://ltx71.com/)
Website: ltx71.com
FFS! Website is insecure, served over http, despite the claim the purpose is "security research"!
Asks for robots.txt, but fails to observe it!
Unwanted, until the website is more informative.
Mail.RU_Bot/Robots/2.0
UA string: Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/Robots/2.0; +http://go.mail.ru/help/robots)
Website: go.mail.ru/help/robots
Asks for robots.txt, but fails to observe it!
masscan 1.0
UA string: masscan 1.0 (http:www)
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: until it provides a valid website address!
masscan-ng/1.3
Example UA string: masscan-ng/1.3 (https://github.com/bi-zone/masscan-ng)
Website: github.com/bi-zone/masscan-ng
Website address appears invalid.
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: until it provides a valid website address!
masscan/1.0
UA string: masscan/1.0 (https://github.com/robertdavidgraham/masscan)
Website: github.com/robertdavidgraham/masscan
Doesn't ask for robots.txt, so can't possibly observe it!
Promiscuous port scanner, more likely used for harm than good.
MegaIndex.ru/2.0
UA string: Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
Website: megaindex.com/crawler
Doesn't ask for robots.txt, so can't possibly observe it!
msnbot/2.0b
UA string: msnbot/2.0b (+http://search.msn.com/msnbot.htm)
Website: search.msn.com/msnbot.htm
Doesn't ask for robots.txt, so can't possibly observe it!
Neevabot/1.0
UA string: Mozilla/5.0 (compatible; Neevabot/1.0; +https://neeva.com/neevabot)
Website: neeva.com/neevabot
FFS! Tries to piggyback on rules for Googlebot! That's a whole new level of abuse of the robots.txt protocol!
NetSystemsResearch
UA string: NetSystemsResearch studies the availability of various services across the internet. Our website is netsystemsresearch.com
Website: netsystemsresearch.com
Doesn't ask for robots.txt, so can't possibly observe it!
Nimbostratus-Bot/v1.3.2
UA string: Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)
Website: cloudsystemnetworks.com
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
Nmap Scripting Engine
UA string: Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)
Website: nmap.org/book/nse.html
Doesn't ask for robots.txt, so can't possibly observe it!
nsrbot/1.0
UA string: Mozilla/5.0 (compatible; nsrbot/1.0; +http://netsystemsresearch.com)
Website: netsystemsresearch.com
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
oBot/2.3.1
UA string: Mozilla/5.0 (compatible; oBot/2.3.1; http://filterdb.iss.net/crawler/)
Website: filterdb.iss.net/crawler
FFS! Website is insecure, served over http!
Asks for robots.txt, but fails to observe it!
Plukkie/1.6
UA string: Mozilla/5.0 (compatible; Plukkie/1.6; http://www.botje.com/plukkie.htm)
Website: www.botje.com/plukkie.htm
FFS! Website is insecure, served over http!
Asks for robots.txt, but fails to observe it!
probethenet
UA string: www.probethenet.com scanner
Website: www.probethenet.com
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
python-requests/2.27.1 (aka Scrapy/2.6.1)
Example UA string: python-requests/2.27.1
Website: None - but see docs.scrapy.org/en/latest/topics/practices.html#bans
Appears to be an alias for Scrapy/2.6.1 (it comes from the same IP address).
Doesn't ask for robots.txt, so can't possibly observe it!
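'python-requests/x.y.z' is nothing more than the requests library's default User-Agent; whoever runs the script hasn't bothered to set one, which is why there's no website to point at. A minimal Python sketch of where the string comes from (the replacement UA and URL are hypothetical examples of what a polite client would send):

    # The 'python-requests/x.y.z' UA is simply the requests library default.
    import requests

    session = requests.Session()
    print(session.headers["User-Agent"])   # e.g. 'python-requests/2.27.1'

    # A well-behaved client would at least identify itself and link to an info page
    # (this UA and URL are hypothetical examples):
    session.headers["User-Agent"] = "ExampleBot/1.0 (+https://example.com/bot.html)"
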
RepoLookoutBotBot/0.0.1
Example UA string: RepoLookoutBotBot/0.0.1 (abuse reports to abuse@crissyfield.de)
Website: None
Doesn't ask for robots.txt, so can't possibly observe it!
Riddler
UA string: Riddler (http://riddler.io/about)
Website: riddler.io/about
Asks for robots.txt, but fails to observe it!
SafeDNSBot
UA string: SafeDNSBot (https://www.safedns.com/searchbot)
Website: www.safedns.com/searchbot
Asks for robots.txt, but fails to observe it!
Scrapy/2.6.1
Example UA string: Scrapy/2.6.1 (+https://scrapy.org)
Website: scrapy.org
Doesn't ask for robots.txt, so can't possibly observe it!
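Scrapy can be made to honour robots.txt, but that's a per-project setting rather than anything the UA string guarantees. As far as I can tell the relevant knob is ROBOTSTXT_OBEY, which is off in the library's own defaults though enabled in the generated project template. A hypothetical spider sketch (example.com and the UA are placeholders):

    # Hypothetical spider: whether Scrapy fetches and obeys robots.txt is governed by
    # the ROBOTSTXT_OBEY setting, not by anything visible in the UA string.
    import scrapy

    class ExampleSpider(scrapy.Spider):
        name = "example"
        start_urls = ["https://example.com/"]
        custom_settings = {
            # Off in Scrapy's own defaults; the generated project template turns it on.
            "ROBOTSTXT_OBEY": True,
            # Placeholder UA; a responsible operator would link to a real info page.
            "USER_AGENT": "ExampleBot/1.0 (+https://example.com/bot.html)",
        }

        def parse(self, response):
            yield {"title": response.css("title::text").get()}
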
Screaming Frog
UA string: Screaming Frog SEO Spider/14.1
Website: None.
Doesn't ask for robots.txt, so can't possibly observe it!
SeekportBot
UA string: Mozilla/5.0 (compatible; SeekportBot; +https://bot.seekport.com)
Website: bot.seekport.com
Asks for robots.txt, but fails to observe it!
Semanticbot/1.0
UA string: Mozilla/5.0 (compatible; Semanticbot/1.0; +http://sempi.tech/bot.html)
Website: sempi.tech/bot.html
FFS! The website address simply leads to a page requesting contact via email.
Asks for robots.txt, but fails to observe it!
Unwanted: until it provides a valid website address.
SemanticScholarBot
UA string: Mozilla/5.0 (compatible) SemanticScholarBot (+https://www.semanticscholar.org/crawler)
Website: www.semanticscholar.org/crawler
Doesn't ask for robots.txt, so can't possibly observe it!
SEMrushBot
UA string: SEMrushBot
Website: None.
Doesn't ask for robots.txt, so can't possibly observe it!
SemrushBot-BA
UA string: Mozilla/5.0 (compatible; SemrushBot-BA; +http://www.semrush.com/bot.html)
Website: www.semrush.com/bot.html
Doesn't ask for robots.txt, so can't possibly observe it!
SemrushBot-BM/1.0
Example UA string: Mozilla/5.0 (compatible; SemrushBot-BM/1.0; +http://www.semrush.com/bot.html)
Website: www.semrush.com/bot.html
Doesn't ask for robots.txt, so can't possibly observe it!
SemrushBot/1.0~bm
UA string: Mozilla/5.0 (compatible; SemrushBot/1.0~bm; +http://www.semrush.com/bot.html)
Website: www.semrush.com/bot.html
Doesn't ask for robots.txt, so can't possibly observe it!
SemrushBot/7~bl
UA string: Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)
Website: www.semrush.com/bot.html
Asks for robots.txt, but fails to observe it!
SEOkicks-Robot
UA string: Mozilla/5.0 (compatible; SEOkicks-Robot; +http://www.seokicks.de/robot.html)
Website: www.seokicks.de/robot.html
Doesn't ask for robots.txt, so can't possibly observe it!
serpstatbot/1.0
Example UA string: serpstatbot/1.0 (advanced backlink tracking bot; curl/7.58.0; http://serpstatbot.com/; abuse@serpstatbot.com)
Website: serpstatbot.com
Doesn't ask for robots.txt, so can't possibly observe it!
serpstatbot/2.1
Example UA string: serpstatbot/2.1 (advanced backlink tracking bot; https://serpstatbot.com/; abuse@serpstatbot.com)
Website: serpstatbot.com
Asks for robots.txt, but fails to observe it!
SERPtimizerBot
Example UA string: Mozilla/5.0 (compatible; SERPtimizerBot; +http://serptimizer.com/serptimizer-bot)
Website: serptimizer.com/serptimizer-bot
Doesn't ask for robots.txt, so can't possibly observe it!
Slack-ImgProxy
UA string: Slack-ImgProxy (+https://api.slack.com/robots)
Website: api.slack.com/robots
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
Slackbot 1.0
UA string: Slackbot 1.0 (+https://api.slack.com/robots)
Website: api.slack.com/robots
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
Slackbot-LinkExpanding 1.0
UA string: Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)
Website: api.slack.com/robots
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
startmebot/1.0
UA string: Mozilla/5.0 (compatible; startmebot/1.0; +https://start.me/bot)
Website: start.me/bot
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
SurdotlyBot/1.0
UA string: Mozilla/5.0 (compatible; SurdotlyBot/1.0; +http://sur.ly/bot.html
Website: sur.ly/bot.html
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
TelegramBot
UA string: TelegramBot (like TwitterBot)
Website: None.
Doesn't ask for robots.txt, so can't possibly observe it! (So, not like Twitterbot/1.0!)
Unwanted: until it provides a valid website address!
TprAdsTxtCrawler/1.0
UA string: TprAdsTxtCrawler/1.0
Website: None
Started appearing in my server logs on 19.04.2020, with a HEAD request for 'ads.txt'.
Doesn't ask for robots.txt, so can't possibly observe it!
Unwanted: until it provides a source of information about itself.
TrendsmapResolver/0.1
UA string: Mozilla/5.0 (compatible; TrendsmapResolver/0.1)
Website: None.
Doesn't ask for robots.txt, so can't possibly observe it!
TweetmemeBot/4.0
UA string: Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0
Website: datasift.com/bot.html
Explicitly doesn't ask for robots.txt, so can't possibly observe it!
unfurlist
UA string: unfurlist (https://github.com/Doist/unfurlist)
Website: github.com/Doist/unfurlist
Doesn't ask for robots.txt, so can't possibly observe it!
url
Example UA string: url
Website: None - but see docs.scrapy.org/en/latest/topics/practices.html#bans
Appears to be an alias for Scrapy/2.6.1 (it comes from the same IP address).
Doesn't ask for robots.txt, so can't possibly observe it!
Wappalyzer
Example UA string: Mozilla/5.0 (compatible; Wappalyzer)
Website: www.wappalyzer.com
Doesn't ask for robots.txt, so can't possibly observe it!
webprosbot/2.0
Example UA string: webprosbot/2.0 (+mailto:abuse-6337@webpros.com)
Website: webpros.com
Doesn't ask for robots.txt, so can't possibly observe it!
Xovibot/2.0
UA string: Mozilla/5.0 (compatible; XoviBot/2.0; +http://www.xovibot.net/
Website: www.xovibot.net/
FFS! Website is insecure, served over http!
Doesn't ask for robots.txt, so can't possibly observe it!
yacybot
Example UA string: yacybot (/global; amd64 Linux 6.1.8; java 11.0.17; America/en) http://yacy.net/bot.html
Website: yacy.net/bot.html
Asks for robots.txt, but fails to observe it!
Yahoo! Slurp
UA string: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Website: help.yahoo.com/help/us/ysearch/slurp
Doesn't ask for robots.txt, so can't possibly observe it!
YandexMobileBot/3.0
UA string: Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots)
Website: yandex.com/bots
Doesn't ask for robots.txt, so can't possibly observe it!
ZaldomoSearchBot
Example UA string: ZaldomoSearchBot(www.zaldamo.com/search.html)
Website: www.zaldamo.com/search.html
Doesn't ask for robots.txt, so can't possibly observe it!