adscanner

UA string: Mozilla/5.0 (compatible; adscanner/)/1.0 (http://seocompany.store; spider@seocompany.store)

Website: seocompany.store

FFS! Website is insecure, served over http! It serves no visible content. It appears to be connected to GoDaddy, the domain registrar; it tries to load JavaScript from godaddy.com.

AlphaBot/3.2

UA string: Mozilla/5.0 (compatible; AlphaBot/3.2; +http://alphaseobot.com/bot.html)

Website: alphaseobot.com

Doesn't ask for robots.txt, so can't possibly observe it!
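
NB: a claim like this is easy to verify from an access log: group requests by user-agent and see whether any of them ever asked for /robots.txt. A rough sketch in Python of that kind of check (assuming a standard Apache/nginx 'combined' log format; the file name and regex are illustrative, not the actual script used here):

    # Rough sketch: which user-agents ever asked for /robots.txt?
    # Assumes an Apache/nginx 'combined' log format; the file name is illustrative.
    import re
    from collections import defaultdict

    LOG_LINE = re.compile(
        r'"(?:GET|HEAD|POST) (?P<path>\S+) HTTP/[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
    )

    totals = defaultdict(int)       # requests per user-agent
    robots_hits = defaultdict(int)  # /robots.txt requests per user-agent

    with open("access.log", encoding="utf-8", errors="replace") as log:
        for line in log:
            m = LOG_LINE.search(line)
            if not m:
                continue
            ua = m.group("ua")
            totals[ua] += 1
            if m.group("path").split("?")[0] == "/robots.txt":
                robots_hits[ua] += 1

    for ua, n in sorted(totals.items(), key=lambda kv: -kv[1]):
        if robots_hits[ua] == 0:
            print(f"never asked for robots.txt: {ua!r} ({n} requests)")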

Applebot/0.1

Example UA string: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)

Website: www.apple.com/go/applebot

Asks for robots.txt, and appears to observe it. However, it's another bot that apparently thinks it can abuse the robots.txt protocol by claiming the right to piggyback on rules for Googlebot!
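
NB: the robots.txt protocol scopes each group of rules to the user-agent token(s) that group names; a crawler is supposed to follow its own group, or failing that the '*' group, not borrow another crawler's. A minimal sketch with Python's standard urllib.robotparser (the rules are made up for illustration) of why "piggybacking" on Googlebot's group isn't how matching works:

    # Minimal sketch of how robots.txt groups are scoped per user-agent.
    # The rules below are made up for illustration.
    from urllib.robotparser import RobotFileParser

    rules = [
        "User-agent: Googlebot",
        "Disallow: /private/",
        "",
        "User-agent: *",
        "Disallow: /",
    ]

    rp = RobotFileParser()
    rp.parse(rules)

    # Googlebot matches its own group: only /private/ is off limits.
    print(rp.can_fetch("Googlebot", "https://example.com/page.html"))    # True
    # Every other bot falls through to the '*' group, which disallows everything.
    # "Piggybacking" on the Googlebot group means ignoring exactly this.
    print(rp.can_fetch("Applebot/0.1", "https://example.com/page.html")) # False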

AwarioSmartBot/1.0

UA string: AwarioSmartBot/1.0 (+https://awario.com/bots.html; bots@awario.com)

Website: awario.com/bots.html

Doesn't ask for robots.txt, so can't possibly observe it!

Baiduspider-render/2.0

UA string: Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)

Website: www.baidu.com/search/spider.html

Specified website does not appear to exist.

Doesn't ask for robots.txt, so can't possibly observe it!

Baiduspider/2.0

UA string: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

Website: www.baidu.com/search/spider.html

Specified website does not appear to exist.

Doesn't ask for robots.txt, so can't possibly observe it!

BingPreview/1.0b

UA string: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b

Website: www.bing.com/webmasters/help/which-crawlers-does-bing-use-8c184ec0

Doesn't ask for robots.txt, so can't possibly observe it!

BLEXBot/1.0

UA string: Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)

Website: webmeup-crawler.com

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

BorneoBot/0.7.1

UA string: BorneoBot/0.7.1 (crawlcheck123@gmail.com)

Website: None

Asks for robots.txt, but fails to observe it!

coccocbot-image/1.0

UA string: Mozilla/5.0 (compatible; coccocbot-image/1.0; +http://help.coccoc.com/searchengine)

Website: http://help.coccoc.com/searchengine

Asks for robots.txt, but fails to observe it!

Cocolyzebot/1.0

UA string: Mozilla/5.0 (compatible; Cocolyzebot/1.0; https://cocolyze.com/bot)

Website: cocolyze.com/bot

Doesn't ask for robots.txt, so can't possibly observe it!

com.tinyspeck.chatlyio

UA string: com.tinyspeck.chatlyio/20.04.20 (iPhone; iOS 13.4.1; Scale/3.00)

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

DataForSeoBot/1.0

Example UA string: Mozilla/5.0 (compatible; DataForSeoBot/1.0; +https://dataforseo.com/dataforseo-bot)

Website: dataforseo.com/dataforseo-bot

Doesn't ask for robots.txt, so can't possibly observe it!

DuckDuckBot-Https/1.1

UA string: Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)

Website: duckduckgo.com/duckduckbot

Doesn't ask for robots.txt, so can't possibly observe it!

e.ventures Investment Crawler

UA string: e.ventures Investment Crawler (eventures.vc)

Website: None.

Doesn't ask for robots.txt, so can't possibly observe it!

Expanse

UA string: Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com

Website assumed to be: expanse.co

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: I'm not one of their customers, so why are they all over my websites like a rash?

Unwanted: until it provides an explicit website address!

GarlikCrawler/1.2

UA string: GarlikCrawler/1.2 (http://garlik.com/, crawler@garlik.com)

Website: garlik.com

Doesn't ask for robots.txt, so can't possibly observe it!

Gather Analyze Provide

UA string: https://gdnplus.com:Gather Analyze Provide.

Website: gdnplus.com

Doesn't ask for robots.txt, so can't possibly observe it!

Gluten Free Crawler/1.0

UA string: Mozilla/5.0 (compatible; Gluten Free Crawler/1.0; +http://glutenfreepleasure.com/)

Website: glutenfreepleasure.com

Doesn't ask for robots.txt, so can't possibly observe it!

Googlebot-Image/1.0

UA string: Googlebot-Image/1.0

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

HealthCheckBot/0.2

UA string: HealthCheckBot/0.2

Website: None specified, though a search reveals pypi.org/project/healthcheckbot as probably relevant.

Doesn't ask for robots.txt, so can't possibly observe it!

It seems to have no business crawling third-party sites.

HTTP Banner Detection

UA string: HTTP Banner Detection (https://security.ipip.net)

Website: security.ipip.net

Website address appears invalid.

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: until it provides a valid website address!

hypestat/1.0

UA string: Mozilla/5.0 (compatible; hypestat/1.0; +https://hypestat.com/bot)

Website: hypestat.com/bot

Doesn't ask for robots.txt, so can't possibly observe it!

Iframely/1.3.1

Example UA string: Iframely/1.3.1 (+https://iframely.com/docs/about)

Website: iframely.com/docs/about

Doesn't ask for robots.txt, so can't possibly observe it!

NB: In effect it's a bot, but uses weasel words to try to claim it's not!

Internet-structure-research-project-bot

UA string: Internet-structure-research-project-bot

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: until it provides more information about its purpose.

iodc; odysseus

UA string: Mozilla/5.0 (iodc; odysseus 3352-131-011119113358-349; +https://iodc.co.uk)

Website: iodc.co.uk

Doesn't ask for robots.txt, so can't possibly observe it!

IonCrawl

Example UA string: IonCrawl (https://www.ionos.de/terms-gtc/faq-crawler-en/)

Website: www.ionos.de/terms-gtc/faq-crawler-en

Doesn't ask for robots.txt, so can't possibly observe it!

ips-agent

UA string: Mozilla/5.0 (compatible; ips-agent)

Website: None

Apparently used by Verisign (which operates the .com and .net registries) to assess traffic on domains it knows are about to expire, using this data to help sell potentially valuable 'busy' domains to bulk buyers at other registrars.

Asks for robots.txt, but fails to observe it!

Keybot

Example UA string: Keybot Translation-Search-Machine

Website: (Not declared) keybot.tools

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: A tool for cheats and copyright bypassers/stealers.

LightspeedSystemsCrawler

UA string: LightspeedSystemsCrawler Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

linkdexbot/2.0

UA string: Mozilla/5.0 (compatible; linkdexbot/2.0; +http://www.linkdex.com/bots/)

Website: www.linkdex.com/bots/

Doesn't ask for robots.txt, so can't possibly observe it!

linkdexbot/2.2

UA string: Mozilla/5.0 (compatible; linkdexbot/2.2; +http://www.linkdex.com/bots/)

Website: www.linkdex.com/bots

Asks for robots.txt, but fails to observe it!

LivelapBot/0.2

Example UA string: LivelapBot/0.2 (http://site.livelap.com/crawler)

Website: site.livelap.com/crawler

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

ltx71

UA string: ltx71 - (http://ltx71.com/)

Website: ltx71.com

FFS! Website is insecure, served over http, despite the claim that its purpose is "security research"!

Asks for robots.txt, but fails to observe it!

Unwanted: until the website is more informative.

Mail.RU_Bot/Robots/2.0

UA string: Mozilla/5.0 (compatible; Linux x86_64; Mail.RU_Bot/Robots/2.0; +http://go.mail.ru/help/robots)

Website: go.mail.ru/help/robots

Asks for robots.txt, but fails to observe it!

masscan 1.0

UA string: masscan 1.0 (http:www)

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: until it provides a valid website address!

masscan-ng/1.3

Example UA string: masscan-ng/1.3 (https://github.com/bi-zone/masscan-ng)

Website: github.com/bi-zone/masscan-ng

Website address appears invalid.

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: until it provides a valid website address!

masscan/1.0

UA string: masscan/1.0 (https://github.com/robertdavidgraham/masscan)

Website: github.com/robertdavidgraham/masscan

Doesn't ask for robots.txt, so can't possibly observe it!

Promiscuous port scanner, more likely used for harm than good.

MegaIndex.ru/2.0

UA string: Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)

Website: megaindex.com/crawler

Doesn't ask for robots.txt, so can't possibly observe it!

msnbot/2.0b

UA string: msnbot/2.0b (+http://search.msn.com/msnbot.htm)

Website: search.msn.com/msnbot.htm

Doesn't ask for robots.txt, so can't possibly observe it!

Neevabot/1.0

UA string: Mozilla/5.0 (compatible; Neevabot/1.0; +https://neeva.com/neevabot)

Website: neeva.com/neevabot

FFS! Tries to piggyback on rules for Googlebot! That's a whole new level of abuse of the robots.txt protocol!

NetSystemsResearch

UA string: NetSystemsResearch studies the availability of various services across the internet. Our website is netsystemsresearch.com

Website: netsystemsresearch.com

Doesn't ask for robots.txt, so can't possibly observe it!

Nimbostratus-Bot/v1.3.2

UA string: Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)

Website: cloudsystemnetworks.com

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

Nmap Scripting Engine

UA string: Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)

Website: nmap.org/book/nse.html

Doesn't ask for robots.txt, so can't possibly observe it!

nsrbot/1.0

UA string: Mozilla/5.0 (compatible; nsrbot/1.0; +http://netsystemsresearch.com)

Website: netsystemsresearch.com

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

oBot/2.3.1

UA string: Mozilla/5.0 (compatible; oBot/2.3.1; http://filterdb.iss.net/crawler/)

Website: filterdb.iss.net/crawler

FFS! Website is insecure, served over http!

Asks for robots.txt, but fails to observe it!

Plukkie/1.6

UA string: Mozilla/5.0 (compatible; Plukkie/1.6; http://www.botje.com/plukkie.htm)

Website: www.botje.com/plukkie.htm

FFS! Website is insecure, served over http!

Asks for robots.txt, but fails to observe it!

probethenet

UA string: www.probethenet.com scanner

Website: www.probethenet.com

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

python-requests/2.27.1 (aka Scrapy/2.6.1)

Example UA string: python-requests/2.27.1

Website: None - but see docs.scrapy.org/en/latest/topics/practices.html#bans

Appears to be an alias for Scrapy/2.6.1 (it comes from the same IP address).

Doesn't ask for robots.txt, so can't possibly observe it!

RepoLookoutBotBot/0.0.1

Example UA string: RepoLookoutBotBot/0.0.1 (abuse reports to abuse@crissyfield.de)

Website: None

Doesn't ask for robots.txt, so can't possibly observe it!

Riddler

UA string: Riddler (http://riddler.io/about)

Website: riddler.io/about

Asks for robots.txt, but fails to observe it!

SafeDNSBot

UA string: SafeDNSBot (https://www.safedns.com/searchbot)

Website: www.safedns.com/searchbot

Asks for robots.txt, but fails to observe it!

Scrapy/2.6.1

Example UA string: Scrapy/2.6.1 (+https://scrapy.org)

Website: scrapy.org

Doesn't ask for robots.txt, so can't possibly observe it!
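
NB: Scrapy itself is perfectly capable of honouring robots.txt; whether it does is down to the operator. In a standard Scrapy project it is a single setting. A sketch of a polite configuration (not what this particular operator is running; the user-agent URL is illustrative):

    # settings.py of a Scrapy project -- a sketch of a polite configuration.
    # Scrapy's RobotsTxtMiddleware fetches and honours robots.txt when this is enabled.
    ROBOTSTXT_OBEY = True

    # Identify the bot honestly and point at a page describing it
    # (the URL here is illustrative).
    USER_AGENT = "ExampleBot/1.0 (+https://example.com/bot.html)"

    # Throttle the request rate.
    DOWNLOAD_DELAY = 1.0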

Screaming Frog

UA string: Screaming Frog SEO Spider/14.1

Website: None.

Doesn't ask for robots.txt, so can't possibly observe it!

SeekportBot

UA string: Mozilla/5.0 (compatible; SeekportBot; +https://bot.seekport.com)

Website: bot.seekport.com

Asks for robots.txt, but fails to observe it!

Semanticbot/1.0

UA string: Mozilla/5.0 (compatible; Semanticbot/1.0; +http://sempi.tech/bot.html)

Website: sempi.tech/bot.html

FFS! Website address simply leads to a page requesting contact via email.

Asks for robots.txt, but fails to observe it!

Unwanted: until it provides a valid website address.

SemanticScholarBot

UA string: Mozilla/5.0 (compatible) SemanticScholarBot (+https://www.semanticscholar.org/crawler)

Website: www.semanticscholar.org/crawler

Doesn't ask for robots.txt, so can't possibly observe it!

SEMrushBot

UA string: SEMrushBot

Website: None.

Doesn't ask for robots.txt, so can't possibly observe it!

SemrushBot-BA

UA string: Mozilla/5.0 (compatible; SemrushBot-BA; +http://www.semrush.com/bot.html)

Website: www.semrush.com/bot.html

Doesn't ask for robots.txt, so can't possibly observe it!

SemrushBot-BM/1.0

Example UA string: Mozilla/5.0 (compatible; SemrushBot-BM/1.0; +http://www.semrush.com/bot.html)

Website: www.semrush.com/bot.html

Doesn't ask for robots.txt, so can't possibly observe it!

SemrushBot/1.0~bm

UA string: Mozilla/5.0 (compatible; SemrushBot/1.0~bm; +http://www.semrush.com/bot.html)

Website: www.semrush.com/bot.html

Doesn't ask for robots.txt, so can't possibly observe it!

SemrushBot/7~bl

UA string: Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)

Website: www.semrush.com/bot.html

Asks for robots.txt, but fails to observe it!

SEOkicks-Robot

UA string: Mozilla/5.0 (compatible; SEOkicks-Robot; +http://www.seokicks.de/robot.html)

Website: www.seokicks.de/robot.html

Doesn't ask for robots.txt, so can't possibly observe it!

serpstatbot/1.0

Example UA string: serpstatbot/1.0 (advanced backlink tracking bot; curl/7.58.0; http://serpstatbot.com/; abuse@serpstatbot.com)

Website: serpstatbot.com

Doesn't ask for robots.txt, so can't possibly observe it!

serpstatbot/2.1

Example UA string: serpstatbot/2.1 (advanced backlink tracking bot; https://serpstatbot.com/; abuse@serpstatbot.com)

Website: serpstatbot.com

Asks for robots.txt, but fails to observe it!

SERPtimizerBot

Example UA string: Mozilla/5.0 (compatible; SERPtimizerBot; +http://serptimizer.com/serptimizer-bot)

Website: serptimizer.com/serptimizer-bot

Doesn't ask for robots.txt, so can't possibly observe it!

Slack-ImgProxy

UA string: Slack-ImgProxy (+https://api.slack.com/robots)

Website: api.slack.com/robots

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

Slackbot 1.0

UA string: Slackbot 1.0 (+https://api.slack.com/robots)

Website: api.slack.com/robots

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

Slackbot-LinkExpanding 1.0

UA string: Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)

Website: api.slack.com/robots

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

startmebot/1.0

UA string: Mozilla/5.0 (compatible; startmebot/1.0; +https://start.me/bot)

Website: start.me/bot

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

SurdotlyBot/1.0

UA string: Mozilla/5.0 (compatible; SurdotlyBot/1.0; +http://sur.ly/bot.html

Website: sur.ly/bot.html

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

TelegramBot

UA string: TelegramBot (like TwitterBot)

Website: None.

Doesn't ask for robots.txt, so can't possibly observe it! (So, not like Twitterbot/1.0!)

Unwanted: until it provides a valid website address!

TprAdsTxtCrawler/1.0

UA string: TprAdsTxtCrawler/1.0

Website: None

Started appearing in my server logs on 19.04.2020, with a HEAD request for 'ads.txt'.

Doesn't ask for robots.txt, so can't possibly observe it!

Unwanted: until it provides a source of information about itself.

TrendsmapResolver/0.1

UA string: Mozilla/5.0 (compatible; TrendsmapResolver/0.1)

Website: None.

Doesn't ask for robots.txt, so can't possibly observe it!

TweetmemeBot/4.0

UA string: Mozilla/5.0 (TweetmemeBot/4.0; +http://datasift.com/bot.html) Gecko/20100101 Firefox/31.0

Website: datasift.com/bot.html

Explicitly doesn't ask for robots.txt, so can't possibly observe it!

unfurlist

UA string: unfurlist (https://github.com/Doist/unfurlist)

Website: github.com/Doist/unfurlist

Doesn't ask for robots.txt, so can't possibly observe it!

url

Example UA string: url

Website: None - but see docs.scrapy.org/en/latest/topics/practices.html#bans

Appears to be an alias for Scrapy/2.6.1 (it comes from the same IP address).

Doesn't ask for robots.txt, so can't possibly observe it!

Wappalyzer

Example UA string: Mozilla/5.0 (compatible; Wappalyzer)

Website: www.wappalyzer.com

Doesn't ask for robots.txt, so can't possibly observe it!

webprosbot/2.0

Example UA string: webprosbot/2.0 (+mailto:abuse-6337@webpros.com)

Website: webpros.com

Doesn't ask for robots.txt, so can't possibly observe it!

Xovibot/2.0

UA string: Mozilla/5.0 (compatible; XoviBot/2.0; +http://www.xovibot.net/

Website: www.xovibot.net/

FFS! Website is insecure, served over http!

Doesn't ask for robots.txt, so can't possibly observe it!

yacybot

Example UA string: yacybot (/global; amd64 Linux 6.1.8; java 11.0.17; America/en) http://yacy.net/bot.html

Website: yacy.net/bot.html

Asks for robots.txt, but fails to observe it!

Yahoo! Slurp

UA string: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

Website: help.yahoo.com/help/us/ysearch/slurp

Doesn't ask for robots.txt, so can't possibly observe it!

YandexMobileBot/3.0

UA string: Mozilla/5.0 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B411 Safari/600.1.4 (compatible; YandexMobileBot/3.0; +http://yandex.com/bots)

Website: yandex.com/bots

Doesn't ask for robots.txt, so can't possibly observe it!

ZaldomoSearchBot

Example UA string: ZaldomoSearchBot(www.zaldamo.com/search.html)

Website: www.zaldamo.com/search.html

Doesn't ask for robots.txt, so can't possibly observe it!