Get the latest tech news
Perplexity is Using Stealth, Undeclared Crawlers To Evade Website No-Crawl Directives, Cloudflare Says
AI startup Perplexity is deploying undeclared web crawlers that masquerade as regular Chrome browsers to access content from websites that have explicitly blocked its official bots, according to a Cloudflare report published Monday. When Perplexity's declared crawlers encounter robots.txt restrictio...
AI startup Perplexity is deploying undeclared web crawlers that masquerade as regular Chrome browsers to access content from websites that have explicitly blocked its official bots, according to a Cloudflare report published Monday. When Perplexity's declared crawlers encounter robots.txt restrictions or network blocks, the company switches to a generic Mozilla user agent that impersonates "Chrome/124.0.0.0 Safari/537.36" running on macOS, the web infrastructure firm reported.Cloudflare engineers tested the behavior by creating new domains with robots.txt files prohibiting all automated access. Despite the restrictions, Perplexity provided detailed information about the protected content when queried, while the stealth crawler generated 3-6 million daily requests across tens of thousands of domains.
Or read this on Slashdot