Cloudflare Just Called Out Perplexity AI for “Stealth Crawling” Tens of Thousands of Sites


Cloudflare says AI search engine Perplexity is evading no-crawl rules with millions of requests a day. Here’s what we know.


Imagine posting a “no entry” sign on your front door and watching someone sneak in through the back anyway. That’s pretty much what Cloudflare says Perplexity AI has been doing to tens of thousands of websites, and it’s stirred up quite the storm in the tech world.

What’s going on?

On Monday, Cloudflare, the popular network security and optimization company, released a blog post accusing Perplexity AI of using “stealth bots” to sneak past website protections.

Here’s the short version: website owners often use a file called robots.txt to tell bots and crawlers which parts of their site they don’t want crawled or scraped. This convention, the Robots Exclusion Protocol, has been around since 1994 and became an official internet standard (RFC 9309) in 2022. Think of it as a “please knock first” sign: it doesn’t lock anything, so it only works when crawlers choose to honor it.
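To make that concrete, here’s a minimal sketch of how a well-behaved crawler might check robots.txt before fetching a page, using Python’s standard urllib.robotparser module. The robots.txt content, the ExampleBot and SomeOtherBot names, and the example.com URLs are all made up for illustration; none of them come from Cloudflare’s post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler everywhere,
# and block every other crawler from /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "SomeOtherBot"):
        for path in ("/articles/1", "/private/notes"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

Notice that the check runs on the crawler’s side. Nothing in robots.txt stops a bot that simply skips it.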

But Cloudflare says Perplexity isn’t playing by those rules.


How Perplexity allegedly avoids the blocklist

Cloudflare says it started getting complaints from customers who had already blocked Perplexity’s known bots, both through their robots.txt files and through web application firewall rules.

Yet, traffic from Perplexity kept coming.

So Cloudflare decided to dig in. Its researchers observed that when Perplexity’s declared bot was blocked, Perplexity would switch to undeclared, or “stealth,” crawlers. These crawlers:

  • Used multiple IP addresses that aren’t listed in Perplexity’s official IP range
  • Switched IPs regularly to avoid detection
  • Came from different Autonomous System Numbers (ASNs) to mask their origin
  • Ignored robots.txt rules
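The first item on that list can be sketched in a few lines: comparing a request’s source IP against the ranges a crawler operator publishes. This is only an illustration, not Cloudflare’s actual detection method. The ranges below are reserved documentation addresses rather than real crawler IPs, and ExampleBot and the classify_request helper are hypothetical.

```python
import ipaddress

# Hypothetical "published" crawler ranges (RFC 5737 documentation addresses).
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def in_declared_ranges(ip: str) -> bool:
    """Return True if ip falls inside any published crawler range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in DECLARED_RANGES)


def classify_request(user_agent: str, ip: str, declared_agent: str = "ExampleBot") -> str:
    """Label a request by what it claims to be and where it comes from."""
    claims_bot = declared_agent.lower() in user_agent.lower()
    if claims_bot and in_declared_ranges(ip):
        return "declared crawler"
    if claims_bot:
        return "claims to be the crawler, but from an unlisted IP"
    return "no declared crawler identity"


if __name__ == "__main__":
    print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
    print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "203.0.113.5"))
    print(classify_request("Mozilla/5.0 (Macintosh) Chrome/124.0", "203.0.113.5"))
```

A stealth crawler of the kind Cloudflare describes would land in the last bucket: no declared identity and an unlisted address. That is why a simple check like this can’t catch it on its own.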

And the scale? Massive. Cloudflare says it saw this pattern across tens of thousands of domains, adding up to millions of requests a day.

Why this matters

If these claims hold water, this isn’t just a small oversight. It undermines a piece of web etiquette that’s been respected for about 30 years. The Robots Exclusion Protocol is a cornerstone of how the web works: a quiet contract between website owners and the services that collect online data.

Violating this isn’t just rude. It’s unfair to publishers and creators who explicitly opt out of having their content copied or indexed.

This isn’t the first time Perplexity’s been in hot water

Cloudflare isn’t alone in raising a red flag.


Publishers have previously accused Perplexity of shady behavior:

  • Forbes accused Perplexity of publishing content nearly identical to a proprietary Forbes article a day after it was released, calling it “cynical theft.”
  • Wired reported suspicious traffic patterns from bots that seemed to belong to Perplexity and were ignoring no-crawl rules.
  • Both outlets said Perplexity manipulated bot ID strings to bypass access blocks.

In response to the stealth crawling it observed, Cloudflare has officially de-listed Perplexity as a verified bot and added detection rules to block its stealth crawlers outright.

What does this mean moving forward?

Cloudflare put it simply: crawlers should be transparent, purposeful, and respectful—and Perplexity’s alleged behavior doesn’t cut it.

No response yet from Perplexity’s team. We’ll be watching.

But here’s the big idea: transparency still matters on the internet. And if AI companies expect to build tools using the web’s content, they need to play by the web’s rules.

No sneaking around.


🗝️ Keywords: Perplexity AI, Cloudflare, stealth bots, no-crawl, robots.txt, web scraping, AI ethics, digital content, verified bot, crawler access

📌 Want more updates like this? Follow Yugto.io for real talk on tech, data, and the people shaping the digital world.

