AI crawlers need to be more respectful


We discuss the AI crawler abuse we are seeing at Read the Docs and warn that this behavior is not sustainable.

Bots repeatedly download large files hundreds of times a day, partly because of bugs in their crawlers, with traffic coming from many IP addresses and no rate or bandwidth limiting. Because our Community site only hosts open source projects, AWS and Cloudflare give us sponsored plans, but we have only a limited number of credits each year. And because many of these files are large and rarely downloaded, their cache entries have usually expired, so the requests hit our origin servers directly and cause substantial bandwidth charges.

Related news:

TechCrunch Minute: Reddit is taking a stand against AI crawlers

Reddit’s upcoming changes attempt to safeguard the platform against AI crawlers

Medium hints at a nascent media coalition to block AI crawlers