Are AI Web Crawlers 'Destroying Websites' In Their Hunt for Training Data?


"AI web crawlers are strip-mining the web in their perpetual hunt for ever more content to feed into their Large Language Model mills," argues Steven J. Vaughan-Nichols at the Register. And "when AI searchbots, with Meta (52% of AI searchbot traffic), Google (23%), and OpenAI (20%) leading the way...

According to Cloudflare, a major content delivery network (CDN) provider, 30% of global web traffic now comes from bots. AI crawlers hammer websites with traffic spikes that can reach ten or even twenty times normal levels within minutes. As the web hosting company InMotion Hosting notes, they also tend to disregard crawl delays and bandwidth-saving guidelines, extract full page text, and sometimes attempt to follow dynamic links or scripts.
