Reddit blocks the Internet Archive from crawling its data - here's why


The social media platform is cracking down on backdoor data harvesting.

Reddit, known as a resource where users can post anonymously and find information about virtually any subject, will block the Internet Archive's Wayback Machine from indexing its online data, according to a Monday report from The Verge. The archive is maintained in part by the Wayback Machine, a piece of web-crawling software that gathers web pages and preserves them as they appeared when they were collected, like digital flies in amber. Many AI companies have scraped training data from publicly available websites, including social media sites and news outlets, claiming legal immunity under a concept known in copyright law as fair use.

Read this on ZDNet

Read more on: Reddit, data, Internet Archive

Related news:

The Dead Need Right To Delete Their Data So They Can't Be AI-ified, Lawyer Says

Reddit Will Block the Internet Archive

Reddit is restricting its availability to the Internet Archive's Wayback Machine