
Scientists once hoarded pre-nuclear steel; now we’re hoarding pre-AI content


Newly announced catalog collects pre-2022 sources untouched by ChatGPT and AI contamination.

"The idea is to point to sources of text, images and video that were created prior to the explosion of AI-generated content," Graham-Cumming wrote on his blog last week.

One casualty of that contamination was wordfreq, a Python library created by researcher Robyn Speer that tracked word frequencies across more than 40 languages by analyzing millions of sources, including Wikipedia, movie subtitles, news articles, and social media.

In 2020, I proposed creating a so-called "cryptographic ark": a timestamped archive of pre-AI media that future historians could verify as authentic, collected before my then-arbitrary cutoff date of January 1, 2022.

Read the full article on Ars Technica
