Unlocking a Million Times More Data for AI

How a new ARPANET-style program could solve the data accessibility problem

Unlike web-scraped data, these organizational datasets are continuously validated for accuracy because the organizations that hold them depend on them. That dependence creates natural quality controls, making even a small fraction of this massive pool extraordinarily valuable for specialized AI applications.

These privacy-enhancing technologies (PETs) work in concert to create structured transparency: training data, model weights, and user queries are never exposed in unencrypted form to unauthorized parties, while cryptographic attestation maintains the attribution chains.

Just as IBM, Bell Telephone, and Microsoft didn't necessarily have the right incentives to bring the nation's supercomputers together under the banner of TCP/IP, the WWW, and HTTP, today's AI titans are naturally focused on their individual competitive advantages rather than on building shared infrastructure.
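
That attestation-plus-encryption guarantee is concrete enough to sketch. Below is a minimal Python illustration of one link in such a pipeline: a provider encrypts a record so only authorized parties can read it, then signs an attestation over the ciphertext so the attribution chain can be verified without ever exposing the plaintext. Every name here is hypothetical, and the off-the-shelf `cryptography` package stands in for the heavier machinery (secure enclaves, homomorphic encryption, and the like) that real PET deployments combine.

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# --- Provider side: encrypt, then attest to provenance ---------------------
record = b"proprietary, continuously validated dataset record"

data_key = Fernet.generate_key()               # shared only with authorized parties
ciphertext = Fernet(data_key).encrypt(record)  # plaintext never travels unencrypted

provider_key = Ed25519PrivateKey.generate()
attestation = json.dumps({
    "provider": "example-provider-7",          # hypothetical provenance metadata
    "sha256": hashlib.sha256(ciphertext).hexdigest(),
}).encode()
signature = provider_key.sign(attestation)     # attribution chain link

# --- Consumer side: verify attribution before ever decrypting --------------
public_key = provider_key.public_key()         # distributed out of band
try:
    public_key.verify(signature, attestation)  # raises if the chain is broken
    claimed = json.loads(attestation)["sha256"]
    assert claimed == hashlib.sha256(ciphertext).hexdigest(), "ciphertext tampered"
    plaintext = Fernet(data_key).decrypt(ciphertext)
    print("attested record:", plaintext.decode())
except InvalidSignature:
    print("attribution chain broken; refusing to use this data")
```

In practice the symmetric key would itself be released only to attested, authorized environments; the excerpt doesn't specify that mechanism, so the sketch simply assumes the key is shared out of band.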
