Unlocking a Million Times More Data for AI
How a new ARPANET-style program could solve the data accessibility problem
Unlike web-scraped data, these datasets are continuously validated for accuracy because the organizations that hold them depend on them, creating natural quality controls that make even a small fraction of this massive pool extraordinarily valuable for specialized AI applications.

These privacy-enhancing technologies (PETs) work in concert to create structured transparency: neither training data, model weights, nor user queries are ever exposed in unencrypted form to unauthorized parties, while cryptographic attestation maintains attribution chains.

Just as IBM, Bell Telephone, and Microsoft didn't necessarily have the right incentives to bring the nation's supercomputers together under the banner of TCP/IP, WWW, and HTTP, today's AI titans are naturally focused on their individual competitive advantages rather than on building shared infrastructure.
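To make the attestation idea concrete, here is a minimal sketch, not drawn from the program described in the article, of a hash-chained attribution record: each contribution is hashed, signed, and linked to the previous record, so provenance can be checked without exposing the underlying payload. The attest/verify helpers, the HMAC signing key, and the contributor names are hypothetical placeholders; a real PET stack would use asymmetric keys, trusted hardware, or similar machinery.

```python
# Hypothetical sketch of a hash-chained attestation record (not the article's design).
import hashlib
import hmac
import json

SIGNING_KEY = b"contributor-secret-key"  # placeholder; a real system would use asymmetric keys or a TEE

def attest(payload: bytes, contributor: str, prev_digest: str = "") -> dict:
    """Produce a record linking a data payload to its contributor and to the
    previous record, forming an attribution chain."""
    record = {
        "contributor": contributor,
        "payload_sha256": hashlib.sha256(payload).hexdigest(),
        "prev_record_sha256": prev_digest,
    }
    body = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    record["record_sha256"] = hashlib.sha256(body).hexdigest()
    return record

def verify(record: dict) -> bool:
    """Check the record's signature without needing the payload itself."""
    body = json.dumps(
        {k: record[k] for k in ("contributor", "payload_sha256", "prev_record_sha256")},
        sort_keys=True,
    ).encode()
    expected = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])

# Chain two contributions: the second record points at the first.
r1 = attest(b"hospital imaging batch 001", "hospital-a")
r2 = attest(b"derived training shard 001", "lab-b", prev_digest=r1["record_sha256"])
assert verify(r1) and verify(r2)
```

Running the snippet chains two contributions and verifies both records; the verifier only ever sees hashes and signatures, never the contributors' raw data, which is the property the attribution chain is meant to preserve.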