OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole From Us


OpenAI shocked that an AI company would train on someone else's data without permission or compensation.

In its motion to dismiss, OpenAI wrote that “it has long been clear that the non-consumptive use of copyrighted material (like large language model training) is protected by fair use.” Part of OpenAI’s argument in the New York Times case is also that the only way to make a generalist large language model that performs well is by sucking up gigantic amounts of data. As Sacks mentioned, “distillation” is an established principle in artificial intelligence research, and it is done all the time to refine and improve the accuracy of smaller large language models.

Related news:

OpenAI suddenly thinks intellectual property theft is not cool, actually, amid DeepSeek’s rise

Chinese firms ‘distilling’ US AI models to create rival products, warns OpenAI