OpenAI Furious DeepSeek Might Have Stolen All the Data OpenAI Stole from Us
OpenAI shocked that an AI company would train on someone else's data without permission or compensation.
In its motion to dismiss, OpenAI wrote that “it has long been clear that the non-consumptive use of copyrighted material (like large language model training) is protected by fair use.” Part of OpenAI’s argument in the New York Times case is also that the only way to make a generalist large language model that performs well is by sucking up gigantic amounts of data. As Sacks mentioned, “distillation” is an established principle in artificial intelligence research, and it is used all the time to refine and improve the accuracy of smaller language models.