Meta pirated books to train its AI


Meta pirated millions of books to train its AI. Search through them here.

Meta’s use of pirated books, along with other information outlined and quoted here, recently became a matter of public record when some of the company’s internal communications were unsealed as part of a copyright-infringement lawsuit brought against it by Sarah Silverman, Junot Díaz, and other authors of books in LibGen. A Llama-team senior manager suggested fine-tuning Llama to “refuse to answer queries like: ‘reproduce the first three pages of “Harry Potter and the Sorcerer’s Stone.”’” One employee remarked that “torrenting from a corporate laptop doesn’t feel right.” Other works in LibGen include recent literature and nonfiction by prominent authors such as Sally Rooney, Percival Everett, Hua Hsu, Jonathan Haidt, and Rachel Khong, and articles from top academic journals such as Nature, Science, and The Lancet.
