Court filings show Meta staffers discussed using copyrighted content for AI training

Court filings suggest that Meta staffers discussed using copyrighted materials to train Meta's AI models, including models in its Llama family.

For years, Meta employees have internally discussed using copyrighted works obtained through legally questionable means to train the company’s AI models, according to court documents unsealed on Thursday. “my opinion would be (in the line of ‘ask forgiveness, not for permission’): we try to acquire the books and escalate it to execs so they make the call,” wrote Xavier Martinet, a Meta research engineer, in a chat dated February 2023, according to the filings. Theakanath also outlined “mitigations” in the email intended to help reduce Meta’s legal exposure, including removing data from Libgen “clearly marked as pirated/stolen” and also simply not publicly citing usage.

Read this on TechCrunch