
Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs



“At this point, I highly suspect Meta bungled up something in the released weights … if not, they should lay off everyone who worked on this and then use money to acquire Nous,” commented @cto_junior on X, referring to an independent user test in which Llama 4 Maverick scored just 16% on aider polyglot, a benchmark that runs a model through 225 coding tasks.

On the r/LocalLlama subreddit, user Dr_Karminski likewise wrote that “I’m incredibly disappointed with Llama-4,” demonstrating its poor performance relative to DeepSeek’s non-reasoning V3 model on coding tasks such as simulating balls bouncing around a heptagon.

Llama 4 continues to spread to other inference providers with mixed results, but it’s safe to say the initial release of the model family has not been a slam dunk with the AI community.
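For context, the heptagon test circulated as an informal coding prompt: ask a model to simulate balls bouncing inside a regular seven-sided enclosure. The sketch below is a minimal, headless illustration of the physics such a prompt asks for; every constant (radius, ball count, time step, gravity) is an illustrative assumption, not a detail from the Reddit post or either model's output.

```python
# Minimal sketch of the "balls bouncing in a heptagon" task.
# Pure Python, no graphics; all parameters are illustrative assumptions.
import math
import random

SIDES = 7                                  # heptagon
R = 10.0                                   # circumradius of the enclosure
APOTHEM = R * math.cos(math.pi / SIDES)    # center-to-edge distance
BALL_R = 0.3                               # ball radius
DT = 0.01                                  # time step (seconds)
GRAVITY = -9.8                             # vertical acceleration

# Outward unit normal of each of the 7 edges (edge k spans vertices k and k+1).
NORMALS = [
    (math.cos(2 * math.pi * (k + 0.5) / SIDES),
     math.sin(2 * math.pi * (k + 0.5) / SIDES))
    for k in range(SIDES)
]

def step(pos, vel):
    """Advance one ball by DT, reflecting it elastically off any edge it crosses."""
    x, y = pos[0] + vel[0] * DT, pos[1] + vel[1] * DT
    vx, vy = vel[0], vel[1] + GRAVITY * DT
    for nx, ny in NORMALS:
        dist = x * nx + y * ny                        # signed distance along edge normal
        overlap = dist - (APOTHEM - BALL_R)
        if overlap > 0:                               # ball has penetrated this edge
            x -= overlap * nx                         # push it back inside
            y -= overlap * ny
            vn = vx * nx + vy * ny
            if vn > 0:                                # moving outward: reflect velocity
                vx -= 2 * vn * nx
                vy -= 2 * vn * ny
    return (x, y), (vx, vy)

# Five balls with random starting positions and velocities near the center.
balls = [((random.uniform(-2, 2), random.uniform(-2, 2)),
          (random.uniform(-3, 3), random.uniform(-3, 3))) for _ in range(5)]

for _ in range(1000):                                 # simulate 10 seconds
    balls = [step(p, v) for p, v in balls]

for i, (p, _) in enumerate(balls):                    # all balls remain inside the heptagon
    print(f"ball {i}: x={p[0]:+.2f} y={p[1]:+.2f}")
```

The versions users asked the models for typically added rendering on top of this kind of collision loop; the complaint was that Llama 4's attempts got the geometry or physics wrong where DeepSeek V3's did not.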


Read the full story on VentureBeat.

Read more on: Meta, Bugs, release

Related news:


Meta’s surprise Llama 4 drop exposes the gap between AI ambition and reality


Meta exec denies the company artificially boosted Llama 4’s benchmark scores


Meta ends its fact-checking program, replacing it with Community Notes | Making users the arbiters of truth in Facebook and Instagram