Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs
Llama 4 continues to spread to other inference providers, but it's safe to say the initial release has not been a slam dunk.
“At this point, I highly suspect Meta bungled up something in the released weights … if not, they should lay off everyone who worked on this and then use money to acquire Nous,” commented @cto_junior on X, referring to an independent user test in which Llama 4 Maverick scored just 16% on aider polyglot, a benchmark that runs a model through 225 coding tasks. On the r/LocalLlama subreddit, user Dr_Karminski wrote, “I’m incredibly disappointed with Llama-4,” and demonstrated its poor performance compared to DeepSeek’s non-reasoning V3 model on coding tasks such as simulating balls bouncing around a heptagon.
Or read this on VentureBeat