I broke Meta's Llama 3.1 405B with one question (which GPT-4o mini gets right)
The failure of the large language model may point to issues with the broad use of synthetic data.
Just as I broke Gemini 1.5 with a language-translation query when it first became available, I was able to make Llama 3.1 produce gibberish with my very first question. Botching the conjugation of the most important verb in a language spoken by four million people seems like more than a corner case. It suggests that all the extra training and computing power poured into the larger 405B version can, at least in some cases, actually degrade the results.