Apple shows off open AI prowess: new models outperform Mistral and Hugging Face offerings
New Apple model delivers performance comparable to leading open models, including Mistral-7B, Llama 3 8B and Google's Gemma
The 7B model, dubbed DCLM-7B and trained on 2.5 trillion tokens using pretraining recipes based on the OpenLM framework, comes with a 2K context window and delivers 63.7% 5-shot accuracy on MMLU. According to the researchers, this represents a 6.6 percentage point improvement on the benchmark over MAP-Neo, the previous state of the art in the open-data language model category, while using 40% less compute for training. Like DCLM-7B, the smaller 1.4B version of the model, trained jointly with the Toyota Research Institute on 2.6 trillion tokens, also delivers impressive performance across the MMLU, Core and Extended tests.
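To make the headline number concrete, below is a minimal Python sketch of how 5-shot accuracy is typically scored on a multiple-choice benchmark like MMLU: five solved exemplar questions are prepended to each test question, and the model's predicted answer letter is compared against the gold answer. The exemplar questions and the `model_answer` stub are hypothetical placeholders, not the DCLM team's actual evaluation harness; a real harness would pick the answer letter the language model assigns the highest likelihood.

```python
# Hypothetical exemplars; a real MMLU run draws five from the task's dev split.
FEW_SHOT_EXEMPLARS = [
    ("What is the capital of France?", ["Berlin", "Madrid", "Paris", "Rome"], "C"),
    # ...four more solved (question, choices, answer) triples in practice
]

def format_question(question, choices):
    """Render a question with lettered answer choices."""
    lines = [question] + [f"{letter}. {opt}" for letter, opt in zip("ABCD", choices)]
    return "\n".join(lines) + "\n"

def build_prompt(exemplars, question, choices):
    """Concatenate the solved exemplars, then the unsolved test question."""
    parts = []
    for q, opts, answer in exemplars:
        parts.append(format_question(q, opts) + f"Answer: {answer}\n\n")
    parts.append(format_question(question, choices) + "Answer:")
    return "".join(parts)

def model_answer(prompt):
    # Placeholder stand-in for the model under evaluation: a real harness
    # returns whichever letter (A-D) the model scores as most likely.
    return "C"

def five_shot_accuracy(test_items, exemplars):
    """Fraction of test questions where the model's letter matches the gold answer."""
    correct = sum(
        model_answer(build_prompt(exemplars, q, opts)) == gold
        for q, opts, gold in test_items
    )
    return correct / len(test_items)

if __name__ == "__main__":
    test = [("2 + 2 = ?", ["3", "4", "5", "6"], "B")]
    print(five_shot_accuracy(test, FEW_SHOT_EXEMPLARS))
```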