Llamafile 0.8 Releases With LLaMA3 & Grok Support, Faster F16 Performance
Llamafile has been one of the more interesting projects to come out of Mozilla's Ocho group in the current AI era.
Llamafile builds off llama.cpp and makes it easy to ship an entire LLM as a single file with both CPU and GPU execution support. Mixture of Experts (MoE) models like Mixtral and Grok are now 2~5x faster when executing on CPUs, following a refactoring of the tinyBLAS CPU code. I'll be working on new Llamafile benchmarks soon.

About The Author

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience.
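For readers new to the project, the "single file" claim above means a published llamafile is itself an executable; a minimal usage sketch follows. The model file name here is hypothetical, and the `-ngl` (GPU layer offload) flag is inherited from llama.cpp; consult the llamafile README for the exact options of a given release.

```shell
# Hypothetical file name for illustration; substitute any published .llamafile.
MODEL=Meta-Llama-3-8B-Instruct.Q4_0.llamafile

# A llamafile bundles weights and runtime into one self-contained
# executable, so it only needs the execute bit set before launching.
chmod +x "$MODEL"

# Runs on the CPU by default; -ngl 999 requests offloading layers to the GPU.
./"$MODEL" -ngl 999
```

No separate runtime, Python environment, or model download step is required, which is the project's main portability pitch.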