SambaNova breaks Llama 3 speed record with 1,000 tokens per second


The race to make Llama 3 faster continues as SambaNova accelerates the gen AI model to a new milestone, bringing significant benefits to enterprise users.

Today, SambaNova Systems announced a new gen AI performance milestone: 1,000 tokens per second on the Llama 3 8B parameter instruct model. SambaNova CEO Rodrigo Liang explained that the optimization was a matter of balancing resource allocation between kernels to avoid bottlenecks and maximize throughput across the entire neural network pipeline.

Or read this on Venture Beat

Related news:

Llama 3 implemented in pure NumPy

Exclusive: AI startup Tenyx’s fine-tuned open-source Llama 3 model outperforms GPT-4

Infini-Gram: Scaling unbounded n-gram language models to a trillion tokens