Nvidia’s Llama-3.1-Minitron 4B is a small language model that punches above its weight


Nvidia researchers used model pruning and knowledge distillation to create a small language model (SLM) at a fraction of the cost of training one from scratch.

As tech companies race to deliver on-device AI, a growing body of research and techniques has emerged for creating small language models (SLMs) that can run on resource-constrained devices. “Pruning and classical knowledge distillation is a highly cost-effective method to progressively obtain LLMs [large language models] of smaller size, achieving superior accuracy compared to training from scratch across all domains,” the Nvidia researchers wrote. Other notable work in the field includes Sakana AI’s evolutionary model-merging algorithm, which makes it possible to assemble parts of different models to combine their strengths without expensive training runs.
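To make the distillation half of that recipe concrete, here is a minimal sketch of classical knowledge distillation in PyTorch. It assumes Hugging Face-style causal LMs whose outputs expose a `.logits` tensor; the function name `distillation_step` and the hyperparameters `T` (softening temperature) and `alpha` (loss-mixing weight) are illustrative, and the loss shown (temperature-scaled KL divergence blended with ordinary next-token cross-entropy) is the textbook formulation, not Nvidia's exact training setup.

```python
# Minimal sketch of classical knowledge distillation, assuming PyTorch and
# Hugging Face-style causal LMs whose forward pass returns an object with a
# .logits tensor. Textbook formulation, not Nvidia's exact training recipe.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, input_ids, labels, T=2.0, alpha=0.5):
    """One training step: blend soft-label (teacher) and hard-label losses.

    `labels` are assumed to be input_ids shifted for next-token prediction,
    with -100 marking positions to ignore; T and alpha are illustrative
    hyperparameters, not values from the paper.
    """
    # The large teacher is frozen; only the pruned student gets gradients.
    with torch.no_grad():
        teacher_logits = teacher(input_ids).logits
    student_logits = student(input_ids).logits

    # Soft-label loss: temperature-scaled KL divergence pulls the student's
    # token distribution toward the teacher's. Multiplying by T*T keeps
    # gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Hard-label loss: ordinary next-token cross-entropy on the data itself.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    return alpha * soft + (1 - alpha) * hard
```

In this setup the student would be a width- or depth-pruned copy of the teacher, which is what lets the combined loss recover most of the original model's accuracy far more cheaply than pretraining a model of that size from scratch.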

Read the full story on VentureBeat.


Related news:

MediaTek to Add NVIDIA G-Sync Support to Monitor Scalers, Make G-Sync Displays More Accessible

Nvidia Is Ditching Dedicated G-Sync Modules To Push Back Against FreeSync's Ubiquity

Nvidia’s AI NPCs will debut in a multiplayer mech battle game next year