DeepSeek releases ‘sparse attention’ model that cuts API costs in half


Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to dramatically lower inference costs in long-context operations. With V3.2-exp, the researchers were looking for ways to make the fundamental transformer architecture operate more efficiently, and found that there are still significant improvements to be made. The company made waves at the beginning of the year with its R1 model, trained primarily with reinforcement learning at a far lower cost than comparable models from its American competitors.
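The article does not describe the mechanism DeepSeek uses, so the sketch below is only a generic illustration of why sparse attention cuts long-context cost: in dense attention every query token scores every key token (quadratic in sequence length), while a sparse variant such as the sliding-window scheme shown here scores only a fixed-size neighborhood (linear in sequence length). The `windowed_sparse_attention` function and the window size are illustrative assumptions, not DeepSeek's actual design.

```python
import numpy as np

def dense_attention(q, k, v):
    # Every query attends to every key: n * n score computations.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def windowed_sparse_attention(q, k, v, window=64):
    # Illustrative sparse variant (NOT DeepSeek's method): each query
    # attends only to the `window` most recent keys, so the score count
    # grows linearly with sequence length instead of quadratically.
    n, d = q.shape
    out = np.empty_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)
        s = q[i] @ k[lo:i + 1].T / np.sqrt(d)
        w = np.exp(s - s.max())
        w /= w.sum()
        out[i] = w @ v[lo:i + 1]
    return out

# Compare score-computation counts at a long-context length.
n, window = 4096, 64
dense_scores = n * n
sparse_scores = sum(min(window, i + 1) for i in range(n))
print(f"dense: {dense_scores:,} score computations")
print(f"sparse (window={window}): {sparse_scores:,} "
      f"({sparse_scores / dense_scores:.1%} of dense)")
```

At 4,096 tokens the windowed variant computes under 2% of the attention scores the dense version does, which is the kind of asymptotic saving that translates into lower per-token API pricing for long prompts.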

Read the full story on TechCrunch.

Related news:

DeepSeek Debuts ‘Sparse Attention’ Next-Generation AI Model

To digital natives, Microsoft's IT stack makes Google's look like a model of sanity

DeepSeek sets example for AI firms to have work peer reviewed, experts say