DeepSeek releases ‘sparse attention’ model that cuts API costs in half
Researchers at DeepSeek released a new experimental model designed to have dramatically lower inference costs when used in long-context operations.
Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to dramatically lower inference costs in long-context operations. The model's headline feature is a sparse attention mechanism, part of the team's broader effort to make the core transformer architecture run more efficiently, and the results suggest there are still significant gains to be found. The company made waves at the beginning of the year with its R1 model, which was trained primarily with reinforcement learning at a far lower cost than its American competitors' models.
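Sparse attention cuts long-context cost by letting each query attend to only a small, selectively chosen subset of the context instead of every prior token. The sketch below is a generic top-k illustration of that idea, not DeepSeek's actual mechanism or code; the shapes and the `topk_sparse_attention` helper are hypothetical, and production systems typically use a much cheaper selection pass plus optimized kernels rather than scoring every key as this toy does.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def dense_attention(q, K, V):
    """Standard attention: one query attends over all L keys/values."""
    scores = K @ q / np.sqrt(q.shape[-1])
    return softmax(scores) @ V

def topk_sparse_attention(q, K, V, k=64):
    """Toy sparse attention: keep only the k best-scoring keys per query.

    This toy still scores every key; the savings shown here come only from
    restricting the softmax and value aggregation to k << L tokens. Real
    systems also make the selection step itself cheap.
    """
    scores = K @ q / np.sqrt(q.shape[-1])
    idx = np.argpartition(scores, -k)[-k:]   # indices of the k highest scores
    weights = softmax(scores[idx])
    return weights @ V[idx]

# Hypothetical sizes for illustration: an 8K-token context, 128-dim head.
L, d, k = 8192, 128, 64
rng = np.random.default_rng(0)
q = rng.standard_normal(d)
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))

print(dense_attention(q, K, V).shape)           # (128,)
print(topk_sparse_attention(q, K, V, k).shape)  # (128,)
```

The intuition for the cost claim: as the context length L grows, dense attention does work proportional to L for every query token, while the sparse variant aggregates over only k tokens, which is why the savings show up most clearly in long-context workloads.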