DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch


The new model shows open-source closing in on closed-source models, suggesting reduced chances of one big AI player ruling the game.

DeepSeek-V3's architecture keeps training and inference efficient: specialized and shared “experts” (individual, smaller neural networks within the larger model) activate only 37B of its 671B parameters for each token. According to DeepSeek, the base model then went through post-training, including Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), to align it with human preferences and further unlock its potential. Ultimately, DeepSeek, which started as an offshoot of Chinese quantitative hedge fund High-Flyer Capital Management, hopes these developments will pave the way for artificial general intelligence (AGI), where models will be able to understand or learn any intellectual task that a human being can.
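
To make the "37B out of 671B" figure concrete, here is a minimal Python sketch of the arithmetic, plus a toy version of how shared and specialized experts could be selected for one token. Only the two parameter counts come from the article; the function names, the expert counts, and the softmax top-k routing rule are illustrative assumptions, not DeepSeek's actual implementation.

```python
import math

TOTAL_PARAMS_B = 671   # total parameters in billions (from the article)
ACTIVE_PARAMS_B = 37   # parameters activated per token in billions (from the article)


def active_fraction(active_b, total_b):
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, top_k, num_shared):
    """Toy routing rule (an assumption for illustration only).

    Experts 0..num_shared-1 are "shared" and always run.
    gate_scores holds one score per specialized expert; the top_k
    highest-scoring ones run in addition to the shared experts.
    Returns the sorted indices of all experts that run.
    """
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    routed = [num_shared + i for i in ranked[:top_k]]
    return list(range(num_shared)) + sorted(routed)


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active per token: {frac:.1%} of all parameters")

    # Hypothetical toy setup: 1 shared expert, 4 specialized experts, pick 2.
    print(route_token([0.1, 2.0, -1.0, 0.5], top_k=2, num_shared=1))
```

In this toy setup only three of five experts run for the token, which is the same idea behind the model using roughly one-eighteenth of its parameters per token.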

Or read this on Venture Beat
