Chinese Firm Trains Massive AI Model for Just $5.5 Million


Chinese AI startup DeepSeek has released what appears to be one of the most powerful open-source language models to date, trained at a cost of just $5.5 million on restricted Nvidia H800 GPUs. The 671-billion-parameter DeepSeek V3, released this week under a permissive commercial license, outperformed both open-source and closed-source models in internal benchmarks, including Meta's Llama 3.1 and OpenAI's GPT-4 on coding tasks. If the model also passes vibe checks (LLM arena rankings are still ongoing, and my few quick tests have gone well so far), it will be a highly impressive display of research and engineering under resource constraints.

Read this on Slashdot

Related news:

ASML CEO says China is 10 to 15 years behind in chipmaking capabilities | But Chinese companies are working on EUV tools.

FCC 'Rip and Replace' Provision For Chinese Tech Tops Cyber Provisions in Defense Bill

Chinese Data Center Operator Yovole Is Said to Consider US IPO