Chinese Firm Trains Massive AI Model for Just $5.5 Million

Chinese AI startup DeepSeek has released what appears to be one of the most powerful open-source language models to date, trained at a cost of just $5.5 million using restricted Nvidia H800 GPUs. The 671-billion-parameter DeepSeek V3, released this week under a permissive commercial license, outperformed both open- and closed-source AI models in internal benchmarks, including Meta's Llama 3.1 and OpenAI's GPT-4 on coding tasks. If the model also passes vibe checks (LLM arena rankings are still in progress, and my few quick tests have gone well so far), it will be a highly impressive display of research and engineering under resource constraints.
