The Illustrated DeepSeek-R1
A recipe for reasoning LLMs
It is significant not because it’s a great LLM to use, but because creating it required very little labeled data alongside large-scale reinforcement learning, and the result is a model that excels at solving reasoning problems.

An example of such a problem is the prompt: "Write Python code that takes a list of numbers, returns them in sorted order, but also adds 42 at the start."

If you’re new to the concept of Supervised Fine-Tuning (SFT), it is the process of presenting the model with training examples in the form of a prompt and its correct completion.
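
To make the prompt-and-completion idea concrete, here is a minimal sketch of what one SFT training example might look like for the sorting prompt quoted above. The function name `sorted_with_42` and the dictionary keys `prompt` and `completion` are illustrative assumptions, not a format taken from the article or from any particular training library.

```python
def sorted_with_42(numbers):
    """Return the numbers in ascending order with 42 added at the start."""
    return [42] + sorted(numbers)


# Hypothetical SFT example: a prompt paired with one correct completion.
# The key names are assumptions chosen for illustration only.
sft_example = {
    "prompt": (
        "Write python code that takes a list of numbers, returns them "
        "in a sorted order, but also adds 42 at the start."
    ),
    "completion": (
        "def sorted_with_42(numbers):\n"
        "    return [42] + sorted(numbers)\n"
    ),
}

if __name__ == "__main__":
    print(sorted_with_42([3, 1, 2]))
```

During SFT, the model is shown many pairs like this one and trained to produce the completion when given the prompt.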