A step-by-step guide on deploying DeepSeek-R1 671B locally
This is a (minimal) note on deploying DeepSeek R1 671B (the full version, without distillation) locally with ollama.
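With ollama installed, pulling and serving the model comes down to a couple of commands; a minimal sketch, assuming the `deepseek-r1:671b` tag published on the ollama registry (the quantized weights are several hundred gigabytes, so the pull takes a while):

```shell
# Pull the full (non-distilled) 671B weights from the ollama registry,
# then start an interactive chat session against the local model.
ollama pull deepseek-r1:671b
ollama run deepseek-r1:671b
```

By default ollama keeps the context window modest; raising it (e.g. via a Modelfile or the `num_ctx` parameter) increases memory pressure and, as noted below, slows generation further.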
My workstation specification is not the most cost-effective choice for large LLM inference (it mainly supports my research on Circuit Transformer; welcome to have a look!). From a practical perspective, I would suggest using the model for "lighter" tasks that do not require a very long thinking process or many back-and-forth conversation turns, as generation speed gradually slows to a desperate level (1-2 tokens/s) as the context length grows.