vLLM large-scale serving: DeepSeek at 2.2k tok/s per H200 with wide-EP




