AdapTive-LeArning Speculator System (ATLAS): Faster LLM inference


LLM inference that gets faster as you use it. Our runtime-learning accelerator adapts continuously to your workload, delivering 500 tokens per second (TPS) on DeepSeek-V3.1, a 4x speedup over baseline, with no manual tuning.
