Less is more: How ‘chain of draft’ could cut AI costs by 90% while improving performance


Zoom researchers unveil "chain of draft," which cuts AI token usage by 92%, transforming the economics of language model deployment.

“When solving complex tasks — whether mathematical problems, drafting essays or coding — we often jot down only the critical pieces of information that help us progress,” the researchers explain. As companies increasingly integrate sophisticated AI systems into their operations, computational costs and response times have emerged as significant barriers to widespread adoption. The technique could prove especially valuable for latency-sensitive applications like real-time customer support, mobile AI, educational tools and financial services, where even small delays can significantly impact user experience.

Read this on VentureBeat
