
PyTorch 2.8 Released With Better Intel CPU Performance For LLM Inference

PyTorch 2.8 was released today as the newest feature update to this widely-used machine learning library, which has become a crucial piece of the deep learning and broader AI software stack.

In particular, this release focuses on high-performance quantized large language model (LLM) inference on Intel CPUs using the native PyTorch stack. "With this feature, the performance with PyTorch native stack can reach the same level or even better in some cases as comparing with popular LLM serving frameworks like vLLM when running offline mode on a single x86_64 CPU device, which enables PyTorch users to run LLM quantization with native experience and good performance." Too bad, though, that my AvenueCity reference server remains non-operational and is thus unable to test this newest PyTorch release (and other Intel open-source improvements in recent months) on the flagship Xeon 6980P Granite Rapids processors...
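For readers who want a feel for what quantized inference with the native PyTorch stack on a CPU looks like, here is a minimal sketch. It uses the long-standing torch.ao.quantization.quantize_dynamic API on a hypothetical toy model (TinyMLP is an illustrative stand-in invented for this example); the new 2.8 LLM-specific path targets real transformer workloads, and this only illustrates the general idea of running int8-quantized layers natively on CPU:

    import torch
    import torch.nn as nn

    # Hypothetical toy model standing in for the linear-heavy layers of an LLM.
    class TinyMLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(512, 512)
            self.fc2 = nn.Linear(512, 512)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))

    model = TinyMLP().eval()

    # Quantize Linear weights to int8; activations stay fp32 and are
    # quantized dynamically at inference time. Runs entirely on the CPU.
    qmodel = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    with torch.inference_mode():
        out = qmodel(torch.randn(1, 512))
    print(out.shape)  # torch.Size([1, 512])

Dynamic weight quantization tends to help exactly the linear-layer-heavy models that transformers are, since CPU inference at these sizes is largely memory-bandwidth bound and int8 weights halve or quarter the bytes moved per matmul.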

Read this on Phoronix

Read more on: Intel, PyTorch, LLM Inference

Related news:

Linux 6.17 KVM Additions Include Intel LKGS From FRED, Smarter AMD SEV Cache Flushing

Intel's 18A process hit by low yields and quality issues, putting manufacturing comeback in doubt

Trump Ally Cotton Asks Intel’s Board About CEO’s Ties to China