
SVDQuant+NVFP4: 4× Smaller, 3× Faster FLUX with 16-bit Quality on Blackwell GPUs


SVDQuant supports NVFP4 on NVIDIA Blackwell GPUs, delivering a 3× speedup over BF16 and better image quality than INT4. Try our interactive demo at https://svdquant.mit.edu/! All of our code is available at https://github.com/mit-han-lab/nunchaku.

The table above compares image quality across various datatypes on four popular text-to-image diffusion models using the MJHQ prompt set. Across all models, NVFP4 outperforms INT4, particularly on similarity metrics, thanks to Blackwell's native hardware support for smaller microscaling group sizes. Notably, combining SVDQuant with NVFP4 delivers the best results, achieving a PSNR of 21.5 on FLUX.1-dev and closely matching the image quality of the original 16-bit model.
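To illustrate why a smaller scaling group can help, here is a minimal Python sketch, not the SVDQuant or nunchaku implementation. It fake-quantizes a synthetic weight vector onto the 4-bit E2M1 magnitude grid with one absmax scale per group, then compares reconstruction error for two group sizes. The helper names (`quantize_groupwise`, `mse`, `psnr`), the synthetic data, and the choice of 64 versus 16 elements per group are assumptions made for illustration. The scale is kept in full precision here, whereas real microscaling formats store it in a low-precision type. The same value grid is used for both runs to isolate the effect of group size, so this is not an INT4 versus NVFP4 comparison.

```python
import math
import random

# Magnitudes representable by a 4-bit E2M1 float (sign is handled separately).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_groupwise(values, group_size, grid=FP4_GRID):
    """Fake-quantize: each group of `group_size` values shares one absmax scale."""
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        absmax = max(abs(v) for v in group)
        # Map the largest magnitude in the group onto the largest grid value.
        scale = absmax / grid[-1] if absmax > 0 else 1.0
        for v in group:
            mag = min(grid, key=lambda g: abs(abs(v) / scale - g))
            out.append(math.copysign(mag * scale, v))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(reference, test, peak):
    """Peak signal-to-noise ratio in dB for a given peak signal value."""
    err = mse(reference, test)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


# Synthetic "weights": Gaussian values with an occasional large outlier.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]
for i in range(0, len(weights), 256):
    weights[i] *= 10.0

peak = max(abs(w) for w in weights)
for group_size in (64, 16):
    restored = quantize_groupwise(weights, group_size)
    print(f"group={group_size:3d}  mse={mse(weights, restored):.5f}  "
          f"psnr={psnr(weights, restored, peak):.2f} dB")
```

In this toy setup, an outlier inflates the scale only for the values that share its group, so smaller groups limit how many values are forced onto a coarse grid. The PSNR printed here is computed on a weight vector and is not comparable to the image-level PSNR figures quoted above.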
