SVDQuant+NVFP4: 4× Smaller, 3× Faster FLUX with 16-bit Quality on Blackwell GPUs
SVDQuant now supports NVFP4 on NVIDIA Blackwell GPUs, delivering a 3× speedup over BF16 and better image quality than INT4. Try the interactive demo at https://svdquant.mit.edu/. All of the code is available at https://github.com/mit-han-lab/nunchaku.
The table above compares image quality across several datatypes on four popular text-to-image diffusion models, using the MJHQ prompt set. Across all models, NVFP4 outperforms INT4, particularly on similarity metrics, thanks to Blackwell's native hardware support for a smaller microscaling group size. Notably, combining SVDQuant with NVFP4 delivers the best results, achieving a PSNR of 21.5 on FLUX.1-dev and closely matching the image quality of the original 16-bit model.
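To illustrate why a smaller scaling group can help, here is a minimal sketch in plain Python. It is not SVDQuant, and it does not implement the NVFP4 format: it uses a symmetric integer-style 4-bit grid with one absmax scale per group, and the group sizes (64 and 16), the synthetic data, and the helper names `quantize_groupwise` and `psnr` are all assumptions chosen for illustration. The PSNR here is computed on a synthetic weight vector, not on images, so the numbers are not comparable to the table.

```python
import math
import random


def quantize_groupwise(values, group_size, levels=7):
    """Round-to-nearest quantization with one absmax scale per group.

    levels=7 gives a symmetric 4-bit-style grid {-7, ..., 7} (illustrative only).
    Returns the dequantized values so the error can be measured directly.
    """
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        absmax = max(abs(v) for v in group)
        scale = absmax / levels if absmax > 0 else 1.0
        for v in group:
            q = max(-levels, min(levels, round(v / scale)))
            out.append(q * scale)
    return out


def psnr(reference, approx, peak):
    """Peak signal-to-noise ratio in dB for two equal-length sequences."""
    mse = sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Synthetic "weights": Gaussian values with a few injected outliers.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]
for i in range(0, len(weights), 256):
    weights[i] *= 20.0

peak = max(abs(w) for w in weights)
psnr_g64 = psnr(weights, quantize_groupwise(weights, 64), peak)
psnr_g16 = psnr(weights, quantize_groupwise(weights, 16), peak)
print(f"group size 64: {psnr_g64:.2f} dB")
print(f"group size 16: {psnr_g16:.2f} dB")
```

In this toy setup each outlier inflates the scale of its whole group, so with smaller groups fewer values share a coarse scale and the reconstruction error drops. This is only meant to convey the intuition behind the group-size remark above, under the stated assumptions.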