Optimizing a Rust GPU matmul kernel
I read the excellent post [Optimizing a WebGPU Matmul Kernel for 1TFLOP+
I abstracted both the CPU-side code that talks to the GPU and the CPU testing harness using generics and traits, so I could easily slot in different kernels and their settings while writing the blog post. Leveraging standard tools like rustfmt minimizes cognitive overhead and avoids the hassle of configuring third-party formatters of varying quality.
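To make the "generics and traits" idea concrete, here is a minimal sketch of how a harness might accept interchangeable kernels. It is a CPU-only stand-in, not the post's actual code: the `MatmulKernel` trait, the `Naive` and `Tiled` types, and the `check` function are all hypothetical names chosen for illustration, and no GPU API is involved.

```rust
// Hypothetical sketch: every name here is an assumption, not the post's API.

/// A matmul kernel variant that the harness can run.
trait MatmulKernel {
    /// Human-readable name used when reporting results.
    fn name(&self) -> &'static str;
    /// Computes C = A * B for row-major A (m x k) and B (k x n).
    fn run(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32>;
}

/// Straightforward triple loop, used as the reference implementation.
struct Naive;

impl MatmulKernel for Naive {
    fn name(&self) -> &'static str {
        "naive"
    }

    fn run(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
        let mut c = vec![0.0f32; m * n];
        for i in 0..m {
            for j in 0..n {
                let mut acc = 0.0f32;
                for p in 0..k {
                    acc += a[i * k + p] * b[p * n + j];
                }
                c[i * n + j] = acc;
            }
        }
        c
    }
}

/// Same result, computed tile by tile. `tile` is the kernel's "setting"
/// and must be greater than zero (`step_by(0)` panics).
struct Tiled {
    tile: usize,
}

impl MatmulKernel for Tiled {
    fn name(&self) -> &'static str {
        "tiled"
    }

    fn run(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
        let t = self.tile;
        let mut c = vec![0.0f32; m * n];
        for i0 in (0..m).step_by(t) {
            for j0 in (0..n).step_by(t) {
                for p0 in (0..k).step_by(t) {
                    // Clamp each tile to the matrix bounds.
                    for i in i0..(i0 + t).min(m) {
                        for j in j0..(j0 + t).min(n) {
                            for p in p0..(p0 + t).min(k) {
                                c[i * n + j] += a[i * k + p] * b[p * n + j];
                            }
                        }
                    }
                }
            }
        }
        c
    }
}

/// Generic harness: runs any kernel and compares it with the naive reference.
fn check<K: MatmulKernel>(kernel: &K, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> bool {
    let expected = Naive.run(a, b, m, k, n);
    let actual = kernel.run(a, b, m, k, n);
    expected.len() == actual.len()
        && expected.iter().zip(&actual).all(|(x, y)| (x - y).abs() < 1e-4)
}

fn main() {
    let (m, k, n) = (3, 4, 2);
    let a: Vec<f32> = (0..m * k).map(|x| x as f32).collect();
    let b: Vec<f32> = (0..k * n).map(|x| x as f32).collect();

    let kernel = Tiled { tile: 2 };
    println!("{} matches reference: {}", kernel.name(), check(&kernel, &a, &b, m, k, n));
}
```

Because `check` is generic over `K: MatmulKernel`, adding another variant only requires a new type that implements the trait; the harness itself stays unchanged. A real GPU-backed implementation would presumably dispatch a compute shader inside `run` instead of looping on the CPU.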