Writing high-performance matrix multiplication kernels for Blackwell
In this guide, we’ll progressively iterate on a matrix multiplication kernel. The first implementation will be very simple, but also quite slow.