Accelerated Game of Life with CUDA / Triton
Let’s look at implementing Conway’s Game of Life using a graphics card. I want to experiment with different libraries and techniques, to see how to get the best performance. I’m g…
Triton is a custom programming language that both simplifies writing GPU kernels and handles many of the performance optimizations that are tedious to implement manually in CUDA. When starting this project, I wanted to experiment with 1 byte per cell, as that is closer to a typical ML kernel, which uses quantized floats. But a cell's state fits in a single bit, and since we're limited by memory bandwidth, shrinking the storage by a factor of 8 will completely smash any incremental improvements we can make on the kernel.
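To make the storage trade-off concrete, here is a minimal plain-Python sketch rather than a GPU kernel. It is not the CUDA or Triton code from this project: the helper names (`step`, `pack_bits`, `unpack_bits`), the wraparound edges, and the least-significant-bit-first layout are all assumptions chosen for illustration. It steps a grid stored at 1 byte per cell, then packs the same grid into 1 bit per cell to show the factor-of-8 reduction.

```python
# Illustrative sketch only: hypothetical helper names, CPU-side Python,
# and wraparound (toroidal) edges assumed for simplicity.

def step(grid, width, height):
    """Advance a 1-byte-per-cell grid (bytearray of 0/1) by one generation."""
    out = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            # Count the eight neighbours, wrapping around at the edges.
            n = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx or dy:
                        n += grid[((y + dy) % height) * width + (x + dx) % width]
            alive = grid[y * width + x]
            # Standard rules: birth on 3 neighbours, survival on 2 or 3.
            out[y * width + x] = 1 if n == 3 or (alive and n == 2) else 0
    return out


def pack_bits(grid):
    """Pack 0/1 bytes into bits: 8 cells per byte, least significant bit first."""
    packed = bytearray((len(grid) + 7) // 8)
    for i, cell in enumerate(grid):
        if cell:
            packed[i >> 3] |= 1 << (i & 7)
    return packed


def unpack_bits(packed, n_cells):
    """Inverse of pack_bits: expand back to one byte per cell."""
    return bytearray((packed[i >> 3] >> (i & 7)) & 1 for i in range(n_cells))


# Usage: an 8x8 grid holding a horizontal blinker.
width = height = 8
grid = bytearray(width * height)
for x in (2, 3, 4):
    grid[3 * width + x] = 1

next_grid = step(grid, width, height)
packed = pack_bits(grid)
print(len(grid), len(packed))  # 64 bytes unpacked, 8 bytes packed
```

The packed form moves one eighth of the bytes for the same number of cells, which is why it matters so much for a bandwidth-bound kernel. The cost is that a real bit-packed kernel has to compute neighbour counts with shifts and masks across byte boundaries instead of the simple indexing used in `step`.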