The One Billion Row Challenge in CUDA: from 17m to 17s

On my journey to learn CUDA, I decided to tackle the One Billion Row Challenge with it. The challenge is simple, but implementing it in CUDA was not.
The effort involved in setting up these buffers would essentially replicate the baseline workload, making this approach counterproductive.

While still slower than a hash table’s constant-time lookup, it’s much faster than linearly searching 40k+ city entries.

As I finish writing this blog, I realize that my struct Stat doesn’t need to hold the city char array.
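
The last two points can be sketched together. The excerpt does not name the lookup technique, so binary search over a sorted list of city names is an assumption here; it is one scheme that sits between a hash table and a linear scan. The field layout of Stat and the helper names compare_names, find_city, and record are also hypothetical. The sketch is plain C++ so it runs on the host; in a CUDA kernel the functions would need a __device__ qualifier, which is why the string comparison is written by hand instead of calling strcmp.

```cpp
// Hypothetical per-city aggregate. It stores no city name:
// the slot's position in the stats array identifies the city.
struct Stat {
    float min;
    float max;
    float sum;
    int count;  // 0 means no measurement recorded yet
};

// Byte-wise string comparison: negative, zero, or positive,
// like strcmp, but with no library dependency.
int compare_names(const char* a, const char* b) {
    while (*a && *a == *b) { ++a; ++b; }
    return (unsigned char)*a - (unsigned char)*b;
}

// Binary search over city names sorted in byte order.
// Returns the index of `query`, or -1 if it is absent.
// Takes O(log n) comparisons instead of O(n) for a linear scan.
int find_city(const char* const* cities, int n, const char* query) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = compare_names(cities[mid], query);
        if (cmp == 0) return mid;
        if (cmp < 0) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}

// Fold one temperature reading into the slot for `city`.
// Returns false if the city is not in the sorted list.
bool record(Stat* stats, const char* const* cities, int n,
            const char* city, float temp) {
    int idx = find_city(cities, n, city);
    if (idx < 0) return false;
    Stat& s = stats[idx];
    if (s.count == 0) {
        s.min = temp;
        s.max = temp;
    } else {
        if (temp < s.min) s.min = temp;
        if (temp > s.max) s.max = temp;
    }
    s.sum += temp;
    s.count += 1;
    return true;
}
```

Because the index returned by find_city is also the index into the stats array, the name can be recovered from the sorted list when printing results, so each Stat stays a few bytes wide. This version is single-threaded; concurrent GPU threads updating the same slot would need atomic operations or per-thread partial results, which the sketch leaves out.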