The One Billion Row Challenge in CUDA: from 17m to 17s

On my journey to learn CUDA, I decided to tackle the One Billion Row Challenge with it. The challenge is simple, but implementing it in CUDA was not.

The effort involved in setting up these buffers would essentially replicate the baseline workload, making this approach counterproductive. While still slower than a hash table’s constant time lookup, it’s much faster than linearly searching 40k+ city entries. As I finish writing this blog, I realize that my struct Stat doesn’t need to hold the city char array.
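The excerpt does not name the lookup it describes. One technique that fits "slower than a hash table, much faster than a linear scan" is binary search over a sorted list of city names, so the sketch below assumes that. It is written as plain host-side C++ with a hand-rolled string comparison so the same logic could be moved into device code. The names `Stat`, `compare_names`, and `find_city`, and the fields of `Stat`, are hypothetical and not taken from the original post. The sketch also illustrates the closing remark: if a city's index in the sorted list doubles as its slot in the results array, the per-city struct does not need to carry the name.

```cpp
#include <cassert>

// Hypothetical per-city accumulator. It has no name field: the slot's
// position in the results array is assumed to identify the city.
struct Stat {
    float min;
    float max;
    float sum;
    unsigned int count;
};

// Three-way comparison of two NUL-terminated strings, written by hand
// (rather than calling strcmp) so it does not depend on the C library.
// Returns <0, 0, or >0, using unsigned byte order.
inline int compare_names(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// Binary search over an array of city names sorted in byte order.
// Returns the index of `name`, or -1 if it is not present.
inline int find_city(const char* const* sorted_cities, int count,
                     const char* name) {
    assert(count >= 0);
    int lo = 0;
    int hi = count - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = compare_names(sorted_cities[mid], name);
        if (cmp == 0) {
            return mid;          // found: index doubles as the Stat slot
        }
        if (cmp < 0) {
            lo = mid + 1;        // target sorts after the middle entry
        } else {
            hi = mid - 1;        // target sorts before the middle entry
        }
    }
    return -1;
}
```

With roughly 40k sorted entries, a search like this needs at most about 16 string comparisons per lookup, compared with tens of thousands in the worst case for a linear scan.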
