Get the latest tech news
I want a good parallel computer
The GPU in your computer is about 10 to 100 times more powerful than the CPU, depending on workload. For real-time graphics rendering and machine learning, you are enjoying that power, and doing those workloads on a CPU is not viable. Why aren’t we exploiting that power for other workloads? What prevents a GPU from being a more general purpose computer?
Rather, it’s the inability to organize the overall as stages operating in parallel, connected through queues tuned to use only the amount of buffer memory needed to keep everything smoothly, as opposed to the compute shader execution model of large dispatches separated by pipeline barriers. That has not come to pass, though there has been some movement in that direction: the Larrabee project (as described above), the groundbreaking cudaraster paper from 2011 implemented the traditional 3D rasterization pipeline entirely in compute, and found (simplifying quite a bit) that it was about 2x slower than using fixed function hardware, and more recent academic GPU designs based on grids of RISC-V cores. From what I’ve seen, running existing APIs on such hardware (Vulkan for compute shaders, or one of the modern variants of OpenCL) would not have significant latency advantages for synchronizing work back to the CPU, due to context switching overhead.
Or read this on Hacker News