PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch


CUDA Graphs -- a recent hardware feature introduced for NVIDIA GPUs -- aim to reduce CPU launch overhead by capturing and launching a series of GPU tasks (kernels) as a DAG. However, deploying CUDA Graphs faces several challenges today due to the static structure of a graph. It also incurs performance overhead due to data copy. In fact, we show a counter-intuitive result -- deploying CUDA Graphs hurts performance in many cases. We introduce PyGraph, a novel approach to automatically harness the power of CUDA Graphs within PyTorch2. Driven by three key observations, PyGraph embodies three novel optimizations: it enables wider deployment of CUDA Graphs, reduces GPU kernel parameter copy overheads, and selectively deploys CUDA Graphs based on a cost-benefit analysis. PyGraph seamlessly integrates with PyTorch2's compilation toolchain, enabling efficient use of CUDA Graphs without manual modifications to the code. We evaluate PyGraph across various machine learning benchmarks, demonstrating substantial performance improvements over PyTorch2.
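To make the cost-benefit idea concrete, here is a minimal sketch of how one might decide whether replaying a CUDA Graph beats launching kernels one by one. It is not PyGraph's actual analysis, which the abstract does not describe. The `RegionProfile` fields, the helper names, and all the numbers are hypothetical, and the model only weighs the saved CPU launch overhead against the cost of copying inputs into the graph's static buffers.

```python
from dataclasses import dataclass


@dataclass
class RegionProfile:
    """Hypothetical per-iteration measurements for one candidate code region."""
    num_kernels: int               # kernels launched per iteration
    launch_overhead_us: float      # CPU cost of launching one kernel eagerly
    graph_launch_us: float         # CPU cost of launching the whole captured graph
    copy_bytes: int                # bytes copied into static graph buffers per replay
    copy_bandwidth_gb_per_s: float # assumed device-to-device copy bandwidth


def eager_cost_us(p: RegionProfile) -> float:
    # Without a graph, every kernel pays its own CPU launch overhead.
    return p.num_kernels * p.launch_overhead_us


def graph_cost_us(p: RegionProfile) -> float:
    # With a graph, there is one launch, but inputs must first be copied
    # into the fixed addresses that were recorded at capture time.
    copy_us = p.copy_bytes / (p.copy_bandwidth_gb_per_s * 1e9) * 1e6
    return p.graph_launch_us + copy_us


def should_use_graph(p: RegionProfile) -> bool:
    # Deploy the graph only when the modeled overhead is strictly lower.
    return graph_cost_us(p) < eager_cost_us(p)


# Made-up profiles: many tiny kernels vs. a few kernels with large inputs.
many_small = RegionProfile(200, 5.0, 20.0, 1_000_000, 500.0)
few_large = RegionProfile(3, 5.0, 20.0, 64_000_000, 500.0)

for name, profile in [("many_small", many_small), ("few_large", few_large)]:
    print(name, should_use_graph(profile))
```

Under these assumed numbers the model favors a graph for the region with many small kernels and rejects it for the region dominated by input copies, which mirrors the abstract's point that CUDA Graphs can hurt performance in some cases. A real decision would rely on measured timings rather than a closed-form estimate.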

Paper: PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch, by Abhishek Ghosh and 3 other authors.
