Long Convolutions via Polynomial Multiplication


A tutorial on long convolutions for GPT-like models.

We’ve been writing a series of papers (1, 2, 3) that have so-called long convolutions at their core, with the aim of enabling longer-context models. We worked hard to fuse this algorithm and make it efficient on modern hardware; check out FlashFFTConv for a primer there. The nice thing is that once we connect long convolutions to polynomial multiplication, we can bring the tools of polynomial theory to bear on understanding these ML layers.
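To make the connection in the title concrete, here is a minimal sketch in plain Python. It assumes the standard identification: treat a sequence as the coefficients of a polynomial, so that convolving two sequences gives the coefficients of the product polynomial. The names `conv` and `poly_eval` and the sample values are hypothetical, chosen for illustration; this is not the FlashFFTConv implementation, which uses FFTs rather than the direct double loop below.

```python
def conv(u, k):
    """Direct (non-FFT) convolution: y[i + j] accumulates u[i] * k[j]."""
    y = [0] * (len(u) + len(k) - 1)
    for i, ui in enumerate(u):
        for j, kj in enumerate(k):
            y[i + j] += ui * kj
    return y


def poly_eval(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ... (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


u = [1, 2, 3]  # hypothetical input sequence, read as 1 + 2x + 3x^2
k = [4, 5]     # hypothetical filter, read as 4 + 5x

y = conv(u, k)  # coefficients of (1 + 2x + 3x^2)(4 + 5x)

# Multiplying the two polynomials' values at any point x matches the value
# of the convolved sequence read as a polynomial at the same x.
agree = all(poly_eval(u, x) * poly_eval(k, x) == poly_eval(y, x)
            for x in range(-3, 4))
```

Here `y` comes out as `[4, 13, 22, 15]`, matching the hand expansion 4 + 13x + 22x^2 + 15x^3. A causal sequence layer would typically keep only the first `len(u)` outputs, `y[:len(u)]`. The pointwise identity checked by `agree` is what lets an FFT-based method replace the double loop: evaluate both polynomials at many points, multiply the values, and interpolate back.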
