Diffusion models explained simply
Transformer-based large language models are relatively easy to understand. You break language down into a finite set of "tokens" (words or sub-word components), then train a model that looks at a sequence of tokens and predicts the next one.
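As a rough sketch of that core loop (the `model` and `tokenizer` here are hypothetical stand-ins, not any particular library's API):

```python
def generate(model, tokenizer, prompt, max_new_tokens=20):
    """Minimal sketch of autoregressive next-token decoding."""
    tokens = tokenizer.encode(prompt)  # text -> list of token ids
    for _ in range(max_new_tokens):
        logits = model(tokens)  # one score per token in the vocabulary
        # Greedy decoding: always pick the highest-scoring next token.
        next_token = max(range(len(logits)), key=lambda i: logits[i])
        tokens.append(next_token)  # feed the prediction back in
    return tokenizer.decode(tokens)
```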
Despite some clever tricks (mainly about how the model processes the previous tokens in the sequence), the core mechanism is relatively simple. Diffusion models generate text differently. At inference time, you start with a big block of pure-noise embeddings (presumably just random numbers), then denoise it step by step until it becomes actual decodable text. If you want fast inference at the cost of quality, you can just run the model for fewer denoising steps and end up with more noise in the final output.
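A minimal sketch of that inference loop, assuming a hypothetical trained `denoiser` and a `decode` function that maps embeddings back to tokens (neither is from the original text):

```python
import numpy as np

def sample(denoiser, decode, seq_len, dim, num_steps=50):
    """Hypothetical diffusion sampler: `denoiser` and `decode` are assumed models."""
    # Start with a big block of pure-noise embeddings: just random numbers.
    x = np.random.randn(seq_len, dim)
    # Run the denoiser backwards through the noise schedule. Using fewer
    # steps gives faster inference at the cost of a noisier final output.
    for t in reversed(range(num_steps)):
        x = denoiser(x, t)  # each step nudges x toward clean text embeddings
    return decode(x)  # map the final embeddings back to actual tokens
```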