Get the latest tech news

Strengths and limitations of diffusion language models


Google recently released Gemini Diffusion, which is impressing everyone with its speed. Supposedly they even had to slow down the demo so people could see what…

A diffusion model will generate the whole thing, growing more accurate at each step: “xycz”, “aycd”, then “abcd”. The reason this isn’t straightforwardly quadratic is the “key-value cache”: because autoregressive models generate token-by-token, attention scores for previously-generated tokens don’t have to be checked again. But if we don’t, it’ll be because the “changing your mind” reasoning paradigm doesn’t map nicely onto block-by-block generation.

Get the Android app

Or read this on Hacker News

Read more on:

Photo of limitations

limitations

Photo of strengths

strengths

Photo of sean

sean

Related news:

News photo

GTA 6 delay so Rockstar could 'achieve its creative vision with no limitations', says Take-Two boss

News photo

Transportation Secretary Sean Duffy Changed His Wife’s Flight to Avoid Newark Airport | Would you let your family fly out of Newark?

News photo

ChatGPT as Economics Tutor: Capabilities and Limitations